Network Operations Engineer (InfiniBand)
EOS
Fremont, CA
WHO WE ARE:
EOS IT Management Solutions Inc is a leading Global IT and Video Collaboration company. We specialize in innovative IT and video conferencing solutions, which empower businesses and organizations throughout the world. We have an immediate opening for a highly motivated, collaborative and committed individual to fill our Network Operations Engineer position with InfiniBand experience.
THE POSITION:
The successful candidate will be enthusiastic and passionate about IT and networking. The role involves serving as a first/second-level point of contact, supporting InfiniBand fabrics for high-performance compute clusters for an R&D group.
Hours: 5 days a week, Mon-Fri, 9:00am to 6:00pm. On-call work will be required as needed.
WHAT YOU'LL DO:
· Maintain and support InfiniBand fabric infrastructure, including routers, switches, servers, network operating system software, network management software and other related hardware.
· Provide day-to-day operations of the Linux HPC clusters and network support in the areas of I/O connectivity and IP over InfiniBand.
· Proactively monitor, analyze and correct system issues.
· Develop scripts to automate repetitive tasks or tools to enhance support of HPC systems.
· Perform system performance analysis and tuning.
· Build, install and support user-requested software, including upgrades and patch fixes.
· Support HPC technology evaluations and assessments.
· Communicate with end users via tasking tools, group chats and phone.
· Multitask across various user issues effectively and efficiently, while documenting troubleshooting and triage steps.
· Escalate tasks to the vendor, documenting a summary of the problem and the troubleshooting steps that have been taken.
· Perform queue management for user tasks, alarms, alert tasks and incidents, and troubleshoot and/or triage as necessary.
· Document run-books and procedures to assist in troubleshooting and completion of tasks.
· Must have excellent customer service skills and the ability to deal with end users / management during times of pressure.
WHAT YOU NEED TO SUCCEED:
· The ideal candidate should have a minimum of 3 years' experience with InfiniBand.
· Should have a strong background in maintaining (and building) InfiniBand fabrics for high-performance compute clusters.
· Experience with HDR (200Gbps) and SHARPv2 (Scalable Hierarchical Aggregation and Reduction Protocol).
· Knowledge of key I/O technologies such as SmartNICs, 200GigE, RNICs, InfiniBand, Fibre Channel, SAS.
· Experience with routing engines like OpenSM is mandatory.
· Understanding of the internals of router/switch hardware, NPU/data planes and optics.
· Understanding of the design principles and troubleshooting of distributed systems.
· Solid understanding of high-performance computing, IB fabrics and operational best practices.
· Demonstrated knowledge of different routing algorithms, including UPDN, LASH, DOR, etc.
· Experience working with the vendor (Mellanox) to troubleshoot issues.
· Demonstrated ability to analyze complex situations and apply troubleshooting skills, systems and tools, and creative problem-solving abilities under pressure.
· Proficiency in the Linux operating system; scripting experience a plus.
· Ability to work with system configuration management tools (e.g., Puppet, Ansible) and revision control software such as Git.
· Experience with scripting and programming languages such as Bash, Python, etc.
· Ability to work in fast-paced and dynamic environments with limited supervision.
· Strong attention to detail with excellent time management and organization skills.
· Team player, excellent written and verbal communication.
· Self-motivated, strong analytical thinker who enjoys problem solving.
· Capable of working/using own initiative with minimal supervision.
EDUCATION:
· Associate's or Bachelor's degree in computer science or a related discipline.
· Certification such as InfiniBand Professional is a plus.
· Experience with High Performance Computing or Linux
Additional Information
The EOS Group recognizes the responsibilities it has to its customers, suppliers & employees.
At EOS, so far as is reasonably practicable, it is our responsibility to ensure the health, safety, and welfare of our employees at work, and to take all reasonable steps to ensure that anyone affected by our business, including visitors and service users, is not exposed to risks.
That’s why all offers of employment are contingent on the candidate showing proof of being fully vaccinated against COVID-19 in order to pass the pre-employment requirements.
Individuals with medical issues or religious beliefs or practices that prevent them from getting the vaccine may request an exemption from the vaccine requirement.
EOS is committed to creating a diverse and inclusive work environment and is proud to be an equal opportunity employer. We invite you to consider opportunities at EOS regardless of your gender; gender identity; gender reassignment; age; religious or similar philosophical belief; race; national origin; political opinion; sexual orientation; disability; marital or civil partnership status or other non-merit factor.
Full Time
Business Services
10/06/2022
10/17/2022
TEANECK, NJ
50 - 100
2018
MARK USUI
$10M - $50M
Business Services