What are the responsibilities and job description for the Linux system administration - HPC environments position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, ITBrainiac Inc, is seeking the following. Apply via Dice today!
Hybrid role. Urgent client need: Linux system administration - HPC environments
Linux system administration - HPC environments
Kalamazoo, MI 49007 (Hybrid)
12 Months
Mandatory Skill Set:
Experience in Linux system administration, preferably in HPC environments.
Strong expertise with the Slurm workload manager.
Proficiency in Bash, Python, or other scripting languages.
Familiarity with parallel file systems and high-speed networking (e.g., InfiniBand).
Experience with configuration management tools (e.g., Ansible, Puppet).
Detailed Job Description:
Zoetis is seeking a skilled HPC Slurm Administrator to manage and support high-performance computing (HPC) environments. The ideal candidate will have hands-on experience with the Slurm workload manager and Linux system administration, and will play a key role in maintaining, optimizing, and scaling HPC infrastructure.
Key Responsibilities:
Administer and maintain HPC clusters using Slurm.
Monitor system performance and ensure high availability and reliability.
Troubleshoot and resolve issues related to job scheduling, compute nodes, and storage.
Manage user accounts, permissions, and security policies.
Automate administrative tasks using scripting languages (e.g., Bash, Python).
Collaborate with engineering and research teams to support compute-intensive workloads.
Document system configurations, procedures, and operational changes.
Participate in upgrades, patching, and scaling of HPC infrastructure.