4 Senior/Staff DevOps HPC Engineer Jobs in Salt Lake, UT

Senior/Staff DevOps HPC Engineer

Recursion

Salt Lake, UT | Full Time

$107k-134k (estimate)

6 Months Ago

Sr. DevOps Engineer

Dyno Nobel

Salt Lake, UT | Full Time

$104k-131k (estimate)

2 Months Ago

Senior DevOps Engineer

Consensus

Salt Lake, UT | Full Time

$104k-131k (estimate)

2 Months Ago

Mid-Level / Senior DevOps Engineer - Remote - USA

FullStack Labs

Salt Lake, UT | Full Time

$86k-110k (estimate)

7 Months Ago

Senior/Staff DevOps HPC Engineer

Recursion | Salt Lake, UT

$107k-134k (estimate)

Full Time | Durable Manufacturing | 6 Months Ago

Recursion is Hiring a Senior/Staff DevOps HPC Engineer Near Salt Lake, UT

The Impact You’ll Make

Recursion is revolutionizing the field of drug discovery by integrating Science and Machine Learning, and we are looking for a Senior/Staff DevOps HPC Engineer to join our pioneering team.

You will play a crucial role in developing and maintaining our HPC systems that power our cutting-edge drug discovery research. You will be responsible for designing, implementing, and managing the infrastructure that supports our machine learning and scientific computing workloads.

Your day-to-day tasks will include building robust and scalable infrastructure, deploying and managing HPC resources, and automating operational processes. You'll apply your deep understanding of DevOps principles and HPC systems to solve complex computational challenges. This means you'll be actively involved in executing high-level computational strategies, tracking crucial processing information, and ensuring high data integrity.

Furthermore, you will collaborate with a diverse team of scientists, machine learning experts, and other engineers to develop a world-class data platform that facilitates the generation and management of petabytes of data, enabling the rapid deployment of new deep learning models into the production data pipeline.

Your contributions will directly impact the efficiency and effectiveness of our drug discovery efforts. You can expect to work on multiple projects at the same time in a fast-paced and stimulating environment.

Your responsibilities will not be limited to maintaining systems and infrastructure; they will also include proactive troubleshooting, routine system maintenance, ensuring the security of our computing environment, and creating detailed documentation for all processes and procedures. Join us, and make a significant impact on the future of drug discovery.

In this role:

You’ll design, implement, maintain, and optimize our scientific compute, network, and data storage infrastructure and services using an Infrastructure as Code approach across both on-premises and public cloud environments.
Your technical expertise and leadership will drive innovation across all layers of the HPC/AI infrastructure, ensuring that we provide an effective, scalable platform to support our dynamic scientific workloads.
By developing scripts and workflows, you'll automate and verify infrastructure provisioning, dynamic reconfiguration, and various repetitive tasks, enhancing our support of the HPC environments. Your attention to detail will be critical in performance analysis, benchmarking, and tuning of our systems and applications.
Your troubleshooting skills will be invaluable as you resolve application, system, and other technical problems, alongside addressing user tickets swiftly.
Your role involves researching, deploying, and optimizing workloads and resource scheduling, security, and data lifecycle management policies.
You will regularly assess the health and operational performance of the platform against established metrics, with a view to meeting and improving the platform's operational service targets.
Lastly, as a lead in technical communication and collaboration with our customers, your efforts will ensure a high level of customer satisfaction. It's your opportunity to make a significant impact in our organization and the wider scientific community.

Location:

This position is based at our headquarters in Salt Lake City, Utah, or our office in Toronto, Canada; however, we will consider remote work for this position. We ask that remote employees commit to regular on-site visits for routine work and departmental events.

The Team You’ll Join

As a Senior/Staff DevOps HPC Engineer, you will be a part of our dedicated HPC Engineering team, reporting directly to the Associate Director. This dynamic team includes two experienced Senior Engineers, and with the addition of two new roles, including this position, you'll be part of an empowered, cross-functional unit.

Our HPC team works in a fast-paced, collaborative environment, handling a broad spectrum of computational projects. These range from developing advanced, scalable infrastructure to deploying and managing HPC resources and automating operational processes. The team also plays a crucial role in the curation of our vast data platform, which caters to a diverse set of professionals, including biologists, data scientists, and automation engineers.

The HPC team is constantly pushing the boundaries in the field of supercomputing in the TechBio industry. As part of this team, you will collaborate on projects that streamline and optimize our machine learning workflows and scientific computing tasks, driving efficient and transformative solutions within the company. This is a unique opportunity to join a team that thrives on innovation, collaboration, and inclusivity in a role that is pivotal to our mission.

The Experience You’ll Need

A minimum of 10 years of experience in dealing with HPC infrastructure, preferably in global BioPharma organizations.
Solid experience with software-defined infrastructure and cloud computing platforms such as Kubernetes, GCP, AWS, and others.
Extensive experience in designing, deploying, supporting, and troubleshooting in complex Linux-based computing environments.
In-depth hands-on experience with the provisioning, configuration, and management of infrastructure through Infrastructure as Code (IaC) and cloud automation principles.
Python programming and bash scripting experience.
Proficiency with source control, continuous integration, configuration management, monitoring, and systems tools.
Practical knowledge of resource management and job scheduling using Slurm and Kubernetes.
Experience with RDMA-capable high-speed networking.
Familiarity with parallel file systems and multi-tier file and object storage.
Proficiency in container technology, including Apptainer and Docker.
Experience in building, installing, and supporting user-requested software.
Strong verbal and written skills for effective communication and documentation.
Prior experience mentoring, guiding, and cross-training team members.

How You’ll be Supported

The onboarding process will include peer knowledge transfer sessions, introductions to key stakeholders, and comprehensive exposure to our company culture and processes.
You'll have the chance to learn from your colleagues during our regular lunch & learn and tech talk sessions.
We offer the opportunity to attend courses for certification in new skills or technologies relevant to your role.
If you're keen to hone your leadership skills, you'll have the option to participate in our coaching sessions like BetterUp.
To ensure you're always at the forefront of your field, we offer the opportunity to attend conferences.

At Recursion, we believe that every employee should be compensated fairly. Based on the skill and level of experience required for this role, the estimated current annual base range for this role is:

Developing: $160,000
Skilled: $169,000
Expert: $182,000

To learn more about our level within levels, click here.

You will also be eligible for bonuses and equity compensation, as well as our comprehensive benefits package for United States-based candidates. The range displayed on each job posting reflects target ranges for US new hire salaries and is determined by job, level, and market factors.

During the interview selection process, you will connect with a Talent Acquisition Partner who will be your advocate and ally to ensure you receive the appropriate compensation that meets your needs for your skills, experience, and relevant education/training, while also reviewing our very competitive total rewards package.

Job Summary

JOB TYPE: Full Time
INDUSTRY: Durable Manufacturing
SALARY: $107k-134k (estimate)
POST DATE: 11/11/2023
EXPIRATION DATE: 07/02/2024
WEBSITE: recursionpharma.com
HEADQUARTERS: SALT LAKE CITY, UT
SIZE: 200 - 500
FOUNDED: 2013
TYPE: Public
CEO: CHRISTOPHER GIBSON
REVENUE: $10M - $50M

About Recursion

Recursion is a Utah-based biotechnology company that discovers and commercializes drugs for the treatment of genetic, inflammatory, and infectious diseases.