Staff MLOps Engineer - RL Infrastructure

Apptronik
Austin, TX Full Time
POSTED ON 10/19/2025 CLOSED ON 12/18/2025

Job Details

Apptronik is building robots for the real world to improve human quality of life and to help solve the ever-increasing labor shortage problem. Our team has been building some of the most advanced robots on the planet for years, dating back to the DARPA Robotics Challenge. We apply our expertise across the full robotics stack to some of the most important and impactful problems our society faces, and expect our products and technology to change the world for the better. We value passion, creativity, and collaboration to help us overcome existing technological barriers in the industry to create truly innovative products.

You will join a team developing state-of-the-art general-purpose robots that operate in human spaces and use human tools. They are designed to work alongside humans, move through human spaces, and manipulate the world around them.

JOB SUMMARY

We are seeking an experienced MLOps Engineer to own and maintain our cutting-edge reinforcement learning (RL) training infrastructure. In this role, you will be responsible for the entire lifecycle of our RL systems, from managing cloud resources to optimizing job submission and deployment. You will work closely with our AI researchers to ensure they have a stable, efficient, and scalable platform to develop and train next-generation RL models.

ESSENTIAL DUTIES AND RESPONSIBILITIES
  • Design, Deploy, and Maintain Infrastructure: Manage and scale our RL training clusters on major cloud platforms (e.g., Google Cloud Platform, AWS, Azure) using infrastructure-as-code principles.
  • Orchestration and Deployment: Utilize container orchestration tools (e.g., Kubernetes, Docker Swarm) to manage the deployment and scaling of our applications and clusters.
  • Job Scheduling and Execution: Implement and manage tooling for submitting and monitoring large-scale distributed training jobs using modern distributed computing frameworks (e.g., Ray, Slurm).
  • Database and Storage Management: Oversee our cloud-native database solutions, ensuring efficient storage and retrieval of large datasets, including images.
  • Developer Tools: Create SDKs, documentation, and CLI/GUI tooling that make it easy for researchers to launch experiments, visualize results, and debug issues without infrastructure expertise.
  • System Optimization: Implement robust monitoring, logging, and alerting to ensure the reliability, performance, and health of the training infrastructure.
  • CI/CD and Automation: Develop and maintain CI/CD pipelines for automated testing, data processing, benchmarking, and model experimentation.
  • Cross-functional Collaboration: Work closely with AI researchers and robotics engineers to understand pain points, optimize training workflows, and develop solutions that accelerate development cycles.

SKILLS AND REQUIREMENTS
  • Strong software engineering fundamentals (testing, code review, documentation, git) and proven experience in a backend or infrastructure role.
  • Professional experience managing cloud infrastructure on a major cloud platform (e.g., Google Cloud Platform, AWS, Azure).
  • Hands-on experience with infrastructure-as-code tools (e.g., Terraform, Ansible, CloudFormation).
  • Familiarity with ML frameworks (PyTorch, TensorFlow) and understanding of model training workflows.
  • Proficiency with containerization and orchestration technologies (e.g., Kubernetes, Docker).
  • Understanding of distributed computing concepts and cluster management for compute-intensive workloads.
  • Solid understanding of Python and experience with scripting for automation and tooling.

Bonus Qualifications:
  • Experience with job scheduling and distributed computing frameworks like Ray, Slurm, or LSF.
  • Experience with hyperparameter tuning frameworks (e.g., Hydra) and their integration with robotics simulation platforms like IsaacLab.
  • Experience managing and optimizing cloud-native databases (e.g., Google Cloud SQL, Amazon RDS, Spanner) for large-scale data.
  • Experience in a high-performance computing (HPC) environment, especially with GPU-accelerated workloads.
  • Experience with robotics simulation tools (IsaacSim, MuJoCo, Gazebo) or game engines.
  • A strong understanding of networking and security principles within a cloud environment.

EDUCATION and/or EXPERIENCE
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
  • Minimum of 4 years of professional, full-time experience building and maintaining reliable, scalable systems.
  • Exposure to ML/data engineering infrastructure.
  • Experience building tools or platforms used by other developers and researchers.

PHYSICAL REQUIREMENTS
  • Prolonged periods of sitting at a desk and working on a computer
  • Must be able to lift 15 pounds at times
  • Vision to read printed materials and a computer screen
  • Hearing and speech to communicate

*This is a direct hire. Please, no outside Agency solicitations.

Apptronik provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Salary.com Estimation for Staff MLOps Engineer - RL Infrastructure in Austin, TX
$85,689 to $107,261