What are the responsibilities and job description for the ML Infrastructure Engineer - RL Systems position at Differential?
ML Infrastructure Engineer (RL Systems)
Move from maintaining a piece of the machine to defining the infrastructure that powers frontier AI research.
We're working with a frontier AI company building the infrastructure behind large-scale reinforcement learning research.
This role sits at the intersection of ML systems, distributed infrastructure and researcher enablement. You'll work closely with research teams to build the tooling, platforms and infrastructure required to train, evaluate and scale advanced AI models.
Why This Role?
- Direct ownership of critical infrastructure rather than a small component within a much larger organisation.
- Work closely with researchers and influence how experiments are run, scaled and evaluated.
- Help shape the next generation of RL systems, tooling and infrastructure.
- Fast feedback loops and visible impact on research velocity and model performance.
- Solve genuinely hard distributed systems and ML infrastructure challenges at the frontier of AI.
Ideal Background
- ML Infrastructure / ML Systems Engineering
- Distributed Systems
- Training or Inference Infrastructure
- RL or Post-Training Systems
- Researcher Tooling
- Large-scale GPU workloads
Technologies
PyTorch, DeepSpeed, FSDP, Ray, vLLM, CUDA, Triton, Kubernetes, distributed training and serving systems.
Salary: $350,000 - $500,000