What are the responsibilities and job description for the Senior/Staff ML Performance Engineer, Low-Precision Training & Model Quantization position at Nuro?
Who We Are
Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automotive-grade hardware. Nuro licenses its core technology, the Nuro Driver™, to support a wide range of applications, from robotaxis and commercial fleets to personally owned vehicles. With technology proven over years of self-driving deployments, Nuro gives automakers and mobility platforms a clear path to AVs at commercial scale, empowering a safer, richer, and more connected future.
About The Role
Nuro is seeking an experienced ML Performance Engineer with deep expertise in quantized training to join our ML Infrastructure team. In this role, you will drive the adoption of state-of-the-art quantization techniques, enabling training and deployment of highly efficient models that power the Nuro Driver™. You will help shape the technical strategy and partner closely with research and product groups to ensure our ML infrastructure is optimized for both cutting-edge research and real-time deployment on autonomous vehicles.
About The Work
As an ML Performance Engineer for Nuro's ML Training Infrastructure, you will improve model training efficiency and drive the adoption of state-of-the-art quantization and low-precision training techniques. This will include:
- Staying ahead of emerging research and evaluating new methods.
- Implementing quantized training methods (e.g., AWQ, AQT, GPTQ) for new and existing self-driving models.
- Leading the design and implementation of efficiency initiatives for model training, including low-bit quantization and pruning, for both research and production workloads.
- Championing and implementing tools and approaches to pinpoint root causes of model quality and accuracy regressions when training at lower precisions.
- Collaborating cross-functionally with research, infrastructure, and product teams to balance accuracy, latency, and resource constraints.
About You
- 3 years of professional or research experience in ML infrastructure, distributed training, or ML systems engineering.
- Hands-on experience with quantization methods, including Activation-Aware Weight Quantization (AWQ), Accurate Quantized Training (AQT), FP8 training, or related methods.
- Experience building or maintaining quantization libraries (e.g., AQT, bitsandbytes, NVIDIA Transformer Engine, DeepSpeed Compression).
- Understanding of calibration and scaling strategies for quantized models to minimize accuracy loss.
- Advanced degree (Ph.D. or strong M.Sc. with research experience) in Computer Science, Electrical Engineering, or related fields.
- Knowledge of sparse networks and complementary model compression techniques (e.g., AdaRound, BRECQ, structured pruning).
- Published work or open-source contributions in quantization methods (e.g., AWQ, AQT, GPTQ, SmoothQuant, ZeroQuant).
At Nuro, we celebrate differences and are committed to a diverse workplace that fosters inclusion and psychological safety for all employees. Nuro is proud to be an equal opportunity employer and expressly prohibits any form of workplace discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other legally protected characteristics.
Salary: $193,930 - $352,290