
Staff / Principal Machine Learning Engineer, Serving

Inworld AI
Mountain View, CA
Full Time
POSTED ON 4/9/2026
AVAILABLE BEFORE 5/7/2026

About Inworld


Inworld is a product-oriented research lab of top AI researchers and engineers, developing best-in-class realtime multimodal models and the only realtime orchestration platform optimized for thousands of queries per second.


We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.


Who We're Looking For


A year ago, reliably working agentic systems and sub-second multimodal inference at scale barely existed. Nobody has a decade of experience here. So we're not screening for a resume template — we're looking for strong people from varied backgrounds who learn fast, thrive in ambiguity, and can show us what they've built, broken, and understood.


Experience We Find Useful


You don't need all of this. But you need enough to make a case.


  • Inference Optimization. Deep understanding of modern serving frameworks and techniques like vLLM or TRT-LLM.
  • Model Acceleration. Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
  • High-Performance Systems. Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs.
  • Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.
  • Public work. Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
  • Full-cycle ownership. You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.
  • Background. PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.


Who Thrives Here


  • You don’t need a roadmap to start walking; you’re comfortable picking a direction and building the map as you go.
  • You believe engineering isn't finished until it’s shipped and stable. You have a bias for impact over purely theoretical optimizations.
  • You don't just ship code; you obsess over the why. You’re the first to question an architecture if you think there’s a better way to solve the core latency or throughput problem.
  • You aren't satisfied with "the PM said so." You thrive on deep context and want to understand the fundamental logic behind every decision we make.


What Working Here Is Like


We hand you unclear problems and expect you to make them clear. We value engineers who say "I don't know yet" and then design the benchmark or prototype that finds out. We treat performance, latency, and reliability as first-class product features, not a box to check before launch. Impact comes before everything else, though we support sharing work and open-source contributions that move the field forward. Your work should be visible. Flat structure, fast iterations, minimal process theater.


We believe in the power of in-person collaboration to solve the hardest problems and foster a strong team culture. We offer relocation assistance and look forward to you joining us in our Mountain View office.


The base salary range for this full-time position is $270,000 - $500,000 + bonus + equity + benefits.
