
Senior Software Engineer, ML Infrastructure

Decagon
San Francisco, CA | Full Time
POSTED ON 4/18/2026
AVAILABLE BEFORE 5/26/2026
About Decagon

Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.

Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.

We’re building a future where customer experiences are redefined, moving from support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, among others.

We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.

About The Team

The ML Infrastructure team builds the systems that power every stage of Decagon's model lifecycle. We own the platforms for model training, the infrastructure for model evaluation and experimentation, and the routing layer that manages inference across multiple providers.

We work at the intersection of research and production: translating cutting-edge ML models into reliable, scalable systems that run in customer environments. We collaborate closely with Research, Infrastructure, and Product teams to ensure models train efficiently, serve reliably, and deliver exceptional user experiences.

The team values technical rigor, pragmatic decision-making, and building systems that others love to use.

About The Role

We're hiring a Senior ML Infrastructure Engineer to own the platforms powering Decagon's model training and inference. You'll build distributed training systems, design inference architecture across multiple providers, and create the frameworks that let our Research and Product teams ship faster.

This role is for someone who thrives on technical depth, can lead multi-quarter initiatives, and wants to shape the long-term architecture of our ML stack.

In this role, you will

  • Design and build distributed training platforms for LLM and multimodal fine-tuning and post-training at scale
  • Integrate state-of-the-art training algorithms into production pipelines
  • Own inference architecture and multi-provider routing, including failover and optimization
  • Lead initiatives to improve latency and cost efficiency across the training and serving stack
  • Build evaluation and experimentation infrastructure that enables rapid, reliable iteration
  • Drive technical direction, mentor engineers, and establish best practices for ML infrastructure

Your background looks something like this

  • 6 years of experience building ML infrastructure or production systems at scale
  • Deep experience with distributed training: multi-node GPU clusters, fault tolerance, and optimization
  • Strong understanding of LLM inference: latency optimization, provider tradeoffs, and serving architecture
  • Proven track record leading complex, multi-quarter technical projects

Benefits

  • Medical, dental, and vision benefits
  • Take-what-you-need vacation policy
  • Daily lunches, dinners, and snacks in the office to keep you at your best

Compensation

$250K – $330K, offers equity

