AI Infrastructure Engineer

Stealth Post-LLM Startup
Los Altos, CA | Full Time
POSTED ON 4/21/2026
AVAILABLE BEFORE 10/17/2026
The Mission

Dyssonance is reimagining how AI thinks. We are building an AI capable of memory, dynamic reasoning, and evolving beliefs. Founded by leaders from DeepMind and Google, we are a small, elite team solving the foundational challenges of cognition.


Product engineers need to move as fast as the model changes. Researchers need to test ten ideas a day instead of a few a week. We are building the infrastructure that makes both possible.


The Role


We need an engineer who builds the substrate that our product engineers and our researchers both stand on—the pipelines, sandboxes, and agent infrastructure that turn intent into shipped code and shipped experiments.


As our AI Infrastructure Engineer, you will own the systems that automate development and research. On the product side, that's the dev loops, CI, and agent-assisted pipelines that let us ship infrastructure and products to users. On the research side, that's the training substrate, eval harnesses, and experiment orchestration that keep our researchers in flow. The same primitives serve both, and you will design them that way.


What You Will Build


  • Research Infra: High-throughput orchestration for training runs, evals, and ablations on GPU fleets. Artifact tracking, deterministic replay, and cost attribution—so a researcher can launch a sweep in one command and trust the numbers that come back.
  • Product Development Infra: The CI, deploy, and preview-environment pipelines that let product engineers ship dozens of times a day. Agent-assisted code review, auto-generated tests, and the plumbing that lets coding agents open PRs against our repo safely.
  • Sandboxed Execution: Isolated, reproducible environments where agent-generated code can compile, run, and be evaluated at scale without torching the host or the budget. The same sandbox serves product agents and research agents.
  • The Agent Control Plane: APIs, queues, and observability for running thousands of concurrent agents—whether they're fixing bugs in the product, running experiments on the model, or somewhere in between. Traces, interventions, and replay for every step.
  • The Dev Substrate: Internal tooling that binds it together—secrets, datasets, cost dashboards, experiment registries. The command center for a lab that ships a product.
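To make the "Sandboxed Execution" bullet concrete, here is a deliberately toy, stdlib-only sketch of the contract such a runner needs: execute agent-generated code in a separate process, enforce a wall-clock timeout, and capture output instead of letting it touch the host session. Real isolation would use Firecracker or gVisor as noted in the stack; the function name and return shape are illustrative assumptions, not our actual API.

```python
# Toy sketch of a sandboxed-execution contract: run agent-generated code
# in a child interpreter with a timeout, capturing stdout/stderr.
# This is NOT real isolation (use Firecracker/gVisor for that); it only
# illustrates the interface an agent runner exposes.
import subprocess
import sys

def run_untrusted(code: str, timeout_s: float = 2.0) -> dict:
    """Execute a code string in a child process; never raise on failure."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, no site/user paths
            capture_output=True, text=True, timeout=timeout_s,
        )
        return {"ok": proc.returncode == 0,
                "stdout": proc.stdout, "stderr": proc.stderr}
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timeout"}

result = run_untrusted("print(2 + 2)")
print(result["ok"], result["stdout"].strip())  # → True 4
```

The key design point is that failures (non-zero exit, infinite loops) come back as data, not exceptions, so thousands of concurrent agent runs can be triaged without a human in the loop.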


The Stack


  • Orchestration: Python, Kubernetes.
  • Agents & Models: Anthropic SDK, OpenAI SDK, vLLM, in-house checkpoints.
  • Data & State: Postgres (with vector extensions), Redis, object storage for artifacts.
  • Infra: AWS/GCP, Docker, Firecracker / gVisor for sandboxing, Terraform.
  • Observability: OpenTelemetry, structured traces for every agent step.
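As a hypothetical illustration of "structured traces for every agent step," each step can emit one JSON record with stable keys so traces are queryable, attributable, and replayable. The field names below are assumptions for the sketch, not our actual schema; in practice these records would ride on OpenTelemetry spans.

```python
# Illustrative sketch: one structured JSON trace record per agent step.
# Field names are assumptions; a real system would attach these as
# OpenTelemetry span attributes rather than bare JSON lines.
import json
import time
import uuid

def trace_step(agent_id: str, tool: str, ok: bool) -> str:
    """Serialize one agent step as a stable, queryable JSON record."""
    record = {
        "trace_id": uuid.uuid4().hex,  # unique id for replay/correlation
        "ts": time.time(),             # wall-clock timestamp
        "agent_id": agent_id,
        "tool": tool,
        "ok": ok,
    }
    return json.dumps(record, sort_keys=True)

line = trace_step("agent-042", "code_review", True)
print(json.loads(line)["tool"])  # → code_review
```

Stable keys and a per-step id are what make interventions and deterministic replay possible downstream: you can grep a fleet's worth of steps, or re-drive a single trace, without parsing free-form logs.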


Who You Are


  • An Infra Native: You have built systems that run thousands of jobs a day without a human in the loop, and you know the difference between something that looks autonomous in a demo and something that stays up on a Saturday.
  • A Force Multiplier for Both Sides: You are as comfortable unblocking a product engineer who needs a faster preview environment as you are unblocking a researcher who needs a cleaner ablation pipeline. You don't pick sides between "ship the product" and "do the research"—your infra serves both.
  • Production-Grade: 3 years of engineering experience, with a track record of shipping systems that don't silently corrupt state. Clean interfaces, instrumentation, and correctness under concurrency are reflexes, not checklist items.
  • Agentic by Default: You already use coding agents in your daily workflow. You have a point of view on where they fail, and you want to fix it at the infra layer. You can look at projects like autoagent or autoresearch and immediately see what it would take to make those primitives production-grade for a frontier lab.
  • High Agency: You spot a bottleneck, you fix it. You don't wait for a ticket, and you don't wait for permission to delete the thing that isn't working.


Bonus Points
  • Built developer platforms, CI systems, or deploy pipelines at a company that shipped fast.
  • Built agent frameworks, eval harnesses, or experiment trackers (MLflow, Weights & Biases, LangGraph, Inspect, etc.).
  • Experience running large training or inference workloads on GPU clusters.
  • Comfort with sandboxing (Firecracker, gVisor, nsjail) and the security model of running untrusted code at scale.
  • Open-source contributions.


Requirements
  • 3 years of engineering experience.
  • Willing to commute to Los Altos.

Salary.com estimate for AI Infrastructure Engineer in Los Altos, CA: $140,151 to $180,532