Member of Technical Staff, AI Engineering

Mandolin
San Francisco, CA · Full Time
Posted on 4/8/2026
Available before 5/6/2026
About Mandolin

Nearly every disease will become treatable in our lifetimes. Mandolin is laying the clinical and financial infrastructure to get groundbreaking treatments to patients faster, powered by AI agents.

Mandolin partners closely with the largest healthcare institutions in the US, covering more than $10B in drug spend across the country. We're backed by Greylock, SV Angel, Maverick, SignalFire, and the founders of Vercel, Decagon, and Yahoo.

Why we need you

Our copilots handle ~80% of clinic back-office workload today. Reaching 99% means capturing the long-tail edge cases that bury teams in rework, pushing the latest foundation models to their accuracy and cost limits, and proving every improvement with airtight, regulator-ready evaluation.

We need an AI Engineer who has already built this bridge from "impressive demo" to "lights-out production." You will own the systems that serve, monitor, and improve our models in the field—turning Mandolin from a helpful copilot into a true autopilot where work closes itself and clinicians only handle the exceptions.

What You’ll Do

  • Model serving & inference. Deploy and operate LLMs and VLMs for real-time inference using vLLM, SGLang, or equivalent runtimes. Tune KV caching, batching strategies, and speculative decoding to hit latency and cost targets.
  • ML pipeline ownership. Build and maintain HIPAA-compliant pipelines—data capture → training runs → inference → human-in-the-loop feedback—end to end.
  • Evaluation & telemetry. Design evaluation harnesses and telemetry systems that surface model degradation, edge-case failures, and business impact before and after every deploy. Use customer and model telemetry to close the feedback loop continuously.
  • Performance debugging. Diagnose and resolve ML workflow bottlenecks—identifying whether issues are IO-bound, memory-bound, or compute-bound across GPU, CPU, and serverless footprints.
  • Infrastructure & reliability. Work with distributed systems (Kubernetes, queues, workers, load balancing) to keep inference services fault-tolerant, scalable, and observable.
  • Model strategy. Select and integrate SOTA models for vision, language, document parsing, and OCR. Apply fine-tuning, RAG, or quantization where ROI justifies it.
  • Product integration. Pair with forward-deployed engineers to turn field discoveries into new datasets, metrics, and rapid model iterations.
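The serving bullet above hinges on a tradeoff worth making concrete: continuous batching, the scheduling idea behind runtimes like vLLM and SGLang, admits new requests into the running batch as others finish, so total decode steps shrink toward the longest request rather than the sum of all requests. The sketch below is a dependency-free toy model of that idea, not any runtime's actual scheduler; the request lengths are made up.

```python
# Toy model of decode scheduling: each request needs `tokens` decode steps.
# Sequential serving runs one request at a time; continuous batching lets up
# to `max_batch` requests share each step and refills a slot the moment a
# request finishes.

def sequential_steps(token_counts):
    """Decode steps if requests are served one after another."""
    return sum(token_counts)

def batched_steps(token_counts, max_batch=8):
    """Decode steps with continuous batching."""
    remaining = sorted(token_counts, reverse=True)
    active, steps = [], 0
    while remaining or active:
        # Admit waiting requests into free batch slots.
        while remaining and len(active) < max_batch:
            active.append(remaining.pop())
        steps += 1
        # Each active request decodes one token; finished ones drop out.
        active = [t - 1 for t in active if t > 1]
    return steps

requests = [120, 80, 80, 40, 40, 20, 20, 10]
print(sequential_steps(requests))  # 410: cost scales with the sum
print(batched_steps(requests))     # 120: cost scales with the longest request
```

In the toy model, batching collapses cost from the sum of request lengths to roughly the longest one; in real serving the same per-step cost assumption breaks down as batches grow, which is why batch size, KV-cache budget, and latency targets get tuned together.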

Must-have Experience

  • Production experience deploying and serving LLMs or VLMs—familiar with inference runtimes (vLLM, SGLang, or similar), KV caching, and speculative decoding.
  • White-box understanding of transformer-based models: tokenization (image and text), autoregressive generation, temperature scaling, and sampling techniques.
  • Hands-on experience with document parsing, OCR models, or structured data extraction from unstructured inputs.
  • Ability to debug ML system bottlenecks and reason clearly about IO vs. memory vs. compute tradeoffs.
  • Experience with distributed systems fundamentals—Kubernetes, message queues, workers, load balancing—sufficient to own a production inference stack.
  • Track record building telemetry and evaluation frameworks: using real customer data to measure model performance and using model-level signals to debug edge cases.
  • Proficiency in Python and comfort working across the ML stack, from data pipelines to serving infrastructure.
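The telemetry-and-evaluation requirement above can be sketched in a few lines. Everything here is illustrative: the NDC-extraction task, the `baseline` model, and the `gate_deploy` threshold are made-up stand-ins; a real harness would call the deployed inference endpoint and feed failures back into telemetry.

```python
# Minimal evaluation-harness sketch (stdlib only).

def evaluate(predict, labeled_examples):
    """Exact-match accuracy plus the concrete failures, for edge-case triage."""
    failures = [(x, predict(x), y) for x, y in labeled_examples if predict(x) != y]
    accuracy = 1 - len(failures) / len(labeled_examples)
    return accuracy, failures

def gate_deploy(old_accuracy, new_accuracy, min_gain=0.0):
    """Block any deploy that regresses accuracy on the frozen eval set."""
    return new_accuracy >= old_accuracy + min_gain

examples = [
    ("NDC 0002-1433", "0002-1433"),
    ("ndc: 0002-1433", "0002-1433"),
    ("NDC0002-1433", "0002-1433"),  # long-tail formatting edge case
]
baseline = lambda text: text.split()[-1]
accuracy, failures = evaluate(baseline, examples)
print(round(accuracy, 2), len(failures))  # 0.67 1
```

Returning the failures themselves, not just the score, is the point: the long tail gets fixed by inspecting concrete misses, and the deploy gate turns the eval set into a regression test.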

Nice-to-haves

  • Healthcare, claims processing, or complex form-extraction experience.
  • Familiarity with fine-tuning techniques (LoRA/PEFT) or retrieval-augmented generation (RAG).
  • Experience on cloud ML stacks—Vertex AI, AWS SageMaker, or Kubernetes-native ML workflows.
  • Open-source contributions, peer-reviewed research, or public technical writing.
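For the RAG item above, the retrieval step can be sketched with nothing but the standard library, using bag-of-words cosine similarity as a stand-in for a real embedding index; the passages and query are invented for illustration.

```python
from collections import Counter
import math

def embed(text):
    """Stand-in for a real embedding model: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

def build_prompt(query, passages, k=2):
    """Prepend retrieved passages as grounding context for the generator."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Swapping `embed` for a learned embedding model and the `sorted` call for an approximate-nearest-neighbor index is the whole distance between this sketch and a production retriever.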

Salary.com estimate for Member of Technical Staff, AI Engineering in San Francisco, CA: $153,227 to $196,880.
