
Founding Machine Learning Engineer - Evaluation

Established Search
Sunnyvale, CA Full Time
POSTED ON 6/10/2026
AVAILABLE BEFORE 8/9/2026

Senior ML Engineer, Medical Imaging Evaluation & AI Reliability


About the Role:


My client is building evaluation and evidence infrastructure for safety-critical AI systems, starting with diagnostic medical imaging.


AI systems are increasingly used in settings where their outputs affect clinical decisions and patient outcomes. In medical imaging, benchmark accuracy alone is not enough. Hospitals, regulators, and clinical stakeholders need evidence that models will behave reliably across real-world deployment environments, populations, scanners, and workflows.


This role sits at the intersection of:


  • medical imaging AI,
  • model robustness and evaluation,
  • regulatory evidence generation,
  • and real-world deployment behavior.


The work is highly investigative and requires strong technical judgment, scientific reasoning, and the ability to operate effectively in ambiguous environments.


The Role


This is not a traditional “train models on benchmark datasets” ML role.

You will work directly with medical imaging companies and healthcare stakeholders to investigate how AI systems behave in practice and what evidence is required for deployment, regulatory, and clinical decisions.


You will:


  • Design and execute evaluations for medical imaging AI systems
  • Investigate model failure modes, robustness, and generalization gaps
  • Analyze behavior across populations, scanners, imaging protocols, and clinical settings
  • Determine what evidence is sufficient for stakeholders making deployment or regulatory decisions
  • Translate technical findings into actionable recommendations for customers and clinical stakeholders
  • Build reusable evaluation pipelines, evidence schemas, and model assessment frameworks
  • Work with messy, incomplete, and noisy real-world clinical data
  • Help shape how evaluation investigations are conducted across the organization


The important work is not simply running experiments. It is identifying what questions actually matter, what evidence is missing, and how to generate defensible conclusions under real-world constraints.


Required Qualifications:


  • Strong experience in machine learning for medical imaging (radiology, pathology, cardiology imaging, or related domains)
  • Experience evaluating or validating real-world ML systems, not just training models
  • Deep understanding of:
      • model robustness,
      • distribution shift,
      • uncertainty,
      • failure analysis,
      • and real-world deployment behavior
  • Strong Python skills across the full investigation workflow:
      • data analysis,
      • experimentation,
      • evaluation,
      • and reporting
  • Experience working with noisy or imperfect clinical datasets
  • Ability to communicate technical findings clearly to both technical and non-technical stakeholders
  • High tolerance for ambiguity and open-ended investigative work


Strongly Preferred:


  • Experience with FDA-regulated AI/ML systems or medical device submissions (510(k), De Novo, SaMD, etc.)
  • Experience with medical imaging deployment evaluation or clinical validation
  • Experience with interpretability, post-deployment monitoring, uncertainty estimation, or model auditing
  • Experience designing reproducible evaluation frameworks or benchmarking systems
  • Background in healthcare AI or other safety-critical ML domains
  • Customer-facing or cross-functional technical leadership experience
  • PhD or equivalent research depth in ML, medical imaging, computer vision, or related areas

Ideal Candidate Profile


Candidates who tend to succeed in this role often come from backgrounds such as:


  • Medical imaging ML research
  • FDA or healthcare AI evaluation
  • Clinical AI validation
  • AI robustness and reliability research
  • Applied ML investigation in safety-critical environments
  • Healthcare-focused computer vision research


What Success Looks Like:


The strongest people in this role become experts in how medical AI systems behave in the real world.

They develop the judgment to answer questions such as:

  • Where are the model’s true weaknesses?
  • Which deployment conditions introduce risk?
  • What concerns are real versus theoretical?
  • What evidence is sufficient for a hospital or regulator to trust the system?
  • What additional validation is required before deployment proceeds?

Salary.com Estimation for Founding Machine Learning Engineer - Evaluation in Sunnyvale, CA
$98,330 to $128,694


