
Machine Learning Engineer, Model Evaluations (Speech LLM) - San Francisco

ChatGPT Jobs
San Francisco, CA Full Time
POSTED ON 5/13/2026
AVAILABLE BEFORE 6/11/2026
Job Description

Speech Evaluation Engineer (Speech LLM)

Company: Plaud Inc.

Location: San Francisco, CA

Type: On-site, hybrid (minimum 3 days in office per week)

Machine Learning & Artificial Intelligence

Job Overview

Plaud is seeking a candidate to turn ambiguous concepts like voice naturalness and cadence into clear, automated metrics. You will partner with ML researchers to define benchmarks for Speech LLMs, build scalable data pipelines, and own dashboards that track model health and performance.

Key Responsibilities

  • Define and automate metrics for subjective concepts such as naturalness, expressiveness, and conversational cadence.
  • Build reliable distributed systems and data pipelines that run at scale against live model checkpoints.
  • Partner with ML researchers to translate Speech LLM capabilities (e.g., ASR robustness, TTS emotional steerability) into measurable benchmarks.
  • Develop and own dashboards to track model health during training, improve signal-to-noise ratios, and reduce evaluation latency.
  • Debug anomalous mid-training results to identify root causes (architecture, data, or infrastructure).
  • Communicate complex statistical results and model behaviors to technical and non-technical stakeholders.

Required Qualifications

  • Engineering Skills: Strong software engineering skills, particularly in Python, with experience in distributed systems and evaluation harnesses.
  • ML Collaboration: Ability to deeply partner with researchers to define "good" performance for AI models.
  • Observability: Experience building trusted tracking dashboards (e.g., Weights & Biases, MLflow).
  • Communication: Ability to clearly articulate complex statistical results.

Preferred Qualifications

  • Speech Metrics: Familiarity with WER, CER, PESQ, and automated MOS scoring frameworks.
  • LLM-as-a-Judge: Experience using frontier or fine-tuned multi-modal LLMs to evaluate conversational logic, transcription accuracy, and audio quality.
  • Human Evaluation: Background in managing large-scale crowdsourcing for RLHF/DPO efforts.
  • Adversarial Datasets: Experience curating datasets to test edge cases (heavy accents, overlapping speech, noisy environments).
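
For readers less familiar with the speech metrics named above: WER and CER are both edit-distance ratios, computed over words and characters respectively. The following is a minimal sketch only, assuming whitespace tokenization and no text normalization (casing, punctuation). The function names are illustrative and are not taken from the posting or from any Plaud codebase.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

In practice an evaluation harness would normalize text before scoring and would likely rely on an established library rather than a hand-rolled implementation; the sketch is only meant to show what the two ratios measure.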

Compensation & Benefits

  • Salary: $180,000 - $270,000 base salary, plus performance bonus and equity.
  • Healthcare: Top-tier healthcare for employees and dependents, including dental and vision.
  • Retirement: 401(k) with company matching.
  • Time Off: Unlimited PTO plus 13 paid holidays.
  • Parental Leave: 12 weeks of paid leave for all new parents.
  • Equipment: Choice of top-of-the-line laptops/workstations.
  • Perks: Annual offsites and a fully stocked office.




