Job Description - Plaud Inc.
Speech Evaluation Engineer (Speech LLM)
Company: Plaud Inc.
Location: San Francisco, CA
Type: Hybrid (minimum three days in-office per week)
Department: Machine Learning & Artificial Intelligence
Job Overview
Plaud is seeking a candidate to turn ambiguous concepts like voice naturalness and cadence into clear, automated metrics. You will partner with ML researchers to define benchmarks for Speech LLMs, build scalable data pipelines, and own dashboards that track model health and performance.
Key Responsibilities
- Define and automate metrics for subjective concepts such as naturalness, expressiveness, and conversational cadence.
- Build reliable distributed systems and data pipelines that run at scale against live model checkpoints.
- Partner with ML researchers to translate Speech LLM capabilities (e.g., ASR robustness, TTS emotional steerability) into measurable benchmarks.
- Develop and own dashboards to track model health during training, improve signal-to-noise ratios, and reduce evaluation latency.
- Debug anomalous mid-training results to identify root causes (architecture, data, or infrastructure).
- Communicate complex statistical results and model behaviors to technical and non-technical stakeholders.
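Much of the dashboard and signal-to-noise work described above reduces to statistical checks over per-utterance evaluation scores. A minimal sketch of one such check, a paired bootstrap comparing two model checkpoints (function name and structure are illustrative, not Plaud's actual tooling):

```python
import random

def paired_bootstrap_win_rate(scores_a, scores_b, n_resamples=2000, seed=0):
    """Estimate how often checkpoint A outscores checkpoint B when the
    same evaluation set is resampled with replacement. Values near 1.0
    or 0.0 suggest a real difference; values near 0.5 suggest noise.
    Illustrative helper only."""
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        # Resample utterance indices; keep A/B paired on the same items.
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins += 1
    return wins / n_resamples
```

Pairing the resampled indices across both models removes between-utterance variance from the comparison, which is one common way to improve the signal-to-noise ratio of checkpoint-over-checkpoint deltas.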
Required Qualifications
- Engineering Skills: Strong software engineering skills, particularly in Python, with experience in distributed systems and evaluation harnesses.
- ML Collaboration: Ability to deeply partner with researchers to define "good" performance for AI models.
- Observability: Experience building trusted tracking dashboards (e.g., Weights & Biases, MLflow).
- Communication: Ability to clearly articulate complex statistical results.
Preferred Qualifications
- Speech Metrics: Familiarity with WER, CER, PESQ, and automated MOS scoring frameworks.
- LLM-as-a-Judge: Experience using frontier or fine-tuned multi-modal LLMs to evaluate conversational logic, transcription accuracy, and audio quality.
- Human Evaluation: Background in managing large-scale crowdsourcing for RLHF/DPO efforts.
- Adversarial Datasets: Experience curating datasets to test edge cases (heavy accents, overlapping speech, noisy environments).
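As context for the speech metrics named above, word error rate (WER) is the word-level Levenshtein distance between a reference transcript and a hypothesis, normalized by the reference length. A minimal self-contained sketch (production evaluation stacks typically use an established library rather than hand-rolled dynamic programming):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference words.
    Minimal illustrative implementation using an edit-distance DP table."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(substitution, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

Character error rate (CER) follows the same recurrence over characters instead of words; PESQ and MOS-style scores, by contrast, rate audio quality rather than transcription accuracy.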
Compensation & Benefits
- Salary: $180,000 - $270,000 base salary, plus performance bonus and equity.
- Healthcare: Top-tier healthcare for employees and dependents, including dental and vision.
- Retirement: 401(k) with company matching.
- Time Off: Unlimited PTO plus 13 paid holidays.
- Parental Leave: 12 weeks of paid leave for all new parents.
- Equipment: Choice of top-of-the-line laptops/workstations.
- Perks: Annual offsites and fully stocked office.