What are the responsibilities and job description for the Machine Learning Engineer, Speech LLM Training position at Jobright.ai?
This role is part of Jobright TNT, the private hiring network connecting top talent with top AI startups like Perplexity, Mercor, Cresta, Suno, and 150 more.
This is not a mass job posting. Only select, high-signal candidates are invited to Jobright TNT and recommended directly to hiring teams.
Hiring Company: Plaud
One-liner: Plaud Inc. is a hardware and software company making AI-powered voice recorders for note-taking, with over 1M users globally.
Salary: $200K/yr - $540K/yr
Qualifications
Required
• Have a proven track record of building and training large-scale audio or speech models from the ground up, whether that involves unified SpeechLLMs, advanced ASR, expressive TTS, or generative audio architectures
• Love living at the intersection of research and engineering, eager to design novel sequence modeling architectures one day and debug distributed training clusters the next
• Are highly comfortable traversing the entire stack—from fundamental signal processing and raw acoustic representations to massive foundation model training and edge-device optimization
• Possess deep expertise in PyTorch or JAX, with battle scars from optimizing large-scale distributed training runs, managing GPU memory utilization, and resolving complex performance bottlenecks
• Thrive in a fast-paced, high-growth startup environment where you are expected to take extreme ownership of ambiguous problems and drive them directly into production
• Are obsessed with building AI systems that natively understand and generate speech, ultimately creating a hardware-software AI companion that amplifies human productivity
Preferred
• Text-based LLMs: Hands-on experience with core text-based Large Language Model pretraining, instruction tuning, or RLHF
• Neural Audio Codecs: Hands-on experience designing and training state-of-the-art neural audio codecs for streamable, high-fidelity audio
• Generative Architectures: Designing and training diffusion models, flow matching, or autoregressive architectures specifically for speech and voice generation
• Alignment & Steerability: Applying Reinforcement Learning (RL) techniques (like RLHF or GRPO) to improve conversational cadence, steerability, and alignment in foundation models
• Deep System Optimization: End-to-end inference and performance optimization, leveraging high-throughput serving frameworks (e.g., vLLM, TensorRT-LLM, SGLang) to minimize latency for real-time cloud streaming
• Large-Scale Infrastructure: Managing massive GPU clusters, utilizing advanced distributed training frameworks (e.g., FSDP, DeepSpeed), and navigating orchestration tools like Kubernetes
How can I join Jobright TNT?
If this is your first time applying to a Jobright TNT role, the process works as follows:
1. Apply to your first Jobright TNT role
2. We review your background to determine if you meet the TNT quality bar
3. If qualified, your application is directly recommended to the employer
4. Once accepted into TNT, you may be:
- Invited to apply for other exclusive TNT-only roles
- Invited to private, invite-only hiring events with top startups
You will be notified of your TNT selection result.
PS: All Jobright TNT roles are 100% real, filled directly by the top AI startups we partner with, and come with priority review and higher response rates than the normal application queue.