
AI Inference Engineer - Speech

Zoom
San Jose, CA Full Time
POSTED ON 4/29/2026
AVAILABLE BEFORE 5/28/2026
What You Can Expect

We are looking for an AI Inference Engineer with a solid background in speech recognition and model inference. In this role, you will develop state-of-the-art automatic speech recognition systems and ship them to various Zoom products. You will work on cutting-edge speech modeling and inference technologies with world-class speech scientists. This role includes collaboration with cross-functional teams, including product, science, engineering, and infrastructure teams, to deliver high-impact projects from the ground up.

About The Team

Zoom's AI Speech Team is developing speech recognition technologies to improve Zoom's conversational AI experience. This work impacts various products, such as Zoom AI Companion, Zoom Meetings and Workplace, Zoom Contact Center, Zoom Phone, and Zoom Revenue Accelerator. Our team's mission is to equip the powerful AI brain with human-level listening and understanding for voice input.

As an AI Inference Engineer, you will develop novel speech model inference solutions on modern AI inference hardware, such as GPUs, TPUs, and AI-specific chips. Our goal is to deliver a unique AI-powered collaboration platform to users across the globe.

Responsibilities

  • Developing state-of-the-art speech services for Zoom products. Devising novel techniques where off-the-shelf solutions are not available.
  • Optimizing ASR inference systems for production deployment, including inference latency, throughput, memory footprint, and resource utilization.
  • Optimizing model inference performance by diving deep into the lower stack of inference frameworks, with a focus on hardware-specific optimizations for NVIDIA GPUs.
  • Proposing new model structures by joint optimization of model accuracy and inference speed.
  • Designing and developing ASR systems with low latency and high accuracy requirements, while ensuring scalability of the GPU infrastructure and improving throughput of the ASR service.
  • Profiling and debugging ASR runtime performance bottlenecks across different deployment hardware and environments.

What We’re Looking For

  • Possess a Master's in Computer Science, Electrical Engineering, or a related field with 3 years of experience in speech recognition, speech-LLMs, or AI model inference.
  • Display knowledge of deep learning and hands-on programming skills in Python, shell scripts, and C/C++; familiarity with ML frameworks such as PyTorch and TensorFlow.
  • Demonstrate deep understanding of transformer encoder-decoder frameworks for speech recognition, including attention mechanisms, beam search and sequence-to-sequence modeling for end-to-end ASR systems.
  • Understand recent advancements in speech foundation models and speech-LLMs that integrate acoustic and linguistic representations, enabling unified modeling for speech understanding and transcription tasks.
  • Have experience in optimizing deep learning model inference on NVIDIA GPUs, including profiling and accelerating AI models using CUDA, TensorRT, and mixed-precision computation to achieve low-latency, high-throughput performance.
  • Have experience developing and tuning custom CUDA kernels, leveraging CUDA Graphs for efficient execution scheduling, and minimizing kernel launch overhead to maximize GPU utilization.
  • Be proficient in end-to-end performance analysis, memory optimization, and deployment of large-scale ML models on GPU clusters. Have experience with stream management, asynchronous execution, and integrating frameworks such as PyTorch and TensorFlow for real-time inference.

Salary Range or On Target Earnings:

Minimum: $151,800.00

Maximum: $332,200.00

In addition to the base salary and/or OTE listed, Zoom has a Total Direct Compensation philosophy that takes into consideration base salary, bonus, and equity value.

Note: Starting pay will be based on a number of factors and commensurate with qualifications & experience.

We also have a location-based compensation structure; there may be a different range for candidates in this and other locations.

At Zoom, we offer a window of at least 5 days for you to apply because we believe in giving you every opportunity. Below is the potential closing date, just in case you want to mark it on your calendar. We look forward to receiving your application!

Anticipated Position Close Date

05/06/26

Ways of Working

Our structured hybrid approach is centered around our offices and remote work environments. The work style of each role (Hybrid, Remote, or In-Person) is indicated in the job description/posting.

Benefits

As part of our award-winning workplace culture and commitment to delivering happiness, our benefits program offers a variety of perks, benefits, and options to help employees maintain their physical, mental, emotional, and financial health; support work-life balance; and contribute to their community in meaningful ways. Click Learn for more information.

About Us

Zoomies help people stay connected so they can get more done together. We set out to build the best collaboration platform for the enterprise, and today help people communicate better with products like Zoom Contact Center, Zoom Phone, Zoom Events, Zoom Apps, Zoom Rooms, and Zoom Webinars.

We’re problem-solvers, working at a fast pace to design solutions with our customers and users in mind. Find room to grow with opportunities to stretch your skills and advance your career in a collaborative, growth-focused environment.

Our Commitment

At Zoom, we believe great work happens when people feel supported and empowered. We’re committed to fair hiring practices that ensure every candidate is evaluated based on skills, experience, and potential. If you require an accommodation during the hiring process, let us know—we’re here to support you at every step.

If you need assistance navigating the interview process due to a medical disability, please submit an Accommodations Request Form and someone from our team will reach out soon. This form is solely for applicants who require an accommodation due to a qualifying medical disability. Non-accommodation-related requests, such as application follow-ups or technical issues, will not be addressed.



