
Engineering Manager, LLM Performance

NVIDIA AI
Santa Clara, CA Full Time
POSTED ON 6/25/2026
AVAILABLE BEFORE 7/23/2026
Job Requisition ID: JR2019950
Job Category: Engineering
Time Type: Full time

At NVIDIA, we aren't just powering the AI revolution; we're accelerating it. Our team speeds up LLM inference across the stack and across open source LLM frameworks such as TensorRT LLM, vLLM and SGLang. With demand for AI exploding, particularly for large language models (LLMs), vision language models (VLMs), and vision-language-action models (VLAs), we are significantly expanding our team.

We're seeking a highly skilled and driven Engineering Manager to take the lead in accelerating the next generation of LLM/VLM/VLA inference software technologies that will define the future of AI. This is a high-impact, hands-on leadership role at the intersection of deep technical expertise and world-class management. You won't just manage; you'll architect and guide a brilliant team of engineers who are pushing the performance of LLM inference. Your work will be highly collaborative, interfacing directly with NVIDIA Researchers, GPU Architects, and other teams across the company to ensure we ship production-grade, lightning-fast software that sets the global standard for AI performance.

What You’ll Be Doing

  • Lead and grow a team responsible for pushing the performance of LLM inference across multiple LLM frameworks, including TensorRT LLM, vLLM, SGLang and Dynamo on our datacenter products.
  • Drive the design, implementation and optimization of features that are key to performance in LLM inference.
  • Continuously improve the performance of LLM inference on current and upcoming NVIDIA datacenter architectures and GPUs.
  • Continuously improve the performance of LLM inference for important foundation models.
  • Work with inference benchmark teams to help tune performance for key workloads.
  • Integrate cutting-edge technologies developed at NVIDIA and offer an intuitive developer experience for LLM deployment.
  • Lead software development execution, with responsibility for project planning, milestone delivery, and cross-functional coordination.

What We Need To See

  • MS, PhD, or equivalent experience in Computer Science, Computer Engineering, AI, or a related technical field.
  • 7 years of overall software engineering experience, including 3 years of technical leadership experience.
  • Proven ability to lead and scale high-performing engineering teams, especially across distributed and cross-functional groups.
  • Strong background in C or Python, with expertise in software design and delivering production-quality software libraries.
  • Demonstrated expertise in large language models (LLMs), vision language models (VLMs), or inference in general.

Ways To Stand Out From The Crowd

  • Deep understanding of GPU architecture, CUDA programming, and system-level performance tuning.
  • Background in LLM inference or working with frameworks such as TensorRT-LLM, vLLM, or SGLang.
  • Passion for building scalable, user-friendly APIs and enabling developers in the AI ecosystem.
  • Proven track record of growing and managing a team that encourages idea sharing, empowers team members, and provides opportunities for professional growth.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD for Level 3, and 272,000 USD - 431,250 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 27, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Salary.com Estimation for Engineering Manager, LLM Performance in Santa Clara, CA
$228,522 to $270,413



