
Machine Learning Engineer

Oho Group
San Jose, CA Full Time
POSTED ON 6/5/2026
AVAILABLE BEFORE 7/4/2026

Senior/Lead Inference Systems Engineer (ML/AI)


About the Role


We're looking for a Senior or Lead Inference Systems Engineer to help build the next generation of large-scale AI inference infrastructure. Working alongside world-class hardware and software engineers, you'll play a key role in developing high-performance inference serving systems and cluster scheduling technologies that maximise the efficiency of modern foundation models.


This is an exciting opportunity to work at the cutting edge of distributed AI systems, helping shape how large language models are deployed, scaled, and optimised across heterogeneous compute environments.


Key Responsibilities


  • You will help design, build, and optimise large-scale inference serving platforms capable of delivering industry-leading throughput, latency, and efficiency.
  • You will develop and refine multi-node inference strategies that maximise performance across distributed compute clusters.
  • You will work on advanced optimisation techniques including tensor parallelism, pipeline parallelism, expert parallelism, continuous batching, and KV cache management.
  • You will collaborate with hardware and systems teams to optimise workloads across compute, networking, and storage infrastructure.
  • You will drive performance improvements across leading inference frameworks such as vLLM, SGLang, and PyTorch.
  • You will contribute to the design and implementation of cluster scheduling systems that intelligently allocate resources and maximise utilisation at scale.
  • You will engage with the open-source community, contributing optimisations upstream and helping influence the future direction of widely adopted AI infrastructure projects.
  • You will help establish best practices around benchmarking, testing, debugging, and performance analysis to ensure a highly reliable production-grade platform.


Required Qualifications


  • You'll need strong software engineering experience with Python, C++, and PyTorch.
  • You should have experience developing or contributing to modern LLM inference serving frameworks such as vLLM, SGLang, or equivalent technologies.
  • You must possess a deep understanding of large language model inference, including attention mechanisms, batching strategies, KV cache management, and serving optimisation techniques.
  • You'll need hands-on experience deploying, operating, or optimising large-scale distributed workloads across multi-node compute environments.
  • Experience with performance profiling, benchmarking, debugging, and system-level optimisation is essential.
  • You should be comfortable working in fast-paced engineering environments and collaborating across multiple technical disciplines.


Preferred Qualifications


  • Experience with distributed scheduling systems, cluster orchestration, resource management, or workload optimisation technologies.
  • Exposure to networking, storage systems, distributed caching, or infrastructure platforms supporting large-scale AI deployments.
  • Experience working with technologies such as Orca, LMCache, or similar distributed inference optimisation frameworks.
  • Knowledge of GPU programming and performance optimisation using CUDA, Triton, ROCm, or related technologies.
  • Understanding of AI accelerator architectures and large-scale heterogeneous computing environments.
  • Experience contributing to open-source AI infrastructure projects.


Education


You should be educated to Master's or PhD level in Computer Science, Computer Engineering, Electrical Engineering, or a related technical discipline, or possess equivalent industry experience.

Salary: $250,000 - $350,000





