AI Inference Engineer

Quadric
Burlingame, CA Full Time
POSTED ON 11/25/2025
AVAILABLE BEFORE 5/23/2026

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C DSP and control code.

Role:

The AI Inference Engineer at Quadric is the key bridge between the world of AI/LLM models and Quadric's unique platforms. The engineer will [1] port AI models to the Quadric platform; [2] optimize model deployment for efficient inference; and [3] profile and benchmark model performance. This senior technical role demands deep knowledge of AI model algorithms, system architecture, and AI toolchains/frameworks.

Responsibilities:

  • Quantize, prune and convert models for deployment
  • Port models to the Quadric platform using the Quadric toolchain
  • Optimize inference deployment for latency and speed
  • Benchmark and profile model performance and accuracy
  • Develop tools to scale and speed up deployment
  • Make improvements to the SDK and runtime
  • Provide technical support and documentation to customers and the developer community


Requirements

  • Bachelor's or Master's degree in Computer Science and/or Electrical Engineering
  • 5 years of experience with AI/LLM model inference and deployment frameworks/tools
  • Experience with model quantization (PTQ, QAT) and related tools
  • Experience with model accuracy measures
  • Experience with model inference performance profiling
  • Experience with at least one of the following frameworks: onnxruntime, PyTorch, vLLM, Hugging Face Transformers, neural-compressor, llama.cpp
  • Proficiency in C/C++ and Python
  • Demonstrated strong problem-solving, debugging, and communication skills


Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k, IRA)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off (Vacation, Sick & Public Holidays)
  • Family Leave (Maternity, Paternity)
  • Short Term & Long Term Disability
  • Training & Development
  • Work From Home
  • Free Food & Snacks
  • Stock Option Plan

Salary.com Estimation for AI Inference Engineer in Burlingame, CA
$156,050 to $192,224


