AI Model Optimization Architect

Qualcomm Technologies
San Diego, CA Full Time
POSTED ON 4/16/2026
AVAILABLE BEFORE 5/17/2026
Company:
Qualcomm Technologies, Inc.

Job Area:
Engineering Group, Engineering Group > Machine Learning Engineering

General Summary:

Qualcomm is leveraging its strengths in compute, connectivity, and AI acceleration to play a central role in the evolution of Cloud AI. The Qualcomm Cloud AI team develops hardware and software platforms enabling efficient inference of large-scale foundation models.

We are seeking a Staff Engineer - AI Model Optimization Architect to lead end-to-end model transformation and optimization for LLMs, VLMs, diffusion, and multimodal models on Qualcomm inference accelerators. This role works closely with compiler, performance, and accuracy teams to translate models into accelerator-efficient execution while balancing throughput, latency, memory, and quality. The scope spans Day-0 enablement through production deployment, with a strong emphasis on scaling optimizations to future architectures.

Key Responsibilities
Architect and deliver model optimization strategies that transform PyTorch models for efficient inference on Qualcomm accelerators.
Drive graph capture and deployment using PyTorch, ONNX, and torch.compile, including model rewrites and graph-level transformations.
Design and implement fusion kernels using DSL-based approaches (e.g., Triton), enabling fused operations and performance-critical algorithmic rewrites.
Partner deeply with compiler, performance, and accuracy teams to co-design lowering strategies, kernel fusion, layout decisions, and runtime integration.
Profile and optimize LLM/VLM/diffusion inference for throughput and latency across batch sizes, sequence lengths, and serving modes.
Own transformer-specific optimizations, including KV-cache management, decoding behavior, and long-context performance.
Enable and optimize continuous batching (dynamic/iteration-level scheduling), understanding its impact on memory, scheduling, and tail latency.
Architect and scale distributed inference strategies (e.g., sharding and parallelism) across multi-core and multi-device systems.
Establish reusable approaches to scale model optimizations to new hardware architectures, creating robust patterns and tooling.
Debug complex performance or stability issues to root cause and drive production-ready solutions.

Required Qualifications
Expert-level proficiency in PyTorch and inference-focused model optimization; strong Python engineering skills.
Hands-on experience with torch.compile / TorchDynamo or related graph capture and compilation workflows.
Deep understanding of transformer architectures, attention mechanisms, MoEs, and performance trade-offs.
Practical experience with KV-cache behavior, serving-time optimizations, and memory/performance trade-offs.
Strong foundation in computer architecture, ML accelerators, and distributed systems.
Proven ability to lead cross-functional technical efforts and influence design decisions.
MS in Computer Science, Machine Learning, Computer Engineering, or Electrical Engineering, or equivalent experience.

Preferred / Bonus Qualifications
Experience developing fusion kernels using Triton or similar DSLs, and collaborating with ML compiler teams.
Familiarity with LLM serving stacks and continuous batching systems.
Background in numerical methods, performance/accuracy trade-off analysis, or evaluation frameworks.
PhD in a relevant field.

Minimum Qualifications:
Bachelor's degree in Computer Science, Engineering, Information Systems, or related field and 4 years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
OR
Master's degree in Computer Science, Engineering, Information Systems, or related field and 3 years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
OR
PhD in Computer Science, Engineering, Information Systems, or related field and 2 years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.

Qualcomm is an equal opportunity employer. If you are an individual with a disability and need an accommodation during the application/hiring process, rest assured that Qualcomm is committed to providing an accessible process. You may e-mail or call Qualcomm's toll-free number found here. Upon request, Qualcomm will provide reasonable accommodations to support individuals with disabilities to be able to participate in the hiring process. Qualcomm is also committed to making our workplace accessible for individuals with disabilities. (Keep in mind that this email address is used to provide reasonable accommodations for individuals with disabilities. We will not respond here to requests for updates on applications or resume inquiries.)

To all Staffing and Recruiting Agencies: Our Careers Site is only for individuals seeking a job at Qualcomm. Staffing and recruiting agencies and individuals being represented by an agency are not authorized to use this site or to submit profiles, applications or resumes, and any such submissions will be considered unsolicited. Qualcomm does not accept unsolicited resumes or applications from agencies. Please do not forward resumes to our jobs alias, Qualcomm employees or any other company location. Qualcomm is not responsible for any fees related to unsolicited resumes/applications.

EEO Employer: Qualcomm is an equal opportunity employer; all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or any other protected classification.

Qualcomm expects its employees to abide by all applicable policies and procedures, including but not limited to security and other requirements regarding protection of Company confidential information and other confidential and/or proprietary information, to the extent those requirements are permissible under applicable law.

Pay range and Other Compensation & Benefits:
$158,400.00 - $237,600.00

The above pay scale reflects the broad, minimum to maximum, pay scale for this job code for the location for which it has been posted. Even more importantly, please note that salary is only one component of total compensation at Qualcomm. We also offer a competitive annual discretionary bonus program and opportunity for annual RSU grants (employees on sales-incentive plans are not eligible for our annual bonus). In addition, our highly competitive benefits package is designed to support your success at work, at home, and at play. Your recruiter will be happy to discuss all that Qualcomm has to offer - and you can review more details about our US benefits at this link.

If you would like more information about this role, please contact Qualcomm Careers.
