Sr. Software Engineer- AI/ML, AWS Neuron Apps

Annapurna Labs (U.S.) Inc.
Seattle, WA Full Time
POSTED ON 4/18/2026
AVAILABLE BEFORE 8/16/2026

DESCRIPTION

Shape the Future of AI Accelerators at AWS Neuron

Join the elite team behind AWS Neuron—the software stack powering AWS's next-generation AI accelerators Inferentia and Trainium. As a Senior Software Engineer in our Machine Learning Applications team, you'll be at the forefront of deploying and optimizing some of the world's most sophisticated AI models at unprecedented scale.

What You'll Impact:
  • Pioneer distributed inference solutions for industry-leading LLMs such as GPT, Llama, and Qwen
  • Optimize breakthrough language and vision generative AI models
  • Collaborate directly with silicon architects and compiler teams to push the boundaries of AI acceleration
  • Drive performance benchmarking and tuning that directly impacts millions of inference calls globally

Key job responsibilities
You will drive the Evolution of Distributed AI at AWS Neuron

As a Technical Leader at the forefront of AWS's AI accelerators, you'll architect the bridge between ML frameworks, including PyTorch and JAX, and AI hardware. This isn't just about optimization; it's about revolutionizing how AI models run at scale.

Technical Impact You'll Drive:
  • Spearhead distributed inference architecture for PyTorch and JAX using XLA
  • Engineer breakthrough performance optimizations for AWS Trainium and Inferentia
  • Develop ML tools to enhance LLM accuracy and efficiency
  • Transform complex tensor operations into highly optimized hardware implementations
  • Pioneer benchmarking methodologies that shape next-gen AI accelerator design

What Makes This Role Unique:
  • Direct influence on AWS's AI infrastructure used by thousands of ML applications
  • Full-stack optimization from high-level frameworks to hardware-specific primitives
  • Creation of tools and frameworks that define industry standards for ML deployment
  • Collaboration with both open-source ML communities and hardware architecture teams

Your Technical Arsenal Should Include:
  • Deep expertise in Python and ML framework internals
  • Strong understanding of distributed systems and ML optimization
  • Passion for performance tuning and system architecture

A day in the life
Work/Life Balance
Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.

Mentorship & Career Growth
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded professional and enable them to take on more complex tasks in the future.

About the team
At AWS Neuron, we're revolutionizing how the world's most sophisticated AI models run at scale through Amazon's next-generation AI accelerators. Operating at the unique intersection of ML frameworks and custom silicon, our team drives innovation from silicon architecture to production software deployment.
We pioneer distributed inference solutions for PyTorch and JAX using XLA, optimize industry-leading LLMs like GPT and Llama, and collaborate directly with silicon architects to influence the future of AI hardware. Our systems handle millions of inference calls daily, while our optimizations directly impact thousands of AWS customers running critical AI workloads.
We're focused on pushing the boundaries of large language model optimization, distributed inference architecture, and hardware-specific performance tuning. Our deep technical experts transform complex ML challenges into elegant, scalable solutions that define how AI workloads run in production.

BASIC QUALIFICATIONS

  • 5 years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
  • 5 years of programming experience using Python or C and PyTorch
  • Experience with AI acceleration via quantization, parallelism, model compression, batching, KV caching, and vLLM serving
  • Experience with accuracy debugging & tooling, performance benchmarking of AI accelerators
  • Fundamentals of machine learning and deep learning models, their architectures, and their training and inference lifecycles, along with work experience optimizing model execution

PREFERRED QUALIFICATIONS

  • Master's degree in computer science or equivalent
  • Master's degree in machine learning or equivalent
  • Experience with accuracy debugging & tooling, performance benchmarking of AI accelerators
  • Experience developing CUDA kernels, HPC and inference optimization, and tensor operations

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.


The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.


USA, WA, Seattle - 168,100.00 - 227,400.00 USD annually

Benefits:

Vacation & Paid Time Off, Health Insurance, Employee Discounts

Salary: $168,100 - $227,400
