
Senior AI Software Architect

Jobs via Dice
Redmond, WA Full Time
POSTED ON 12/26/2025
AVAILABLE BEFORE 1/24/2026
Overview

Do you want to be at the forefront of innovating the latest hardware designs to propel Microsoft's cloud growth? Are you seeking a unique career opportunity that combines technical capabilities and cross-team collaboration with business insight and strategy?

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to achieve our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Join the Systems Planning and Architecture (SPARC) team within Microsoft's Azure Hardware Systems and Infrastructure (AHSI) organization, the team behind Microsoft's expanding cloud infrastructure and responsible for powering Microsoft's "Intelligent Cloud" mission. Microsoft delivers more than 200 online services to more than one billion individuals worldwide. We deliver the core infrastructure and foundational technologies for Microsoft's cloud businesses, including Microsoft Azure, Bing, MSN, Office 365, OneDrive, Skype, Teams, and Xbox Live.

We are seeking a highly skilled Senior AI Software Architect to join our team focused on model enablement and performance optimization for Maia accelerators. This role is ideal for someone with strong experience in PyTorch-based model development, quantization techniques, and parallelization strategies at the framework level. You will work closely with hardware and software teams to bring up models on Maia and ensure they run efficiently.

Responsibilities

Model Enablement:

  • Port and optimize large-scale AI models (e.g., foundation models, diffusion models, YOLO) to run efficiently on Maia hardware.
  • Integrate models using frameworks such as PyTorch, ONNX, vLLM, and SGLang.

Performance Optimization:

  • Apply techniques like KV cache quantization (e.g., BF16 to FP8), checkpointing, and re-sharding for efficient inference and training.
  • Experiment with parallelism strategies (TP, PP) and analyze performance impacts across interconnects (NVLink vs PCIe).

Inference Stack Development:

  • Collaborate on improving inference pipelines, including KV caching in SGLang/vLLM and performance tuning at the PyTorch level.
  • Work with Triton kernels for basic operations (e.g., FP8 dequantization) and assist in kernel performance analysis.

Cross-Team Collaboration:

  • Partner with hardware architects and kernel developers for co-design discussions.
  • Communicate effectively with multiple stakeholders to align on performance goals and deliverables.

Qualifications

Required Qualifications:

  • Bachelor's Degree in Computer Science or related technical field AND 4 years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.

Other Requirements:

  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications

  • Bachelor's Degree in Computer Science or Engineering.
  • 3 years of strong hands-on experience with PyTorch and model optimization techniques.
  • Practical knowledge of quantization techniques like PTQ/QAT, especially for KV cache quantization.
  • Familiarity with parallelization strategies and distributed training concepts (e.g., sharding, allreduce).
  • 2 years of experience with AI inference stacks like SGLang/vLLM and performance profiling.
  • Excellent problem-solving and communication skills; ability to work in a collaborative team environment.
  • 3 years of experience in Triton kernels and CUDA programming (basic understanding is acceptable but willingness to learn is essential).
  • Experience with AI accelerator hardware and embedded systems.
  • 3 years of prior work on efficient model checkpointing, resharding scripts, and large-scale model deployments for serving at scale.

Software Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. A different range applies to specific work locations within the San Francisco Bay area and New York City metropolitan area, where the base pay range for this role is USD $158,400 - $258,000 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

Salary: $119,800 - $258,000
