
Senior Site Reliability Engineer

SDI International
Chicago, IL (Full Time)
POSTED ON 4/14/2026
AVAILABLE BEFORE 5/16/2026

No H1 or C2C. Must be Permanent Resident or US Citizen




Description and Requirements

About Our Team

We are building Quantum, a next‑generation hybrid AI platform that spans Windows, Android, and cloud. As part of this vision, we are expanding the reliability engineering organization that powers cross‑device Personal AI.

We are looking for Senior Site Reliability Engineers (SREs) to help us build and evolve the foundational reliability, observability, and operations capabilities that keep the platform fast, safe, and dependable for millions of users.

This role may support one of several teams within the SRE organization (e.g., Observability, Operations, or Service Reliability), depending on your strengths and interests.

We operate with the speed, ownership, and creative latitude of a startup, backed by scale, resources, and technical depth. We are building new systems, new tooling, and new operational models from the ground up, and we are doing so with clarity, intention, and high engineering standards.




Location: Open to remote work in the US. The preferred work location is Chicago, IL.


What You Might Work On

As a Senior SRE, you may be responsible for a subset of the following, depending on team placement and skill alignment:

Reliability & Performance Engineering

  • Improving the availability, scalability, and performance of distributed systems across device, edge, and cloud.
  • Defining or refining SLIs, SLOs, and error budgets for critical services.
  • Leading initiatives to remove single points of failure, improve resilience, and reduce operational risk.

Operational Excellence

  • Participating in on‑call rotations and contributing to incident response, triage, and post-incident reviews.
  • Developing automation, runbooks, and self‑healing systems to reduce alert noise and MTTR.
  • Enhancing operational readiness and supporting incident prevention programs.

Observability & Insight

  • Designing or improving observability systems using OpenTelemetry, Grafana, and modern signal pipelines.
  • Building dashboards, analytics, and alerting that illuminate system health and AI service behavior.
  • Ensuring telemetry is reliable, actionable, and tied to real‑world outcomes.

Deployments & Change Safety

  • Improving reliability of CI/CD workflows, including phased rollouts, canaries, shadow testing, and safe rollback mechanisms.
  • Contributing to the evolution of deployment tooling for hybrid device, edge, and cloud systems.

Systems Design & Collaboration

  • Influencing architectural decisions by injecting reliability, observability, and operational considerations early in design.
  • Collaborating with AI/ML engineers, platform engineers, firmware teams, and product partners to deliver robust, dependable user experiences.


Basic Qualifications

  • 10 years of experience in Site Reliability Engineering, Production Engineering, DevOps, or large‑scale distributed systems operations
  • Bachelor’s Degree in Computer Science, Engineering, or a related technical discipline
  • Strong experience running production distributed systems at scale
  • Proficiency in at least one modern programming language (e.g., Python, Go, Java, C )
  • Strong understanding of Linux systems, networking fundamentals, and system performance tuning
  • Experience with monitoring/observability (metrics, logs, tracing)
  • Hands‑on experience with cloud environments (Azure, AWS, or GCP)
  • Experience in incident management, on‑call rotations, and postmortem processes


Preferred Qualifications

  • Deep experience with Azure cloud services
  • Experience with OpenTelemetry for end‑to‑end instrumentation
  • Strong familiarity with Grafana, Prometheus, Loki, Tempo, or similar tools
  • Experience supporting AI/ML systems, model serving, or data‑intensive workloads
  • Background with hybrid architectures (device, edge, cloud)
  • Experience improving deployment reliability and progressive delivery systems
  • Passion for automation, reliability engineering, and reducing operational friction


What Success Looks Like

  • Systems become more observable, reliable, and predictable.
  • Incidents are resolved quickly, and follow‑up improvements prevent recurrence.
  • Alerting becomes more accurate, actionable, and trusted.
  • Deployments become safer and more consistent.
  • Teams move faster because reliability foundations are strong and intuitive.
