
Senior Site Reliability Engineer — Government & Sovereign Cloud

Veeam Software
San Jose, CA Full Time
POSTED ON 4/14/2026
AVAILABLE BEFORE 5/9/2026
Veeam is the Data and AI Trust Company, specializing in helping organizations ensure their data and AI are fully understood, secured, and resilient to enable the acceleration of safe AI at scale. As the market leader in both data resilience and data security posture management, Veeam is built for the convergence of identity, data, security, and AI risk. Headquartered in Seattle with offices in more than 30 countries, Veeam protects over 550,000 customers worldwide, who trust Veeam to keep their businesses running. Join us as we go fearlessly forward together, growing, learning, and making a real impact for some of the world’s biggest brands.

About The Role

Veeam is building a global SRE function to support the Veeam Data Cloud, our new SaaS platform. This role focuses on our Government and Sovereign Cloud environment.

Due to clearance and access requirements, this team operates with restricted access to GOV infrastructure. That means you'll be part of a small team responsible for the full platform stack — including all VDC workloads. You won't always be able to hand off problems to other teams; you need to understand the entire architecture well enough to own it. You'll need to get up to speed on the platform quickly, often by reading code, docs, and architecture artifacts rather than getting direct access to environments from day one.

This is a ground-up role — you'll help define how reliability engineering works here by mapping systems, writing runbooks, setting baselines, and building the practices this team will run on going forward.

What You'll Do

Discovery & Documentation

  • Get up to speed on the full platform — all VDC workloads, dependencies, and risk areas. Much of this will happen through code, docs, and conversations rather than direct environment access.
  • Work with SMEs across the org to fill knowledge gaps and build onboarding material for the team.
  • Write and maintain runbooks, architecture docs, and operational guides.

Reliability & Incident Response

  • Design infrastructure for high availability and fault tolerance on Azure (including Azure Government).
  • Define SLIs, SLOs, and error budgets where none exist today.
  • Run incident response and blameless postmortems. Turn incidents into improvements.
  • Identify reliability risks across modern and legacy workloads and build practical remediation plans that work within compliance constraints.

Observability

  • Close observability gaps — define instrumentation requirements and drive implementation.
  • Set alerting, telemetry, and monitoring standards with partner teams.
  • Build automation to reduce toil and support fleet management.
  • Participate in on-call rotations.

Infrastructure & Delivery

  • Work with IaC, CI/CD, deployment automation, and config management — including in air-gapped or compliance-restricted environments.
  • Build and maintain testing, canary deployment, and release validation pipelines.
  • Integrate chaos engineering and monitoring tools, adapting choices to meet regulatory requirements.

Collaboration

  • Work across product, platform, security, legal, compliance, and operations teams.
  • Own problems end-to-end — identify gaps, drive solutions, don't wait for direction.
  • Mentor other engineers and help spread SRE practices across the org.

Technologies we work with

  • Microsoft TFS, Azure DevOps, Git, Bitbucket
  • Azure (Entra ID, API Management, Cosmos DB, Storage services, Azure Functions, static website hosting, Azure security, etc.)
  • IaC tools (Azure Resource Manager (ARM) templates, AWS CloudFormation, Terraform, the Serverless Framework, etc.)
  • Observability (Azure Monitor, Application Insights, Elastic Stack)

What You'll Bring

  • 7 years in Software Engineering, with 3 years in SRE, Platform Engineering, or similar — across multi-service platforms, not just single-service environments.
  • Experience with Government or Sovereign Cloud (e.g., Azure Government, AWS GovCloud).
  • Experience in regulated compliance environments — government (FedRAMP, CMMC, IL2/IL4/IL5), financial (PCI-DSS, SOX), or healthcare (HIPAA, HITRUST). You understand how compliance shapes architecture and operations.
  • Strong experience building and running production services on cloud infrastructure (Azure preferred, including Azure Government).
  • Able to learn large, complex platforms quickly with limited guidance — comfortable building understanding from code, docs, and architecture artifacts when direct environment access is restricted.
  • Can investigate systems independently and produce clear docs, risk assessments, and improvement plans.
  • Comfortable working across teams — engineering, product, security, compliance, operations.
  • Programming skills in one or more of: TypeScript/JS, Go, Java, C#, or similar.
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry, ELK stack).
  • Experience with IaC (Terraform, Terragrunt, Pulumi) and container orchestration (Kubernetes).
  • Experience with CI/CD and GitOps tooling — GitHub Actions, Azure DevOps, GitLab CI, ArgoCD, FluxCD, or Dagger.
  • Solid grasp of distributed systems, networking, and cloud-native architecture.
  • Clear written and verbal communication skills.

Bonus Skills

  • Experience on B2B SaaS platforms in regulated or government markets.
  • Background in chaos engineering, resilience testing, or performance/load testing.
  • Have built an SRE or reliability function from scratch before.
  • Experience across mixed environments — modern cloud-native and older legacy systems.
  • Familiar with AI-first development workflows — using LLM-powered tools for infrastructure automation, code generation, and documentation.

Why Join?

  • Build the GOV reliability practice from day one — your decisions will shape how this team works.
  • Help define SRE at Veeam across a globally distributed engineering org.
  • Work with strong teams across product, cloud engineering, security, and compliance.
  • Professional development resources including mentorship, training, and volunteer days.
  • Competitive compensation and benefits.

What You'll Get

  • Unlimited paid time off, 12 paid holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
  • Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
  • Medical, dental, and vision coverage starting on your first day
  • Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program
  • 401(k) retirement plan with company matching contributions
  • Fertility, adoption, and surrogacy support through Maven, plus paid volunteer time
  • AirVet: 24/7 virtual veterinary care at no cost
  • Legal services, identity protection, and supplemental health insurance options
  • Tax-advantaged spending accounts for healthcare, dependent care, and commuting
  • Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning

Compensation Transparency

Veeam is committed to pay transparency and equitable compensation. For this role, the compensation range below reflects the expected total target compensation (TTC), inclusive of base pay and a competitive performance-based bonus. For roles with a commission plan, the compensation range represents On Target Earnings (OTE), which includes base salary plus variable commission. When determining compensation, Veeam takes into consideration factors such as experience, education, skills, and geographic zone. Offers are typically made below the midpoint of the range.

In addition to compensation, Veeam provides a comprehensive benefits package, including health coverage, retirement plans, and unlimited time off.

U.S. Geographic Zones & Compensation Ranges (TTC / OTE)

Zone 1: San Francisco Bay Area, New York City Boroughs

$151,500–$252,500 USD

Zone 2: Washington, California (excluding San Francisco Bay Area)

$138,900–$231,400 USD

Zone 3: Texas, Illinois, North Carolina, Colorado, Massachusetts, Pennsylvania, Virginia, Oregon, Nevada, Hawaii, New York (excluding NYC boroughs); Sales roles located in Georgia, Ohio, and Arizona

$126,300–$210,400 USD

Zone 4: All other US locations

$109,800–$183,000 USD

Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.

Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice.

The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes.

By applying for this position, you consent to the processing of your personal data in accordance with our Recruiting Privacy Notice.

By submitting your application, you acknowledge that the information provided in your job application and any supporting documents is complete and accurate to the best of your knowledge. Any misrepresentation, omission, or falsification of information may result in disqualification from consideration for employment or, if discovered after employment begins, termination of employment.



Salary: $109,800–$252,500
