Site Reliability Engineer

iO Associates
Houston, TX Full Time
POSTED ON 5/22/2026
AVAILABLE BEFORE 6/18/2026

Senior Site Reliability Engineer

iO Associates are supporting a growing f/s organisation that specialises in building resilient, cloud-native platforms that set the standard for operational excellence. Renowned for their innovative approach and commitment to growth, they foster a collaborative environment that values technical excellence, continuous learning, and impactful contributions.

Role Overview

This pivotal role has been created to support a strategic expansion of their reliability capabilities. As a Senior Site Reliability Engineer, you will be instrumental in enhancing the stability, availability, and scalability of vital systems. Your expertise will directly influence the resilience of mission-critical infrastructure, enabling the organisation to deliver seamless services and uphold their reputation for operational excellence.

Key Responsibilities

  • Lead initiatives to define and refine service reliability metrics, error budgets, and operational best practices
  • Architect and implement comprehensive monitoring, logging, and alerting solutions to improve incident detection and response
  • Drive incident management processes, including structured post-incident reviews and preventative measures
  • Collaborate with development teams to optimise performance, scalability, and disaster recovery strategies
  • Design and maintain highly available, fault-tolerant cloud architectures across multiple regions
  • Develop and maintain Infrastructure as Code modules and CI/CD pipelines to ensure consistent, automated deployment processes
  • Participate in defining recovery objectives (RTO/RPO) and orchestrate regular disaster recovery testing exercises
  • Produce documentation, runbooks, and reports to support ongoing operational improvements

Essential Skills & Experience

  • A minimum of 7 years' experience in Site Reliability Engineering, DevOps, or related fields supporting production environments
  • Proven expertise with cloud platforms, especially AWS, with significant hands-on experience managing mission-critical systems
  • Strong knowledge of Infrastructure as Code tools such as Terraform or CloudFormation/CDK
  • Demonstrable experience with CI/CD pipelines and automation frameworks, including version-controlled infrastructure
  • Skilled in designing resilient cloud architectures and implementing BCP/DR plans with structured testing
  • Proficiency with monitoring tools such as Datadog, Prometheus, Grafana, or ELK/OpenSearch
  • Solid Linux fundamentals, networking knowledge (TCP/IP, DNS, load balancing), and troubleshooting skills
  • Practical scripting experience in languages such as Python, Bash, or Go
  • Excellent technical documentation and communication skills, with the ability to create clear runbooks and reports

Desirable Qualifications & Skills

  • Familiarity with additional cloud providers such as Azure or Oracle Cloud
  • Experience with container orchestration platforms such as Kubernetes, EKS, or ECS
  • Knowledge of advanced observability tooling such as OpenTelemetry, including distributed tracing
  • Certifications such as AWS Solutions Architect, AWS DevOps Engineer, or Kubernetes CKA/CKAD
  • Experience with progressive delivery techniques, including blue/green and canary deployments, as well as automated rollback strategies

Salary.com Estimation for Site Reliability Engineer in Houston, TX
$73,762 to $96,335