Director of Site Reliability Engineering

Talently
Dallas, TX Full Time
POSTED ON 6/6/2026
AVAILABLE BEFORE 8/12/2026

Job Title: Director of Site Reliability Engineering

Location: Remote

Salary: $220,000-$225,000

Skills: Site Reliability Engineering, Distributed Systems, Google Cloud Platform, Team Leadership, Automation


About the Company / The Opportunity:

Join an organization in the Technology, Information and Media industry that delivers highly available, scalable, and secure platforms serving millions of users and handling significant transaction volumes. Our client is committed to innovation and operational excellence, and this role offers the opportunity to build and lead world-class Site Reliability Engineering practices. As Director of SRE, you will drive strategic reliability initiatives, shape engineering culture, and play a pivotal role in the organization’s continued growth while mentoring strong engineering teams in a remote-first environment.


Responsibilities:

  • Define and execute a comprehensive company-wide Site Reliability Engineering strategy, embedding reliability as a core discipline across engineering teams.
  • Build, lead, and develop a high-performing SRE organization, including hiring, mentoring, and fostering a reliability-focused culture.
  • Establish SLIs, SLOs, KPIs, and error budgets to measure and drive platform reliability and performance improvement.
  • Guide architecture decisions and technical roadmaps for highly available, resilient, and scalable distributed systems.
  • Drive adoption of observability, monitoring, logging, and incident response solutions across cloud-based microservices environments, primarily on Google Cloud Platform.
  • Establish and oversee robust incident response frameworks, operational governance, and post-incident analysis processes.
  • Promote and implement best practices for infrastructure automation, cloud-native operations, and cost optimization.
  • Lead continuous improvement and innovation initiatives, including exploring AI-driven operations and new SRE methodologies.


Must-Have Skills:

  • 12 years of experience in Site Reliability Engineering, Infrastructure Engineering, or DevOps in high-scale environments.
  • 5 years of proven technical leadership, building and scaling SRE teams and practices.
  • Strong expertise with distributed systems, cloud-native infrastructure, and microservices, plus hands-on Google Cloud Platform experience (GKE, Compute Engine, Cloud Functions).
  • Deep proficiency with infrastructure as code, automation frameworks, and CI/CD deployment pipelines.
  • Track record of designing large-scale observability and monitoring solutions using tools like Prometheus, Grafana, Datadog, or New Relic.
  • Excellent communication, organizational development, and mentorship abilities.
  • Strong programming ability in Python, Go, Java, or similar languages.


Nice-to-Have Skills:

  • Cloud or reliability certifications (e.g., Google Cloud Professional, SRE certifications).
  • Experience implementing AIOps, anomaly detection, predictive analytics, or automated remediation/self-healing infrastructure.
  • Familiarity with AI/ML tools for operational intelligence and intelligent alerting.
  • Strong database performance tuning and distributed data systems knowledge.
  • Comfortable operating in fast-paced, high-growth technology environments.
  • Bachelor’s degree in Computer Science, Engineering, or related field.
