Senior Site Reliability Engineer

Themesoft Inc.
Irving, TX Contractor
POSTED ON 5/30/2026
AVAILABLE BEFORE 6/28/2026

POSITION DETAILS:


Role: Cloud SRE

Location: Charlotte, NC; Irving, TX; Chandler, AZ

Duration: 12 months (possible extension or conversion to direct hire)

Hybrid

Pay rate: $70/hr on W2, with benefits

Seeking a senior engineer for L2/L3 application and middleware production support with an SRE mindset (shifting from reactive to proactive reliability) across VM and container-adjacent/OpenShift (OCP) environments. The role owns incident response, problem management, and runbook-driven ops, and drives observability, automation/IaC, compliance guardrails, and CI/CD-integrated operational automation to reduce toil and improve stability and MTTR.

Core responsibilities: L2/L3 escalation & recovery; reliability signals & alert quality; blameless post-incident learning; logs/metrics/traces/dashboards & actionable alerting; IaC/config-as-code; standardized automation (status/start/stop/restart); intelligent automation/AI-assisted ops with guardrails; drift/compliance checks & remediation; CI/CD integration; runbooks & operational documentation.

We are seeking a Senior Systems Operations Engineer in technology as part of Consumer Lending Operations Technology. This role is focused on application and middleware production support with a Site Reliability Engineering (SRE) mindset, shifting from reactive operations to proactive reliability engineering through strong observability, automation, and continuous improvement.

The position supports mission-critical platforms across VM-based and container-adjacent environments, including OpenShift (OCP), and partners closely with application, middleware, infrastructure, network, and security teams to improve stability, reduce toil, and strengthen operational readiness. This includes hands-on ownership of incident response, problem management, and runbook-driven operations, while building automation and standardized patterns that make platform operations repeatable, auditable, and resilient.

In this role, you will:

• Provide senior-level application and middleware support for complex, high-availability services; act as an escalation point for L2/L3 incidents; lead disciplined troubleshooting, recovery, and stabilization.

• Embed SRE practices into day-to-day operations: define reliability signals, improve alert quality, drive blameless post-incident learning, and prioritize systemic fixes and toil reduction.

• Implement and continuously improve observability across applications and middleware (logs, metrics, traces, dashboards, and actionable alerting) to improve detection, diagnosis, and MTTR.

• Design, develop, and maintain infrastructure-as-code and configuration-as-code capabilities supporting VM-based and container-adjacent workloads, including OpenShift (OCP) enablement.

• Build and support automation for operational actions across middleware components (standardized status checks, start/stop/restart patterns) to enable safer self-service and reduce dependency bottlenecks.

• Design and implement intelligent automation for platform and middleware operations, including integrating AI/agent-based approaches into workflows where appropriate (triage assistance, predictive signals, and automated remediation guardrails).

• Monitor configuration drift; support automated compliance checks; implement remediation patterns aligned to enterprise change management, security, and risk controls.

• Integrate infrastructure and operational automation with CI/CD pipelines to enable repeatable, auditable deployments and safer rollouts.

• Support core platform components that enable applications and container platforms, including ingress patterns, load balancing integration, and shared supporting services.

• Develop and maintain runbooks, operational documentation, and validation/testing approaches for automation and platform procedures to ensure operational readiness and consistent execution.

Required Qualifications

• 4 years of Systems Engineering or Technology Infrastructure/Operations Engineering experience, or equivalent demonstrated through work experience, training, military experience, or education.

Desired Qualifications

• 4 years of application and/or middleware production support in complex, high-availability environments, including incident response and problem management with strong root cause discipline.

• 4 years of hands-on automation and configuration management experience (Ansible preferred, or a similar tool), plus strong scripting skills (Python, Bash, PowerShell, or similar).

• 4 years of Linux administration (RHEL preferred) and/or Windows Server administration supporting enterprise production workloads.

• 4 years of Git-based version control practices, including pull requests and peer review, with a focus on repeatability and code quality.

• Working experience with infrastructure-as-code concepts, including modular design and environment consistency.

• Experience supporting hybrid/private cloud platforms and container-adjacent hosting models; familiarity with OpenShift (OCP) or Kubernetes-based platforms.

• Experience implementing SRE operating practices (reliability metrics, reduction of manual toil, continuous improvement via post-incident learnings).

• Experience supporting common middleware platforms and shared services; ability to build automation patterns that standardize operational actions and reduce manual intervention.

• Familiarity with enterprise observability and operational support practices (service health dashboards, alert engineering, actionable telemetry).

• Exposure to responsible AI usage in operations (security, validation, accuracy, and appropriate guardrails for automation/agents).

• Strong cross-functional communication skills; experience operating in regulated environments.

Job Expectations

• Deliver assigned operational engineering and automation outcomes with a strong focus on stability, resiliency, and measurable toil reduction.

• Participate in on-call rotations and operational support coverage as required.

• Follow enterprise change management, risk, and compliance processes.

• Continuously improve platform reliability and automation maturity through standardization, documentation, and repeatable delivery.

• This position offers a hybrid work schedule.

• This position is not eligible for Visa sponsorship.

• Relocation assistance is not available for this position.

• Flexibility to work in a 24/7 environment, including weekends and holidays.

• Flexibility to frequently be on call beyond normal working hours.





Thanks,


Yuvi

Senior Technical Recruiter

Email: yuvi@themesoft.com

Web: www.themesoft.com




