SRE/MQ Infrastructure Lead

Stash Talent Services
Plano, TX Full Time
POSTED ON 6/10/2026 CLOSED ON 6/13/2026

What are the responsibilities and job description for the SRE/MQ Infrastructure Lead position at Stash Talent Services?

Position: SRE/MQ Infrastructure Lead

Location: Plano, TX (3 days in office, 2 days remote)

Duration: 12-month contract (ext. up to 36 months)


Job Details:

We are seeking an experienced Site Reliability Engineer (SRE)/MQ Infrastructure Lead for Messaging Services to drive platform reliability, observability, and operational excellence across IBM MQ and Kafka environments. This role combines production engineering and reliability leadership, platform security and resilience engineering, and ownership of large-scale, distributed messaging runtimes. The position follows a hybrid schedule with a minimum of 3 days per week on-site.


Responsibilities:

  • Leading reliability engineering for high-scale messaging platforms supporting tens of thousands of runtimes and high-volume message throughput
  • Driving EOL remediation, patching, and stabilization across MQ queue managers and Kafka clusters
  • Implementing SRE best practices, including SLIs/SLOs for message delivery, latency, and availability, as well as incident management, escalation, and a postmortem culture
  • Enhancing observability and monitoring for messaging flows, queue depths, lag, and throughput
  • Designing proactive fault detection and auto-remediation strategies (e.g., DLQ handling, backlog mitigation, failover recovery)
  • Building resilient messaging platforms capable of supporting real-time, event-driven workloads
  • Supporting global production messaging environments with on-call rotation and escalation ownership
  • Partnering with engineering, application, and security teams to ensure reliability, scalability, and secure message transport


Requirements:

  • Strong experience in Site Reliability Engineering / Production Engineering
  • Hands-on expertise with IBM MQ (queue managers, clustering, channels, DLQ management), Kafka / Confluent platform (topics, brokers, partitions, consumer groups), and large-scale distributed messaging systems and runtime management
  • Deep understanding of system reliability, scalability, and high availability design; messaging reliability patterns (guaranteed delivery, retry handling, replay, ordering); and incident management, root cause analysis, and problem management
  • Experience with observability tools (Dynatrace, Splunk, Prometheus, Grafana) for messaging platforms, and with event and anomaly detection in high-volume systems
  • Strong scripting/automation skills in Shell, Python, PowerShell
  • Experience managing Linux/Unix and Windows production environments
  • Knowledge of event-driven architecture and messaging-based integration patterns
  • Understanding of messaging platform security (TLS, certificates, channel auth, encryption), and of vulnerability remediation and risk mitigation in production systems
  • Excellent troubleshooting skills in high-pressure, real-time environments (e.g., message backlog, latency spikes, connection failures)


Desired skills:

  • Experience implementing SRE frameworks (SLIs, SLOs, error budgets) specifically for messaging workloads
  • Familiarity with Kubernetes / containerized messaging platforms
  • Experience with Kafka ecosystem components (Schema Registry, Connect, Streams) and IBM MQ advanced features (Native HA, clustering)
  • Exposure to AI-driven operations (AIOps), anomaly detection, or automated remediation, and to large-scale messaging modernization or migration programs
  • Messaging or middleware certifications (IBM MQ, Kafka, or equivalent)
  • Experience in regulated environments (e.g., financial services)

Salary: $60

This job has expired.