What are the responsibilities and job description for the Lead SRE / DevOps Engineer position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Data Wave Technologies Inc, is seeking the following. Apply via Dice today!
Job Title: Lead SRE / DevOps Engineer
Location: Pittsburgh, PA (Hybrid)
Duration: Long-term
Tax Term: W2
Job Summary
We are looking for a Lead Site Reliability Engineer (SRE) / DevOps Engineer to design, build, and maintain highly scalable, reliable, and secure infrastructure systems.
This role focuses on automation, observability, incident management, and cloud-native architecture, while also leading engineering best practices across teams.
Key Responsibilities
- Lead the design and implementation of scalable cloud infrastructure
- Drive automation (IaC, CI/CD pipelines) to reduce manual effort
- Own system reliability, uptime, and performance optimization
- Implement monitoring, alerting, and observability solutions
- Manage incident response, root cause analysis, and post-mortems
- Collaborate with development, operations, and security teams
- Define and track SLIs/SLOs and system health metrics
- Mentor engineers and promote DevOps/SRE best practices
Core Technologies
- Cloud platforms: AWS / Azure / Google Cloud Platform
- Containers & orchestration: Docker, Kubernetes
- Infrastructure as Code: Terraform, CloudFormation
- CI/CD tools: Jenkins, GitLab CI, GitHub Actions
- Programming: Python, Bash
- Automation & configuration: Ansible, Chef, Puppet
- Monitoring & observability tools: Prometheus, Grafana, ELK stack, Dynatrace
- Microservices & distributed systems
- High-availability and fault-tolerant system design
Qualifications
- 5-10 years of experience in DevOps / SRE / Infrastructure roles
- Proven experience with:
  - Large-scale distributed systems
  - Cloud infrastructure automation
  - Incident management & production support
- Experience working in Agile environments
- Leadership or mentoring experience preferred
- Bachelor's/Master's in Computer Science or related field
- Preferred certifications:
  - AWS / Azure / Google Cloud Platform certifications
  - Kubernetes / Terraform certifications