What are the responsibilities and job description for the Site Reliability Engineer position at PANGEATWO?

Site Reliability Engineer (SRE)
Hybrid Opportunity | Enterprise Cloud Environment

A growing enterprise technology organization is seeking an experienced Site Reliability Engineer (SRE) to support large-scale distributed applications and cloud-based transformation initiatives. This is a high-impact role focused on improving system reliability, scalability, automation, and operational resilience across mission-critical platforms.

The ideal candidate will bring a strong combination of software engineering, infrastructure, cloud, and operational support experience with a passion for automation, monitoring, and continuous improvement.

Candidates must be authorized to work in the U.S. without sponsorship.

Key Responsibilities

Monitor system performance, availability, and reliability across enterprise platforms
Gather and analyze operational metrics to improve fault tolerance and scalability
Partner closely with development teams to improve deployment, testing, and release processes
Support cloud-based transformation initiatives and platform modernization efforts
Participate in system architecture, platform management, and capacity planning activities
Drive automation initiatives to reduce manual intervention and improve system resilience
Troubleshoot infrastructure, application, network, and performance-related issues
Respond to incidents and support restoration of critical services
Implement proactive monitoring, alerting, and observability improvements
Support continuous improvement initiatives focused on performance, reliability, and operational excellence

Required Qualifications

Bachelor’s degree or equivalent experience
5 years of experience in Site Reliability Engineering, DevOps, Systems Engineering, or related areas
Strong understanding of Kubernetes, containers, clustering, and elastic scalability
Experience supporting cloud environments, preferably Google Cloud Platform (GCP)
Experience with microservices and API/service-based architectures
Strong troubleshooting skills across infrastructure, networking, databases, operating systems, and security
Architecture-level knowledge of Linux and Windows systems
Experience with CI/CD pipelines and deployment automation
Experience supporting enterprise production environments and monitoring platforms

Technical Environment

Experience with the following technologies is highly preferred:

Kubernetes
Google Cloud Platform (GCP)
Terraform
Prometheus
Grafana
Dynatrace
Azure DevOps (ADO)
CI/CD tools and automation frameworks
HTTP, proxies, and modern web technologies

Ideal Candidate

Passionate about scalability, stability, and system performance
Proactive problem-solver with strong analytical abilities
Comfortable working in fast-paced, high-availability environments
Strong collaborator who works effectively across engineering and operations teams
Focused on automation, innovation, and continuous improvement

This is an excellent opportunity to join a forward-thinking technology environment where reliability engineering plays a strategic role in platform growth and modernization.
