What are the responsibilities and job description for the Site Reliability Engineer position at The Phoenix Group?
Overview
Our client is looking for a skilled Site Reliability Engineer to take ownership of platform reliability and operational performance. In this role, you’ll lead incident response efforts, enhance observability, and ensure uptime across the client’s AWS cloud environment. You’ll combine strong technical acumen with a proactive mindset, focusing on automation, scalability, and operational excellence through Infrastructure as Code (IaC).
Key Responsibilities
- Lead reliability initiatives, SLO tracking, and incident response across critical systems.
- Define and enhance monitoring and observability strategies.
- Manage and optimize AWS infrastructure, including ECS, Fargate, Lambda, and IAM.
- Build and maintain Infrastructure as Code using tools such as Pulumi or Terraform.
- Automate operational workflows using Python or Bash.
- Support containerized deployments (Docker) and contribute to Kubernetes-based initiatives.
- Maintain and troubleshoot Linux-based systems.
- Ensure cloud environments adhere to best practices in security, performance, and cost management.
- Support data-intensive, AI-driven, or real-time processing workloads.
- Keep documentation current for operational processes, playbooks, and infrastructure design.
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or equivalent hands-on experience.
- 3 years of experience in Site Reliability, DevOps, or Systems Engineering roles.
- Experience managing production on-call rotations and incident escalations.
- Deep knowledge of Linux systems, networking, and cloud infrastructure.
- Hands-on experience with AWS core services (ECS, Fargate, Lambda, IAM).
- Proficiency with Infrastructure as Code frameworks (Pulumi, Terraform, or AWS CDK).
- Familiarity with observability tools like Datadog, Prometheus, New Relic, or Sentry.
- Proficiency in scripting with Python or Bash.
- Solid understanding of containerization and CI/CD deployment workflows.
- Strong communication and documentation abilities.
Preferred Skills
- Experience supporting highly available, real-time, or AI/ML production environments.
- Familiarity with Kubernetes or other orchestration technologies.
- AWS or DevOps-related certifications are a plus.
The Phoenix Group Advisors is an equal opportunity employer. We are committed to creating a diverse and inclusive workplace and prohibit discrimination and harassment of any kind based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We strive to attract talented individuals from all backgrounds and provide equal employment opportunities to all employees and applicants for employment.
Salary: $125,000 - $150,000