What are the responsibilities and job description for the DevOps Engineer position at Scale.jobs?
About The Role
The role focuses on building and maintaining resilient cloud infrastructure that supports high-traffic production environments. This includes owning the lifecycle of Kubernetes clusters, CI/CD pipelines, and automated monitoring systems to maximize uptime and performance.
The team works at the intersection of software engineering and systems operations, applying a software-defined approach to infrastructure. The primary goal is to empower product teams by providing self-service platforms that are secure, scalable, and highly observable.
Key Responsibilities
- Manage and scale containerized environments using Kubernetes, including cluster upgrades, resource optimization, and security hardening
- Design and implement Infrastructure as Code (IaC) modules using Terraform or Pulumi to ensure reproducible and consistent environments across AWS or GCP
- Build and maintain automated CI/CD pipelines using GitHub Actions, GitLab CI, or Jenkins to streamline the path from code commit to production
- Implement comprehensive observability stacks using Prometheus, Grafana, and ELK/Datadog to detect and resolve system bottlenecks before they impact users
- Participate in the on-call rotation and lead blameless post-mortems to improve system reliability and reduce mean time to recovery (MTTR)
- Collaborate with security teams to integrate DevSecOps practices, managing secrets, IAM policies, and vulnerability scanning throughout the build process
Qualifications
- 3–6 years of experience in a DevOps, SRE, or Platform Engineering role managing production-grade cloud infrastructure
- Expertise in container orchestration with Kubernetes and a deep understanding of networking, ingress controllers, and service meshes
- Strong proficiency in scripting and automation using Python, Go, or Bash for systems engineering tasks
- Hands-on experience with Infrastructure as Code tools, specifically Terraform, and managing state at scale across multiple environments
- Solid understanding of cloud-native architecture on AWS, GCP, or Azure, including VPCs, IAM, and managed database services
- Bonus: Experience with service mesh technologies (Istio/Linkerd), SOC2 compliance audits, or managing large-scale PostgreSQL/NoSQL migrations