What are the responsibilities and job description for the Site Reliability Engineer (SRE) - Healthcare Exp position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Cogent Data Solutions Llc, is seeking the following. Apply via Dice today!
Role: Senior Site Reliability Engineer (SRE)
Location: Austin, TX (Hybrid)
Job Description:
- 8 years (Required) of experience in Systems Engineering, DevOps, or Site Reliability Engineering (SRE) roles
- 8 years (Required) of experience with Linux/Unix systems and a strong understanding of system internals
- 8 years (Required) of experience programming in one or more languages such as Python, Go, Java, or Bash
- 8 years (Required) of experience designing and operating highly available, distributed systems
- 8 years (Required) of experience with cloud platforms (AWS or Google Cloud Platform) and cloud-native services
- 8 years (Required) of hands-on experience with containerization and orchestration technologies such as Docker and Kubernetes
- 8 years (Required) of experience with monitoring, alerting, and logging in production environments
- 8 years (Required) of experience defining and managing SLIs, SLOs, and error budgets
- 8 years (Required) of experience with incident management, root cause analysis (RCA), and postmortems
- 8 years (Required) of experience integrating security and compliance into operational and DevOps workflows
- 4 years (Preferred) of experience with observability tools such as Prometheus, Grafana, Application Insights, Datadog, or Splunk
- 4 years (Preferred) of experience operating 24/7 production environments, including on-call rotations
- 4 years (Preferred) of experience with chaos engineering and resiliency testing
- 4 years (Preferred) of experience implementing feature flags, canary deployments, and progressive delivery
- 4 years (Preferred) of experience writing documentation for runbooks, dashboards, and operational standards