Site Reliability Engineer (89322-1)

Jobs via Dice
Alpharetta, GA (Full Time)
Posted on 6/3/2026
Available before 7/2/2026
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Key Business Solutions, Inc., is seeking the following. Apply via Dice today!

Duration: 12 Months

Role Description

Skill Set - Expertise in UNIX/Linux administration, AWS/Azure cloud monitoring, Terraform/Ansible, and Prometheus/Grafana (observability experience).

Work Location - Alpharetta

Experience required for role - 6 years

  • Production experience in SRE / Infrastructure / ops for large-scale systems
  • Strong programming/scripting skills (Python, Go, Java, or equivalent)
  • Deep experience with containerization (Docker), orchestration (Kubernetes, etc.)
  • Infrastructure-as-code (Terraform, Helm, CloudFormation, Ansible, etc.)
  • Familiarity with GPU / AI compute clusters, high-performance data storage, and distributed architectures
  • Experience with monitoring / observability / logging / alerting tools (Prometheus, Grafana, ELK / EFK, Datadog, etc.)
  • Networking & systems engineering knowledge (TCP/IP, DNS, routing, load balancing, distributed storage)
  • Solid experience in capacity planning, performance tuning, scaling, and incident response
  • Demonstrated ability to lead RCAs, deploy fixes, and drive reliability improvements
  • Experience in regulated environments (financial services, compliance, audit, security) is a strong plus
  • Excellent communication, documentation, and cross-team collaboration skills
  • Proven track record of reducing operational toil via automation

Experience: 6 years of experience as a Site Reliability Engineer or in a similar role, with hands-on experience in supporting IaaS platforms with networking and system engineering knowledge.

  • Operate, monitor, and maintain the infrastructure supporting GenAI applications (training, inference, feature store, data ingestion, model serving)
  • Design and build automation for core platform capabilities, reducing manual toil
  • Develop and maintain infrastructure-as-code (IaC) for provisioning and managing compute, storage, network, GPU clusters, Kubernetes / container orchestration, etc.
  • Establish, monitor, and enforce SLOs/SLIs/SLAs, error budgets, alerting, and dashboards
  • Lead incident response, root cause analysis (RCA), postmortems, and systemic remediation
  • Perform capacity planning, scaling strategies, workload scheduling, and resource forecasting
  • Optimize cost vs. performance tradeoffs in large-scale compute environments
  • Harden systems for security, compliance, auditability, and data governance
  • Collaborate across teams (cloud engineers, data engineers, infrastructure, security) to ensure safe deployment, rollout, rollback, and integration of new systems
  • Define disaster recovery (DR) strategies, backup/restore practices, and fault tolerance mechanisms
  • Maintain runbooks, operational playbooks, documentation, and training materials
  • Participate in on-call rotations and respond to production incidents 24/7 as needed
  • Continuously evaluate and integrate new tools, frameworks, or technologies to enhance platform reliability

Skills (Digital): Python, Docker, Kubernetes, Site Reliability Engineering (SRE)

Experience Required: 6-8 years

Skills:

  • Category: SkillCategoryTest1_MN
  • Name: Digital : Site Reliability Engineering (SRE)
  • Required: Yes
  • Importance: 1
  • Experience: 4-7 years

Salary.com Estimation for Site Reliability Engineer (89322-1) in Alpharetta, GA
$94,215 to $116,193