
Site Reliability Engineer

Infinite Computer Solutions (ICS)
Alpharetta, GA Full Time
POSTED ON 5/22/2026
AVAILABLE BEFORE 6/20/2026
Job Title: SRE Cloud Operations (Digital Banking Platform)
Location: Alpharetta/Atlanta, GA (Onsite)
Type: Full Time/W2 with Infinite Computer Solutions
Role Summary
As a Site Reliability Engineer (SRE) Cloud Operations, you will provide operational ownership, reliability engineering, and cloud operations support for a Digital Banking platform running across Windows Server, IIS, and Microsoft Azure. This role focuses on ensuring availability, performance, security, and scalability for customer-facing digital banking workloads.
You will be part of a team delivering 24x7 production support across Windows Server (2016/2019/2022) and Azure (Prod/DR) environments, working closely with Application Development, DevOps, Infrastructure, Network, and Security teams to operate at scale using SRE principles.
Key Responsibilities
Reliability Engineering & Cloud Operations
Provide operational ownership for Digital Banking applications hosted on Windows Server / IIS across Azure and on-prem environments.
Apply SRE principles to improve service reliability, availability, and performance.
Define and execute operational best practices around stability, resiliency, and controlled change.
Support high-availability and disaster-recovery architectures across production and DR environments.
Monitoring, Observability & Incident Response (Core Focus)
Monitor platform and application health using Dynatrace and Splunk.
Perform advanced diagnostics using Windows Event Logs, PerfMon, and Azure Monitor.
Lead and participate in P1/P2 incident response, including bridge calls, real-time troubleshooting, and coordination across multiple teams.
Drive root cause analysis (RCA) and implement preventive and corrective actions.
Track and reduce operational toil through automation and engineering improvements.
Application & Platform Operations
Support application deployments, hotfixes, and production releases with a strong focus on safety and repeatability.
Manage the SSL/TLS certificate lifecycle, including renewals and configuration across IIS and load balancers.
Execute and coordinate OS and application patching using WSUS/SCCM and cloud tooling.
Support and optimize F5 / ADC load-balanced environments.
Security, Compliance & Governance
Enforce security and compliance controls, including RBAC, least-privilege access, and encryption in transit and at rest, across Active Directory, GPOs, service accounts, and secrets management.
Support audits, risk reviews, and control evidence collection.
Automation, CI/CD & Engineering Enablement
Build and maintain automation using PowerShell, DSC, and Ansible.
Partner with DevOps and AppDev teams to support CI/CD pipelines (Azure DevOps, GitHub Actions, Jenkins) for Windows/IIS workloads.
Improve deployment reliability, rollback strategies, and operational guardrails.
Contribute to platform designs supporting blue/green, canary, and zero-downtime deployments where applicable.
Required Qualifications (Must-Have)
Strong hands-on experience administering Windows Server (2016/2019/2022) in production environments.
Strong hands-on experience with IIS, including site configuration, application pools, bindings, performance tuning, and troubleshooting.
Hands-on, production experience with Dynatrace for application and infrastructure monitoring.
Hands-on, production experience with Splunk for log analysis, queries, dashboards, and troubleshooting.
Experience diagnosing system and application issues using Windows Event Logs and PerfMon.
Experience supporting high-severity production incidents, including ownership during incident bridges.
Working knowledge of TCP/IP, HTTP/S, TLS, and integrations with load balancers, WAFs, and reverse proxies.
Experience managing deployments, patching, SSL/TLS certificates, and formal change management processes.
Strong PowerShell scripting and automation experience; exposure to DSC and/or Ansible.
Experience operating workloads in Azure, including production and DR environments.
Working knowledge of Active Directory, GPOs, service accounts, and PKI/certificate management.
Bachelor's degree in Computer Science, Information Technology, or equivalent practical experience.

Salary.com Estimation for Site Reliability Engineer in Alpharetta, GA
$72,167 to $94,252



