What are the responsibilities and job description for the Site Reliability Engineer (SRE) – Security, Patching & Observability position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Georgia IT, is seeking the following. Apply via Dice today!
Job Title: Site Reliability Engineer (SRE) – Security, Patching & Observability
Location: Seattle, WA (4 Days Onsite)
Employment Type: Contract
Role Summary
We are looking for a hands-on Site Reliability Engineer (SRE) who will take ownership of server security, vulnerability management, and system observability. This role blends reliability engineering with proactive risk mitigation—ensuring infrastructure remains secure, compliant, and highly available.
You will play a key role in identifying vulnerabilities, driving remediation strategies, automating patching processes, and enhancing system visibility using modern observability tools.
Key Responsibilities
Security & Vulnerability Management
- Own and enhance enterprise-wide vulnerability management processes.
- Analyze and prioritize vulnerabilities across infrastructure, OS, and applications.
- Partner with security and engineering teams to drive timely remediation.
- Ensure compliance with internal policies and industry security standards.
- Utilize tools like Brinqa, Qualys, or similar platforms for risk aggregation and reporting.
Patch Management
- Plan and execute patching cycles for Windows and Linux servers.
- Track and improve patch compliance metrics (e.g., mean time to patch, MTTP).
- Build automation scripts to streamline patching and reduce manual effort.
- Standardize patching processes to minimize downtime and operational risk.
Observability & Monitoring
- Design and maintain monitoring solutions using Datadog.
- Build dashboards, alerts, and metrics for proactive issue detection.
- Define and track SLIs, SLOs, and SLAs for critical services.
- Perform incident analysis and drive reliability improvements.
Collaboration & Incident Response
- Work closely with application, platform, and security teams.
- Participate in on-call rotations and incident response.
- Lead root cause analysis and post-incident reviews.
- Advocate for best practices in reliability, security, and scalability.
Required Skills
- Strong experience with Windows Server and Linux administration
- Hands-on expertise in vulnerability management & remediation
- Experience with tools like Brinqa, Qualys, or equivalent
- Proven experience with Datadog or similar observability platforms
- Solid knowledge of Microsoft Azure and/or on-prem infrastructure
- Experience with Docker and Kubernetes (K8s)
- Familiarity with CI/CD pipelines and GitOps (ArgoCD preferred)
- Strong scripting skills (Python, PowerShell, Bash)
- Understanding of networking fundamentals (TCP/IP, DNS, Load Balancing)
- Experience in incident management and production support