
SRE/Platform Engineer (OpenShift/Kubernetes) 4660

Tier4 Group
Washington, WA Full Time
POSTED ON 4/25/2026
AVAILABLE BEFORE 11/27/2026

Site Reliability Engineer (SRE) / Platform Engineer

Location: Reston, VA (Hybrid — 2 days onsite / 3 days remote)

Employment Type: Full-time


About the Organization

Join a mission-driven, national financial services organization at the heart of the U.S. housing finance ecosystem. This is a mid-sized, highly regulated enterprise operating at market scale—supporting platforms and analytics that enable trillions of dollars in annual economic activity. You’ll work in a modern tech environment with strong engineering partners, clear business impact, and a mandate for reliability, security, and continuous improvement.


The Role

Our client is hiring a hands-on SRE / Platform Engineer to operate, tune, and scale its OpenShift/Kubernetes platforms while bridging on-prem to Azure to power its analytics ecosystem. You’ll own reliability, automation, and observability across a hybrid estate, partnering closely with developers, data engineers, infrastructure operations, and security to deliver secure, performant platform services using modern DevSecOps practices.


Why This Role Stands Out

  • Hybrid impact: Operate critical OpenShift clusters and manage Azure services used by data and analytics teams.
  • Hybrid architecture: Help design and support the bridge from on-prem to cloud—migration, integration, and steady-state operations.
  • Real-world scale: Reliability work that directly supports high-volume financial market operations and enterprise analytics.
  • Automation-first: Lean into Terraform, Ansible, and GitOps to make reliability repeatable.


What You’ll Do in the First 180 Days

  • Operate, tune, and optimize OpenShift/Kubernetes clusters (scheduling, ingress, upgrades, quotas, policies).
  • Stand up and/or refine observability (Datadog, Prometheus, Grafana)—dashboards, alerts, SLOs, runbooks.
  • Map current hybrid topology and critical delivery pipelines; identify toil and prioritize automation (Terraform/Ansible).
  • Begin supporting Azure environments (compute, networking, storage, data services) used by analytics teams.
  • Drive GitOps-first workflows; harden CI/CD with ArgoCD/Jenkins/GitHub Actions and policy-as-code guardrails.
  • Implement or enhance platform services (Vault, Kafka/AMQ, ingress, service mesh) for dev and data teams.
  • Lead incident response and postmortems; institutionalize RCA, blameless learning, and continuous improvement.
  • Advance the hybrid service model—migrations, integrations, reliability/latency tuning, cost and performance optimization.


Day-to-Day Responsibilities

  • Operate and optimize OpenShift/Kubernetes clusters, ingress (e.g., Nginx), and container networking/service mesh.
  • Manage Azure services (compute, VNet, storage, data services) supporting analytics workloads.
  • Build and maintain automated infrastructure with Terraform, Ansible, and GitOps workflows.
  • Implement and evolve observability (Datadog, Prometheus, Grafana): metrics, traces, logs, alerting, SLOs, runbooks.
  • Design, harden, and support delivery pipelines with ArgoCD/Jenkins/GitHub Actions.
  • Provide platform tooling and enablement for application developers, data engineers, and operations teams.
  • Ensure security and access management (HashiCorp Vault, secrets management, least privilege).
  • Lead incident response, coordinate cross-functional resolution, and drive corrective actions and platform improvements.
  • Script or develop tools in Bash, Python, or Go to eliminate toil and improve developer experience.


Tech You’ll Work With

  • Kubernetes / OpenShift
  • Azure (compute, networking, storage, and data services)
  • Automation & IaC: Terraform, Ansible, GitOps
  • Observability: Datadog, Prometheus, Grafana
  • Networking & Ingress: Nginx, service meshes, container networking
  • Messaging: Kafka, AMQ
  • Secrets & Access: HashiCorp Vault
  • CI/CD: ArgoCD, Jenkins, GitHub Actions
  • Scripting/Coding: Bash, Python, Go


Must-Have Qualifications

  • 2 years of hands-on experience operating and managing Kubernetes and OpenShift clusters.
  • Strong experience with Microsoft Azure (compute, networking, storage, and data services).
  • Proven skills in automation and Infrastructure-as-Code (Terraform, Ansible, GitOps).
  • Proficiency with observability tooling (Datadog, Prometheus, Grafana).
  • Scripting/coding ability in Bash, Python, or Go.


Preferred / Stand-Out Skills

  • Experience bridging on-prem and cloud in a hybrid service model (migration, integration, optimization).
  • Expertise with Kafka/AMQ, HashiCorp Vault, and ArgoCD/Jenkins/GitHub Actions.
  • Background leading incident response and postmortems with strong RCA and continuous improvement practices.


Work Model & Team

  • Hybrid: 2 days onsite in Reston, VA; 3 days remote.
  • You’ll be part of the IT organization, collaborating daily with developers, data engineers, infrastructure operations, and security.


How to Succeed Here

  • You’re a hands-on engineer who thrives in regulated, high-impact environments.
  • You favor automation over repetition, and observability over guesswork.
  • You collaborate openly, communicate clearly, and leave systems better than you found them.

Salary.com Estimation for SRE/Platform Engineer (OpenShift/Kubernetes) 4660 in Washington, WA
$98,763 to $116,616

