What are the responsibilities and job description for the Technical Senior Principal | Site Reliability Engineer | Kubernetes position at GM Financial?
Job Description
Why GMF Technology?
GM Financial is set to change the auto finance industry and is leading the way in tech modernization. We have a startup mindset and preserve our small-company culture within a public-company environment, with financial stability and intense growth over a decade-plus history. We are data junkies and trust in data and insights to advance our business objectives. We take our goal of zero emission, zero collision, zero congestion, and zero friction very seriously. We believe that, as an auto finance market leader, we are in the driver's seat to lead the GM EV mission to change the world. We are building global platforms in LATAM, Europe, China, the U.S. and Canada, and we are looking to grow our high-performing team. GMF comprises over 10,000 team members globally. Join our fintech culture within a blue-chip company where we are changing the way we use technology to support our customers, dealers and business.
Flexible hybrid work environment (onsite 3 days a week/2 days remote) at our Arlington (AOC1), TX office.
Responsibilities
About the Role
As a Senior Principal SRE, you will be the technical bar‑raiser for our centralized Kubernetes platform—setting strategy, owning reliability at fleet scale, and leading cross‑org engineering to deliver a self‑service, secure, and compliant platform. You will partner with Architecture, BPS, Cloud Ops, and Cyber to turn our roadmap into durable, automated capabilities that product teams adopt with minimal toil.
Top Outcomes You Will Drive
- Fleet‑level reliability strategy for shared and dedicated clusters, defining SLOs/SLIs and error budgets for the platform and golden patterns, with automated enforcement and reporting.
- Self‑service at scale: deliver Namespace‑as‑a‑Service and developer‑portal workflows that shrink onboarding from weeks to hours and unlock safe autonomy for product teams.
- Observability by default: land built‑in cluster/workload dashboards (Splunk APM, Azure Monitor/App Insights) and a robust RCA/Problem‑Management loop that closes the gap between incidents and engineering improvements.
- Multi‑cloud readiness: guide centralized Kubernetes deployment expansion to AWS and design portable patterns (identity, networking, GitOps) that remain cloud‑agnostic.
- Secure networking & policy: lead adoption of Calico Enterprise (DNS‑based policy, honey pods, central policy mgmt.) and staged rollout of stretched mesh/identity‑based access across clusters.
- Path to Kubernetes‑as‑a‑Serverless: influence the architecture that abstracts K8s, integrates pre‑connected services, and enforces governance/consistency with a service catalog and on‑demand APIs.
- Scale the operating model: codify the RACI, reduce reactive workload, shift‑left with support enablement, and build automation that lets a small core team support a large fleet.
- Own multi‑cluster reliability: capacity modeling, failure domain strategy, upgrade design (blue/green, surge, or secondary‑cluster) and chaos/DR exercises across shared & dedicated environments.
- Define and implement platform SLOs/SLIs (control plane, base stack, onboarding, GitOps, network policy propagation, secret/cert rotation) with automated alerts and error‑budget policies.
- Lead the design/implementation of Namespace‑as‑a‑Service; measure adoption, lead time, and customer effort score.
- Establish GitOps standards (Argo CD) for app and cluster configuration, including bootstrap, drift detection, and progressive delivery (blue/green, canary).
- Architect and land Calico/Tigera Enterprise and/or service mesh patterns (east‑west controls, identity‑based policies, multi‑cluster traffic mgmt.), with guardrails and paved‑road configs.
- Lead security & compliance by default: SR controls, RBAC baselines (Azure RBAC/workload identity), cert‑manager automation, patch cadence, and auditable change pipelines.
- Serve as principal‑level incident commander and RCA owner for platform incidents; convert findings into backlog items, patterns, and training.
- Partner with the necessary teams to scale operations and refine RACI; implement charge/show‑back models for high‑touch migrations when appropriate.
- Mentor Staff/Principal engineers; raise the bar on design docs, ADRs, runbooks, and knowledge sharing across the platform and product teams.
What makes you a dream candidate?
Knowledge And Skills
- Deep experience with GitOps (Argo CD), service mesh (Istio/Linkerd), Calico/Tigera, cert‑manager, secret engines, and workload identity.
- Strong IaC/automation: Terraform, Azure DevOps (YAML), CI/CD policy gates, automated security controls.
- Observability at scale: Splunk APM, Azure Monitor, Application Insights; golden dashboards and SLO pipelines.
- Distributed systems fundamentals: performance, scalability, capacity, and reliability.
- Excellent communication; ability to lead across org boundaries and mentor senior engineers.
- High School Diploma or equivalent required
- Bachelor’s Degree or Associate Degree plus 2 additional years of relevant experience required
- 12 years in related function(s) required
- 5-7 years of experience leading through mentorship in related field required
- 5-7 years of experience driving thought leadership and innovation across products required
- Multi‑cluster and multi‑region upgrade strategies (surge/blue‑green), active‑active patterns, and zero‑downtime migrations.
- Network policy at scale (DNS‑based policies), L7 authorization, east‑west security controls.
- Self‑service developer portals and onboarding workflows; measuring adoption and customer effort.
- FinOps for Kubernetes (charge/show‑back, pod‑level cost breakdown), quota guardrails, and capacity/right‑sizing automation.
- Experience with Kubernetes platform abstraction and curated service catalogs.
- Expert in SRE: SLO/SLI design, error budgets, incident command, RCA/Problem Management, chaos/DR.
Our Culture: Our team members define and shape our culture — an environment that welcomes innovative ideas, fosters integrity, and creates a sense of community and belonging. Here we do more than work — we thrive.
Compensation: Competitive pay and bonus eligibility
Work Life Balance: Flexible hybrid work environment, 2 days a week in office
#GMFjobs