
Technical Senior Principal | Site Reliability Engineer | Kubernetes

GM Financial
Arlington, TX Full Time
POSTED ON 4/5/2026
AVAILABLE BEFORE 5/4/2026
Job Description

Why GMF Technology?

GM Financial is set to change the auto finance industry and is leading the way in tech modernization. We have a startup mindset and preserve our small-company culture in a public-company environment, with financial stability and intense growth over a decade-plus history. We are data junkies and trust in data and insights to advance our business objectives. We take our goal of zero emission, zero collision, zero congestion, and zero friction very seriously. We believe that, as an auto finance market leader, we are in the driver's seat to lead the GM EV mission to change the world. We are building global platforms in LATAM, Europe, China, the U.S., and Canada, and we are looking to grow our high-performing team. GMF comprises over 10,000 team members globally. Join our fintech culture within a blue-chip company where we are changing the way we use technology to support our customers, dealers, and business.

Flexible hybrid work environment (onsite 3 days a week/2 days remote) at our Arlington (AOC1), TX office.

Responsibilities

About the Role

As a Senior Principal SRE, you will be the technical bar‑raiser for our centralized Kubernetes platform—setting strategy, owning reliability at fleet scale, and leading cross‑org engineering to deliver a self‑service, secure, and compliant platform. You will partner with Architecture, BPS, Cloud Ops, and Cyber to turn our roadmap into durable, automated capabilities that product teams adopt with minimal toil.

Top Outcomes You Will Drive

  • Fleet‑level reliability strategy for shared and dedicated clusters, defining SLOs/SLIs and error budgets for the platform and golden patterns, with automated enforcement and reporting.
  • Self‑service at scale: deliver Namespace‑as‑a‑Service and developer‑portal workflows that shrink onboarding from weeks to hours and unlock safe autonomy for product teams.
  • Observability by default: land built‑in cluster/workload dashboards (Splunk APM, Azure Monitor/App Insights) and a robust RCA/Problem‑Management loop that closes the gap between incidents and engineering improvements.
  • Multi‑cloud readiness: guide centralized Kubernetes deployment expansion to AWS and design portable patterns (identity, networking, GitOps) that remain cloud‑agnostic.
  • Secure networking & policy: lead adoption of Calico Enterprise (DNS‑based policy, honey pods, central policy mgmt.) and staged rollout of stretched mesh/identity‑based access across clusters.
  • Path to Kubernetes-as-a-Serverless: influence the architecture that abstracts K8s, integrates pre‑connected services, and enforces governance/consistency with a service catalog and on‑demand APIs.
  • Scale the operating model: codify the RACI, reduce reactive workload, shift‑left with support enablement, and build automation that lets a small core team support a large fleet.

Core Responsibilities

  • Own multi‑cluster reliability: capacity modeling, failure domain strategy, upgrade design (blue/green, surge, or secondary‑cluster) and chaos/DR exercises across shared & dedicated environments.
  • Define and implement platform SLOs/SLIs (control plane, base stack, onboarding, GitOps, network policy propagation, secret/cert rotation) with automated alerts and error‑budget policies.
  • Lead the design/implementation of Namespace‑as‑a‑Service; measure adoption, lead time, and customer effort score.
  • Establish GitOps standards (Argo CD) for app and cluster configuration, including bootstrap, drift detection, and progressive delivery (blue/green, canary).
  • Architect and land Calico/Tigera Enterprise and/or service mesh patterns (east‑west controls, identity‑based policies, multi‑cluster traffic mgmt.), with guardrails and paved‑road configs.
  • Lead security & compliance by default: SR controls, RBAC baselines (Azure RBAC/workload identity), cert‑manager automation, patch cadence, and auditable change pipelines.
  • Serve as principal‑level incident commander and RCA owner for platform incidents; convert findings into backlog items, patterns, and training.
  • Partner with the necessary teams to scale operations and refine RACI; implement charge/show‑back models for high‑touch migrations when appropriate.
  • Mentor Staff/Principal engineers; raise the bar on design docs, ADRs, runbooks, and knowledge sharing across the platform and product teams.

Qualifications

What makes you a dream candidate?

Knowledge And Skills

  • Deep experience with GitOps (Argo CD), service mesh (Istio/Linkerd), Calico/Tigera, cert‑manager, secret engines, and workload identity.
  • Strong IaC/automation: Terraform, Azure DevOps (YAML), CI/CD policy gates, automated security controls.
  • Observability at scale: Splunk APM, Azure Monitor, Application Insights; golden dashboards and SLO pipelines.
  • Distributed systems fundamentals: performance, scalability, capacity, and reliability.
  • Excellent communication; ability to lead across org boundaries and mentor senior engineers.

Experience And Education

  • High School Diploma or equivalent required
  • Bachelor’s Degree or Associate Degree plus 2 additional years of relevant experience required
  • 12 years in related function(s) required
  • 5-7 years of experience leading through mentorship in related field required
  • 5-7 years of experience driving thought leadership and innovation across products required

Preferred Skills

  • Multi‑cluster and multi‑region upgrade strategies (surge/blue‑green), active‑active patterns, and zero‑downtime migrations.
  • Network policy at scale (DNS‑based policies), L7 authorization, east‑west security controls.
  • Self‑service developer portals and onboarding workflows; measuring adoption and customer effort.
  • FinOps for Kubernetes (charge/show‑back, pod‑level cost breakdown), quota guardrails, and capacity/right‑sizing automation.
  • Experience with Kubernetes platform abstraction and curated service catalogs.
  • Expert in SRE: SLO/SLI design, error budgets, incident command, RCA/Problem Management, chaos/DR.

What We Offer: Generous benefits package available on day one to include: 401K matching, bonding leave for new parents (12 weeks, 100% paid), tuition assistance, training, GM employee auto discount, community service pay and nine company holidays.

Our Culture: Our team members define and shape our culture — an environment that welcomes innovative ideas, fosters integrity, and creates a sense of community and belonging. Here we do more than work — we thrive.

Compensation: Competitive pay and bonus eligibility

Work Life Balance: Flexible hybrid work environment, 2 days a week in office

#GMFjobs

Salary.com Estimation for Technical Senior Principal | Site Reliability Engineer | Kubernetes in Arlington, TX
$174,469 to $209,688

