
Infrastructure Engineer (Hybrid Cloud & Platform)

Aldea
San Francisco, CA Full Time
POSTED ON 11/27/2025 CLOSED ON 12/26/2025

What are the responsibilities and job description for the Infrastructure Engineer (Hybrid Cloud & Platform) position at Aldea?

Location: US Remote / Bay Area

Job Type: Full-time

Level: Mid-Level / Senior

About Aldea

Aldea is a multi-modal foundational AI company reimagining the scaling laws of intelligence. We believe today's architectures create unnecessary bottlenecks for the evolution of software. Our mission is to build the next generation of foundational models that power a more expressive, contextual, and intelligent human–machine interface.

The Mission

We are seeking an Infrastructure Engineer to bridge the gap between complex hybrid infrastructure and developer velocity. You will architect a unified platform spanning AWS and Bare Metal Kubernetes.

At this level, you bring technical direction and expertise to the table. You will participate in planning and discussion for architecting resilient infrastructure, drive cross-team initiatives, and mentor other engineers while remaining deeply hands-on. Your ultimate goal is to build a "Golden Path" for engineering: automated releases, deep observability, and a platform experience that feels invisible to the end user.

Key Responsibilities

  • Hybrid Infrastructure & Bare Metal (AWS + K8s)
  • Unified IaC Strategy: Architect and maintain the Terraform codebase for both AWS services (EKS, RDS, VPC) and Bare Metal clusters. You will treat physical infrastructure as mutable software, using tools like Cluster API, Metal3, or Tinkerbell to manage hardware lifecycles.
  • Bare Metal Mastery: Manage multiple production clusters on bare metal with clear separation of environments. You will solve complex challenges including networking (BGP, ECMP), load balancing (MetalLB/Kube-VIP), and storage orchestration (CSI/Rook-Ceph) for stateful workloads.
  • Observability & AI Monitoring
  • Full-Stack Visibility: Contribute to building our observability stack (Prometheus, Grafana, ELK/Loki) to monitor both EKS and bare metal clusters.
  • AI/GPU Telemetry: Build specialized dashboards for AI workloads. You will track GPU metrics, CPU saturation, and memory pressure to ensure efficient resource utilization.
  • CI/CD & Release Architecture
  • CI/CD at Scale: Architect resilient, multi-region pipelines using GitHub Actions, and automate continuous delivery for apps using ArgoCD. You will build and manage a fleet of self-hosted runners to control costs and accelerate feedback loops.
  • Secure Release Engineering: Implement end-to-end workflows: Docker image build → Helm chart release → deployment (GH Actions → ArgoCD). Apply semantic versioning, manage artifacts in centralized registries, and integrate vulnerability scanning.
  • Leadership & Collaboration
  • Technical Direction: Lead design reviews and drive platform roadmaps that balance reliability, cost, and developer productivity.
  • Cross-Functional Partnership: Partner with product, security, and application teams to translate business needs into robust platform capabilities.

Requirements

  • Experience: Infrastructure, DevOps, or SRE roles, with primary ownership of production systems in AWS and Bare Metal Kubernetes.
  • Technical Arsenal: Expert fluency in Terraform, Linux/Bash or Python scripting, GitHub Actions, and ArgoCD.
  • Bare Metal & K8s: Proven experience operating Kubernetes in production, including hybrid setups (EKS + On-Prem). You understand networking (CNI, BGP), storage (CSI), and cluster lifecycle management.
  • Observability Depth: You have moved beyond "out-of-the-box" dashboards. You understand high-cardinality metrics, log retention strategies, and how to debug distributed systems.
  • Platform Mindset: You don't just build servers; you build products for developers.

Bonus

  • Experience with OpenTelemetry (OTEL) for unified tracing.
  • Understanding of eBPF.
  • Experience configuring NVIDIA DCGM for GPU monitoring and handling AI training/inference workloads.

Aldea is proud to be an equal-opportunity employer. We are committed to building a diverse and inclusive culture that celebrates authenticity to win as one. We do not discriminate on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, disability, protected veteran status, citizenship or immigration status, or any other legally protected characteristics.

Aldea uses E-Verify to confirm employment eligibility in compliance with federal law. For more information please visit: https://www.e-verify.gov.

Please note: We do not accept unsolicited resumes from recruiters or employment agencies and will not be responsible for any fees related to unsolicited resumes.

Salary.com Estimation for Infrastructure Engineer (Hybrid Cloud & Platform) in San Francisco, CA
$115,694 to $145,622

This job has expired.