AI / ML Developer (Senior)

Infogain
Menlo Park, CA Full Time
POSTED ON 4/5/2026
AVAILABLE BEFORE 5/2/2026
Roles & Responsibilities

We are seeking an experienced AI/ML Infrastructure & Ops Engineer to build, scale, and maintain the critical infrastructure that powers our AI models and autonomous agents. In this role, you will act as the bridge between our AI research/development teams and our production environments. You will not just be deploying models; you will be designing the high-performance, distributed systems required to serve Large Language Models (LLMs), orchestrate multi-agent workflows, and optimize GPU compute at scale.

If you are passionate about turning complex AI capabilities into highly reliable, scalable, and cost-efficient production systems, this is the role for you.

Key Responsibilities

Machine Learning Infrastructure & Serving

  • Design, build, and manage scalable infrastructure for training, fine-tuning, and serving LLMs and multimodal models.
  • Optimize inference latency, throughput, and cost using modern serving frameworks (e.g., vLLM, Triton Inference Server, Ray Serve) [2].
  • Manage and orchestrate GPU/TPU clusters, ensuring high utilization and efficient resource allocation.

Building and Scaling Agentic Operations (AgentOps)

  • Architect and deploy infrastructure to support autonomous AI agents and multi-agent systems.
  • Integrate and maintain agent orchestration frameworks (e.g., LangGraph, CrewAI) within production environments [3].
  • Build robust state management and memory systems (vector databases, graph databases) required for agentic workflows.

Observability, Evaluation, and Reliability

  • Implement comprehensive observability stacks tailored for LLMs and agents (tracing, prompt logging, cost tracking) using tools like Langfuse, Arize, or Datadog [4].
  • Design automated evaluation pipelines to monitor agent performance, safety, and reliability in real time (LLMOps/AgentOps).
  • Act as the first line of defense for production AI systems, diagnosing and resolving issues related to memory limits, inference queues, and cluster failures.

Developer Platform & CI/CD for AI

  • Build internal developer platforms and tooling that allow AI engineers and data scientists to easily deploy models and agents to production.
  • Adapt traditional CI/CD pipelines to accommodate model versioning, prompt management, and continuous evaluation.

Qualifications

Required Skills

  • Systems Engineering: Strong background in distributed systems, backend engineering, or DevOps/SRE.
  • Programming: Proficiency in Python (essential for the AI ecosystem) and systems languages like Go or Rust.
  • Containerization & Orchestration: Deep expertise in Kubernetes (K8s), Docker, and infrastructure-as-code (Terraform, Pulumi).
  • AI/ML Tooling: Hands-on experience with LLM serving engines (vLLM, TGI, Triton) and distributed computing frameworks (Ray) [2].
  • Agent Frameworks: Familiarity with modern agentic development frameworks like LangChain, LangGraph, or CrewAI [3].
  • Cloud & Hardware: Experience managing high-performance compute (GPUs/TPUs) on major cloud providers (AWS, GCP, Azure) or bare-metal clusters.

Preferred Skills

  • Experience with vector databases (Pinecone, Milvus, Qdrant) and retrieval-augmented generation (RAG) pipelines.
  • Understanding of model optimization techniques (quantization, LoRA, KV caching).
  • Previous experience building platforms from the ground up in a high-growth environment.

Experience

  • 6-8 Years

Skills

  • Primary Skill: AI/ML Development
  • Sub Skill(s): AI/ML Development
  • Additional Skill(s): AI/ML Development, TensorFlow

About The Company

Infogain is a human-centered digital platform and software engineering company based out of Silicon Valley. We engineer business outcomes for Fortune 500 companies and digital natives in the technology, healthcare, insurance, travel, telecom, and retail & CPG industries using technologies such as cloud, microservices, automation, IoT, and artificial intelligence. We accelerate experience-led transformation in the delivery of digital platforms. Infogain is also a Microsoft (NASDAQ: MSFT) Gold Partner and Azure Expert Managed Services Provider (MSP).

Infogain, an Apax Funds portfolio company, has offices in California, Washington, Texas, the UK, the UAE, and Singapore, with delivery centers in Seattle, Houston, Austin, Kraków, Noida, Gurgaon, Mumbai, Pune, and Bengaluru.

Salary.com Estimation for AI / ML Developer (Senior) in Menlo Park, CA
$139,167 to $179,210
