Senior AI Infrastructure Engineer (LLMOps / MLOps)

AI Cybersecurity Company
San Jose, CA | Full Time
POSTED ON 12/17/2025
AVAILABLE BEFORE 1/15/2026

Are you passionate about AI and eager to make a significant impact in the cybersecurity space?


Join us at our cutting-edge AI startup in the San Francisco Bay Area, where we are assembling a world-class team to tackle some of the most pressing challenges in cybersecurity.

As a Senior AI Infrastructure Engineer, you will own the design, deployment, and scaling of our AI infrastructure and production pipelines. You’ll bridge the gap between our AI research team and engineering organization, enabling the deployment of advanced LLM and ML models into secure, high-performance production systems.


You will build APIs, automate workflows, optimize GPU clusters, and ensure our models perform reliably in real-world cybersecurity applications. This role is ideal for someone who thrives in a startup environment — hands-on, cross-functional, and driven to build world-class AI systems from the ground up.


Why Join Us:

  • $25M Seed Funding: We are well-funded, with $25 million raised in our seed round, giving us the resources to innovate and scale rapidly.
  • Proven Early Success: We’ve already partnered with Fortune 500 companies, demonstrating market traction and trust in our AI-driven cybersecurity solutions.
  • Experienced Leadership: Our founders are second- and third-time entrepreneurs with 25 years in cybersecurity who have led companies to valuations exceeding $3B.
  • World-Class Leadership Team: Heads of AI, Engineering, and Product come from top global tech companies, ensuring best-in-class mentorship and technical direction.
  • Cutting-Edge AI Solutions: We leverage the most advanced AI technologies, including Large Language Models (LLMs), Generative AI, and intelligent inference systems.
  • Generous Compensation: Competitive salary, meaningful equity, and a high-growth environment where your impact is recognized and rewarded.
  • Cybersecurity Knowledge Preferred but Not Required: We value strong AI/ML and infrastructure engineering talent above all — cybersecurity expertise can be learned on the job.



Key Responsibilities:


Core (Mission-Critical)

  • Own and manage the AI infrastructure stack — GPU clusters, vector databases, and model serving frameworks (vLLM, Triton, Ray, or similar).
  • Productionize LLMs and ML models developed by the AI team, deploying them into secure, monitored, and scalable environments.
  • Design and maintain REST/gRPC APIs for inference and automation, integrating tightly with the core cybersecurity platform.
  • Collaborate closely with AI scientists, backend engineers, and DevOps to streamline deployment workflows and ensure production reliability.


Infrastructure & Reliability

  • Build and maintain infrastructure-as-code (IaC) setups using Terraform or Pulumi for reproducible environments.
  • Implement observability and monitoring — latency, throughput, model drift, and uptime dashboards with Prometheus / Grafana / OpenTelemetry.
  • Automate CI/CD pipelines for model training, validation, and deployment using GitHub Actions, ArgoCD, or similar tools.
  • Architect scalable, hybrid AI systems across on-prem and cloud, enabling cost-effective compute scaling and fault tolerance.


Security, Data, and Performance

  • Enforce data privacy and compliance across AI pipelines (SOC 2, encryption, access control, VPC isolation).
  • Manage data and model artifacts, including versioning, lineage tracking, and storage for models, checkpoints, and embeddings.
  • Optimize inference latency, GPU utilization, and throughput, using batching, caching, or quantization techniques.
  • Build fallback and failover mechanisms to maintain service reliability in case of model or API failure.


Innovation & Leadership

  • Research and integrate emerging LLMOps and MLOps tools (e.g., LangGraph, Vertex AI, Ollama, Triton, Hugging Face TGI).
  • Create sandbox environments for AI researchers to experiment safely.
  • Lead cost optimization and capacity planning, forecasting GPU and cloud needs.
  • Document and maintain runbooks, architecture diagrams, and standard operating procedures.
  • Mentor junior engineers and contribute to a culture of operational excellence and continuous improvement.


Qualifications:


Required

  • 5 years of experience in ML Infrastructure, MLOps, or AI Platform Engineering.
  • Proven expertise with LLM serving, distributed systems, and GPU orchestration (e.g., Kubernetes, Ray, or vLLM).
  • Strong programming skills in Python and experience building APIs (FastAPI, Flask, gRPC).
  • Proficiency with cloud platforms (Azure, AWS, or GCP) and IaC tools (Terraform, Pulumi).
  • Solid understanding of CI/CD, containerization (Docker), and model registry practices.
  • Experience implementing observability, monitoring, and fault-tolerant deployments.


Preferred

  • Familiarity with vector databases and similarity-search libraries (FAISS, Pinecone, Weaviate, Qdrant).
  • Exposure to security or compliance-focused environments.
  • Experience with PyTorch / TensorFlow and MLflow / Weights & Biases.
  • Knowledge of distributed training or large-scale inference optimization (DeepSpeed, TensorRT, quantization).
  • Prior work at startups or fast-paced R&D-to-production environments.


Our Culture & Team

  • Collaborative Environment: Join a fast-moving, innovation-driven startup where every engineer has a direct impact.
  • World-Class Leadership: Mentorship from leaders with deep expertise in AI, ML, and cybersecurity.
  • Growth Opportunities: Access to professional development, top-tier conferences, and bleeding-edge AI projects.
  • Diversity and Inclusion: We believe that diverse perspectives drive stronger innovation.


Perks & Benefits

  • Comprehensive health, dental, and vision insurance.
  • Wellness and professional development stipends.
  • Equity options — share in the company’s success.
  • Access to the latest tools and GPUs for AI/ML development.

Salary.com Estimation for Senior AI Infrastructure Engineer (LLMOps / MLOps) in San Jose, CA
$155,685 to $197,232

