Research Engineer - Data Infrastructure

Seer
Palo Alto, CA Full Time
POSTED ON 5/28/2026
AVAILABLE BEFORE 6/25/2026

Senior / Staff Data Infrastructure Machine Learning Engineer


We are building advanced intelligent systems designed to operate in complex real-world environments. Our team develops the full stack — from high-performance hardware and distributed systems infrastructure to large-scale machine learning platforms and multimodal foundation models.


Backed by significant funding, we work at the intersection of AI, infrastructure, and large-scale systems engineering, and we are investing heavily in research and production-scale deployment to build next-generation intelligent systems.

We are hiring Senior and Staff-level Data Infrastructure Machine Learning Engineers to scale the systems powering our ML training data platform — from ingestion and storage to indexing, retrieval, observability, and throughput optimization across massive multimodal datasets.


What You’ll Do

Build and Scale High-Throughput Data Infrastructure

  • Architect, build, and operate distributed data infrastructure capable of processing and managing billions of video and multimodal data samples
  • Design systems with strong guarantees around reliability, latency, scalability, and cost efficiency
  • Optimize cloud object storage, metadata systems, databases, and large-scale distributed storage architectures


Develop Large-Scale Indexing and Retrieval Systems

  • Build efficient indexing and retrieval systems to support rapid dataset querying, filtering, and iteration
  • Improve data access patterns and retrieval performance for research and production ML workflows
  • Design scalable metadata and search infrastructure for multimodal datasets

Improve Observability and Reliability


  • Develop monitoring, alerting, failure recovery, and performance optimization frameworks for large-scale data pipelines
  • Build tooling to identify bottlenecks and improve operational visibility across distributed systems
  • Optimize workload balancing and throughput across distributed compute and storage infrastructure


Manage Data Lifecycle and Reproducibility

  • Build systems for artifact management, dataset versioning, lineage tracking, and reproducibility across training workflows
  • Ensure traceability and consistency across evolving datasets and training runs
  • Develop lightweight internal tooling enabling engineers and researchers to explore and analyze data at scale


Support ML and Vision-Language Workloads

  • Integrate and scale vision-language model (VLM) inference within distributed data pipelines
  • Support automated enrichment, filtering, metadata generation, and preprocessing workflows
  • Collaborate closely with ML systems and research teams to improve data quality and training velocity


What We’re Looking For

  • 5 years of experience in data infrastructure, distributed systems, ML infrastructure, or related fields
  • Strong experience building and operating large-scale distributed data pipelines
  • Deep understanding of:
      • Distributed systems architecture
      • Databases and metadata systems
      • Indexing and retrieval strategies
      • Cloud storage architectures
  • Experience optimizing throughput, workload balancing, and cost-performance tradeoffs in cloud environments
  • Hands-on experience with distributed processing frameworks such as Ray or Spark
  • Strong observability, monitoring, and production reliability experience
  • Strong software engineering fundamentals with the ability to own systems end-to-end


Level Expectations

  • Senior engineers are expected to execute complex systems work with strong technical depth and increasing ownership
  • Staff-level engineers are expected to define architectural direction, drive technical strategy, and independently lead major infrastructure initiatives


Preferred Experience

  • Experience managing large multimodal datasets
  • Familiarity with ML training workflows and data lifecycle management
  • Experience running large-scale ML inference workloads in distributed or cloud environments
  • Familiarity with vision-language models (VLMs)
  • Experience working with real-world sensor data such as video, telemetry, or time-series streams
  • Familiarity with data warehouse technologies such as Snowflake, BigQuery, or Redshift
  • Experience with data versioning and lineage systems such as DVC, Delta Lake, or similar tooling


Why This Role Matters

  • Build the foundational data infrastructure that directly impacts model quality and system performance
  • Collaborate closely with ML systems and research teams on problems with immediate and measurable impact
  • Operate with high ownership in a small, highly technical environment
  • Help scale intelligent systems operating in real-world environments


About the Company

We are a research-driven AI company focused on building scalable intelligent systems capable of robust operation in dynamic environments. By combining advances in machine learning, distributed systems, and infrastructure engineering, we aim to push the frontier of large-scale AI systems.


We are committed to building an inclusive and diverse workplace and encourage applicants from all backgrounds to apply.

Salary: $250,000 - $400,000
