
Data Engineer

Prima Mente
San Francisco, CA Full Time
POSTED ON 12/30/2025
AVAILABLE BEFORE 1/28/2026
About Prima Mente

Prima Mente’s goal is to deeply understand the brain, to protect the brain from neurological disease and enhance the brain in health. We do this by generating our own data, building brain foundation models, and translating discovery to real clinical and research impact.

Role focus - Biological Data Infrastructure at Petabyte Scale

Key Tasks:

  • Owning and scaling our data infrastructure by several orders of magnitude to handle > 100 petabyte-scale multi-omic datasets, including data pipelines, distributed data processing, and storage systems
  • Building a unified feature store for all our ML models and biological data analysis workflows
  • Efficiently storing and loading petabytes of biological data for ML
  • Processing and storing predictions and evaluation metrics for large-scale biological forecasting and analysis models
  • Implementing data versioning and point-in-time correctness systems for evolving biological datasets
  • Building observable, debuggable data pipelines that handle the complexity of multi-omic data sources

Expected Growth

In 1 month you will be responsible for analyzing current data infrastructure bottlenecks, implementing initial optimizations to existing pipelines, and beginning work on scaling our feature store infrastructure for ML models.

In 3 months you'll directly own and have scaled key components of our data processing systems, built prototype streaming pipelines for real-time data ingestion, and contributed to designing our unified feature store architecture.

In 6 months you'll have implemented high-performance petabyte-scale data infrastructure, established data versioning and point-in-time correctness systems, and delivered measurable improvements in data processing throughput and reliability.

Why Join Us

  • Meaningful Impact: Contribute directly to research infrastructure that powers discoveries potentially impacting millions of lives.
  • Innovation & Autonomy: Work at the forefront of AI and multi-omics, with the freedom to propose and implement state-of-the-art infrastructure solutions.
  • Exceptional Team: Collaborate with talented colleagues from diverse backgrounds across ML, bioinformatics, and engineering.
  • Growth Opportunities: Continuous learning and growth opportunities in a rapidly advancing technical field.

Who You Are

We don’t expect you to check every box. Strong applicants often have depth in some of these and interest in growing into others.

  • 4 years of experience building data infrastructure or data platforms with demonstrated ability to solve complex distributed systems problems independently
  • Experience building infrastructure for large-scale data processing pipelines (both batch and streaming) using tools like Spark, Kafka, Apache Flink, Apache Beam, and with proprietary solutions like Nebius
  • Experience designing and implementing large-scale data storage systems (feature stores, timeseries DBs) for ML use cases, with strong familiarity with relational databases, data warehouses, object storage, and expertise in DB schema design
  • Experience with ML infrastructure, having worked at companies that use ML for core business functions
  • Experience building data pipelines for external data sources that are observable, debuggable, and verifiably correct, having dealt with challenges like data versioning, point-in-time correctness, and evolving schemas
  • Strong distributed systems and infrastructure skills - comfortable scaling and debugging Kubernetes services, writing Terraform, and working with orchestration tools like Flyte, Airflow, or Temporal
  • Experience with cloud platforms (AWS, GCP, Azure) and container technologies
  • Strong software engineering skills with ability to write easy-to-extend and well-tested code
  • Excellent communication skills and experience collaborating within multidisciplinary teams
  • Comfortable with ambiguity and a fast-moving environment, with a bias for action
  • Ability to learn and pick up new skills quickly
  • Familiarity with bioinformatics or biological data handling
  • Knowledge of data governance, compliance, and security standards relevant to healthcare or biotech

Location

Based in San Francisco, US or London, UK. We support visa applications.

Culture Insight

What we are doing is extremely hard. Prima Mente is for great people. We are team players who appreciate challenges, want to be hands-on, and thrive on curiosity by throwing away assumptions. We are focused on excellence at pace and huge personal growth. We are strong communicators who are highly disciplined and rigorous.

Prima Mente operates with a flat organizational structure. We gain and share knowledge by contributing to multiple opportunities. Leadership is given to those who show initiative and consistently deliver excellence.

We arrange our lives so we can work in person as much as possible.

Our Values

Exceptional performance at exceptional pace

  • The solutions we build demand uncompromising quality and rigour.
  • The problems we are solving are grave and present.

Inquisitive discovery

  • We embrace curiosity and creativity.
  • Every question is a path to a transformational breakthrough.

Radical candour

  • We practice unwavering honesty and transparency in all our challenges and interactions.

Purposeful individuality

  • Every individual in our team is celebrated for their identity, uniqueness, and experiences.
  • We are invested in each one’s bespoke personal development.
  • Nurturing individuality will supercharge our collective purpose and spirit.

Patient impact at scale

  • We have a steadfast commitment to improve the health and well-being of patients globally.
  • Every experiment run, every dataset analysed, and every innovation developed, is a step towards achieving a scalable impact.




