Principal Engineer, High Performance Data & Algorithm Infrastructure

Foresite Labs
San Diego, CA | Full Time
POSTED ON 4/5/2026
AVAILABLE BEFORE 5/3/2026
Principal Engineer, High-Performance Data Pipeline & Infrastructure

Location: San Diego, CA

Job Type: Full-Time

Salary Range: $258,000 - $275,000

Position Overview

We are looking for a Principal Engineer to architect, build, and own the end-to-end data pipeline that drives our high-throughput diagnostic instrument platform: from real-time image acquisition on the instrument, through GPU-accelerated signal processing, to offloading for secondary and tertiary analysis on local HPC clusters and cloud infrastructure.

This is a technical leadership role for an engineer who can design and deliver industrial-grade data processing infrastructure that operates reliably at sustained high throughput. You will be responsible for the full data path: acquiring raw image data from sensors, processing it through GPU pipelines, orchestrating job distribution across local HPC and cloud compute, and ensuring the entire system handles errors, backpressure, and recovery gracefully. The scope spans instrument-embedded software, on-premises Linux HPC infrastructure, and cloud-based compute and storage.

The central challenge of this role is not raw compute optimization; GPU and CPU resources will have adequate headroom. The challenge is building a pipeline architecture that is robust, scalable, and evolvable as instrument throughput increases with each generation, the number of instruments grows, and data volumes scale accordingly. You will design systems that keep a complex multi-stage pipeline running continuously and reliably in a production lab environment, and that can be evolved without wholesale re-architecture as requirements intensify.

Key Responsibilities

End-to-End Data Pipeline Architecture

  • Own the architecture of the complete data path from image acquisition to final processed output
  • Design pipeline stages with clear interfaces, flow control, and backpressure mechanisms
  • Ensure the pipeline sustains continuous high-throughput operation across extended instrument runs
  • Define data formats, handoff protocols, and buffering strategies between pipeline stages
  • Architect for graceful degradation — the system must handle transient failures without data loss or pipeline stalls
  • Establish performance budgets and SLAs for each pipeline stage and monitor adherence
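The flow-control and backpressure ideas above can be sketched with bounded queues between stages: a full downstream queue blocks the producing stage, which propagates pressure upstream instead of buffering without limit. This is a minimal illustrative sketch, not our production design; stage names, capacities, and the threading model are assumptions for the example.

```python
import queue
import threading

acquire_q = queue.Queue(maxsize=4)   # acquisition -> processing
process_q = queue.Queue(maxsize=4)   # processing -> sink

def stage(in_q, out_q, transform):
    """Pull items, process, and push downstream. A full out_q blocks the
    put(), which in turn backs up this stage's own input queue."""
    while True:
        item = in_q.get()
        if item is None:             # sentinel: forward shutdown and exit
            out_q.put(None)
            return
        out_q.put(transform(item))   # blocks when downstream is saturated

results = []

def sink(in_q):
    """Drain the final stage so upstream stages never deadlock."""
    while True:
        item = in_q.get()
        if item is None:
            return
        results.append(item)

workers = [
    threading.Thread(target=stage, args=(acquire_q, process_q, lambda x: x * 2)),
    threading.Thread(target=sink, args=(process_q,)),
]
for w in workers:
    w.start()
for frame in range(8):
    acquire_q.put(frame)             # blocks if the whole pipeline backs up
acquire_q.put(None)                  # clean shutdown
for w in workers:
    w.join()
```

The bounded `maxsize` is the design point: it caps end-to-end buffering so a stalled stage surfaces as blocked producers rather than unbounded memory growth.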

Image Acquisition & On-Instrument Processing

  • Develop and optimize real-time image acquisition from high-speed sensors on the instrument
  • Implement low-latency, high-bandwidth data capture with minimal frame loss
  • Design on-instrument preprocessing stages that reduce data volume before offload
  • Manage memory and storage constraints within the instrument compute environment
  • Ensure deterministic, repeatable performance under sustained acquisition loads

GPU-Accelerated Signal & Image Processing

  • Develop and maintain GPU compute pipelines using CUDA for signal and image processing
  • Implement DSP algorithms including frequency-domain analysis, deconvolution, filtering, and detection
  • Manage host-to-GPU data transfers and ensure efficient use of GPU resources
  • Profile GPU workloads to identify issues and validate performance headroom
  • Balance numerical accuracy against throughput requirements
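As one concrete instance of the frequency-domain work described above, a Wiener-style deconvolution round trip can be sketched as follows. NumPy stands in here for the CUDA FFT path a production pipeline would use, and the PSF and regularization values are illustrative assumptions, not parameters from our system.

```python
import numpy as np

def wiener_deconvolve(observed, psf, snr=1e-4):
    """Frequency-domain Wiener deconvolution (circular convolution model).
    The snr term regularizes frequencies where the PSF response is weak,
    trading a little accuracy for numerical stability."""
    H = np.fft.fft(psf, n=observed.size)       # zero-padded PSF spectrum
    Y = np.fft.fft(observed)
    G = np.conj(H) / (np.abs(H) ** 2 + snr)    # Wiener inverse filter
    return np.real(np.fft.ifft(G * Y))

# Illustrative round trip: blur a known signal with a smoothing PSF,
# then recover it.
rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
psf = np.array([0.6, 0.3, 0.1])
observed = np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(psf, n=256)))
recovered = wiener_deconvolve(observed, psf)
```

The `snr` constant makes the accuracy/throughput tradeoff explicit: a larger value is more robust to noise but attenuates more of the true signal.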

Job Orchestration & Distributed Processing

  • Design and implement job queuing, scheduling, and orchestration across instrument, local HPC, and cloud compute
  • Build robust work distribution that maximizes resource utilization across heterogeneous compute
  • Implement backpressure handling so upstream stages throttle gracefully when downstream is saturated
  • Design comprehensive error handling, retry logic, and dead-letter strategies for failed jobs
  • Ensure jobs are idempotent and recoverable — partial failures must not corrupt the pipeline
  • Implement priority scheduling to balance real-time instrument processing with batch reprocessing
  • Monitor queue depths, processing latencies, and resource utilization with actionable alerting
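The retry, idempotency, and dead-letter requirements above can be made concrete with a small sketch. It assumes an in-memory completion store; a real orchestrator would persist this state so redelivered jobs remain no-ops across restarts.

```python
import time

class JobRunner:
    """Sketch of idempotent job execution with retries and a dead-letter
    list. Completion state is kept in memory here for illustration only."""

    def __init__(self, max_retries=3, base_delay=0.0):
        self.completed = {}          # job_id -> result (idempotency record)
        self.dead_letter = []        # jobs that exhausted their retries
        self.max_retries = max_retries
        self.base_delay = base_delay

    def run(self, job_id, fn):
        if job_id in self.completed:  # redelivery of a finished job: no-op
            return self.completed[job_id]
        for attempt in range(self.max_retries + 1):
            try:
                result = fn()
            except Exception:
                time.sleep(self.base_delay * 2 ** attempt)  # exponential backoff
            else:
                self.completed[job_id] = result
                return result
        self.dead_letter.append(job_id)  # give up; park for investigation
        return None
```

Keying completion on `job_id` is what makes partial failure safe: the same job can be re-enqueued by any upstream stage without running twice.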

Linux Systems & Performance

  • Configure and tune Linux systems for reliable, high-throughput operation across instrument and HPC nodes
  • Tune kernel parameters (scheduler, NUMA, IRQs, huge pages) as needed for stable pipeline performance
  • Understand and manage DMA paths, PCIe topology, and device-to-memory data movement
  • Profile and diagnose system-level issues using perf, ftrace, eBPF, and similar tools
  • Ensure system configurations are reproducible and documented across instrument and HPC environments

HPC Compute Platform & Algorithm Infrastructure (co-owned with DevOps)

  • Co-design the HPC compute platform architecture with DevOps — define computational requirements, job flow, and data access patterns while DevOps provisions and manages the infrastructure
  • Define how algorithms are deployed, versioned, and rolled into production on the HPC platform — support safe side-by-side execution of new and existing algorithm versions
  • Design compute allocation strategies that balance real-time instrument processing, batch algorithm development/validation, and historical data reprocessing
  • Design the data handoff between instrument-side processing and HPC/cloud compute: formats, staging, transfer protocols
  • Define storage tiering requirements for the processing pipeline — what data stays hot for active processing, what moves to warm for algorithm development access, and what archives to cold
  • Specify when and how workloads should burst from local HPC to cloud (AWS) based on pipeline load and priority
  • Optimize data movement across high-speed networks (RDMA, InfiniBand, high-speed Ethernet) between instrument, HPC, and storage
  • Design for scalability — the architecture must accommodate increasing instrument throughput, additional instruments, and growing algorithm complexity

Reliability & Observability

  • Instrument every pipeline stage with metrics, logging, and tracing
  • Build real-time dashboards showing pipeline health, throughput, latency, and queue state
  • Design automated recovery mechanisms for common failure modes
  • Implement data integrity checks and validation at pipeline stage boundaries
  • Support root-cause analysis and post-mortem investigation for pipeline incidents
  • Establish runbooks and operational procedures for pipeline operations

Qualifications

Education:

BS/MS in Computer Science, Electrical Engineering, or related field.

PhD preferred.

Required:

Experience & Technical Leadership

  • 12 years of professional software engineering experience in performance-critical systems

  • Track record of architecting and delivering complex, multi-stage data processing pipelines
  • Demonstrated technical leadership — ability to drive architecture decisions and mentor engineers
  • Experience operating systems at industrial-grade reliability and throughput requirements
Systems Programming & GPU Computing

  • Expert-level C/C++ and systems programming on Linux
  • Solid experience with CUDA programming and GPU pipeline development (required)
  • Strong understanding of computer architecture: CPU caches, NUMA, memory hierarchies, PCIe, DMA
  • Experience with Python for tooling, orchestration, and pipeline glue
  • Experience with performance profiling and diagnostics tools (perf, ftrace, Nsight, or similar)

Pipeline & Orchestration

  • Experience designing multi-stage data pipelines with flow control, buffering, and backpressure management
  • Strong understanding of error handling, retry strategies, and fault recovery in performance-critical systems
  • Experience with job scheduling and work distribution across heterogeneous compute resources
  • Familiarity with workflow orchestration frameworks (Airflow, Celery, custom solutions, or similar) is a plus

Signal Processing & Algorithms

  • Practical experience implementing DSP or image processing algorithms in production systems
  • Familiarity with frequency-domain analysis, filtering, and detection algorithms
  • Ability to reason about numerical accuracy and throughput tradeoffs

Data Movement, Storage & Networking

  • Experience optimizing data transfer across high-speed networks (RDMA, InfiniBand, high-speed Ethernet)
  • Understanding of shared storage architectures, tiered storage strategies, and high-throughput data staging
  • Experience defining compute platform requirements and collaborating effectively with infrastructure teams
  • Familiarity with algorithm deployment and versioning in production compute environments

Preferred:

  • Experience with high-throughput diagnostic, imaging, or scientific instrument data pipelines
  • Experience scaling a data pipeline through multiple hardware or throughput generations
  • Experience with GPUDirect RDMA or other hardware offload technologies
  • Familiarity with real-time or low-latency Linux variants
  • Background in scientific computing, computational physics, or bioinformatics
  • Experience designing systems that span embedded instrument software and datacenter infrastructure

What Success Looks Like

  • The end-to-end pipeline from image acquisition to processed output runs continuously and reliably at target throughput
  • Backpressure and error handling work transparently — operators are not firefighting pipeline stalls
  • Job orchestration seamlessly distributes work across local and cloud compute based on load and priority
  • Pipeline performance is predictable, measurable, and well understood with clear per-stage metrics
  • New instrument generations with higher data rates can be accommodated through evolution, not redesign
  • Adding instruments to the lab scales the pipeline without disproportionate complexity or operational burden
  • Algorithm developers can deploy, test, and validate new algorithms on the HPC platform without disrupting production processing
  • Storage tiering keeps the right data accessible at the right cost as volumes grow

