What are the responsibilities and job description for the Founding Engineer, ML Performance & Systems position at Stealth?

About Us

We’re an early-stage stealth startup building a new kind of platform for generative media. Our mission is to enable the future of real-time generative applications: we’re building the foundational tools and infrastructure that make entirely new categories of generative experiences and applications finally possible.

We’re a small, focused team of ex-YC and unicorn founders and senior engineers with deep experience across 3D, generative video, developer platforms, and creative tools. We're backed by top-tier investors and top angels, and we're building a new technical foundation purpose-built for the next era of generative media.

We’re operating at the edge of what’s technically possible: high-performance inference and real-time orchestration of multimodal models. As one of our founding engineers, you’ll play a key role in architecting the core platform, shaping system design decisions, and owning critical infrastructure from day one.

If you're excited about architecting and building high-performance infrastructure that empowers the next generation of developers and unlocks entirely new product categories, we’d love to talk.

About The Role

We’re looking for a Founding Engineer, ML Performance & Systems with deep expertise in high-performance ML infrastructure. This is a highly technical, high-impact role focused on squeezing every drop of performance from real-time generative media models.

You’ll work across the model-serving stack, designing novel architectures, optimizing inference performance, and shaping Reactor’s competitive edge in ultra-low-latency, high-throughput environments.

What You’ll Do

Drive our frontier position on real-time model performance for diffusion models
Design and implement a high-performance in-house inference engine
Focus on maximizing throughput and minimizing latency and resource usage
Develop performance monitoring and profiling tools to identify bottlenecks and optimization opportunities

Requirements

About You

Strong foundation in systems programming, with a track record of identifying and resolving bottlenecks
Deep expertise in the ML infrastructure stack: PyTorch, TensorRT, TransformerEngine, and Nsight, plus model compilation, quantization, and advanced serving architectures
Working knowledge of GPU hardware (NVIDIA) and the ability to dive deep into the stack as needed (e.g., writing custom GEMM kernels with CUTLASS)
Proficient in Triton, or willing to learn it and bringing comparable experience in low-level accelerator programming
Excited by the frontier of multi-dimensional model parallelism (e.g., combining tensor, context, and sequence parallelism)
Familiarity with the internals of cutting-edge techniques such as Ring Attention, FlashAttention-3 (FA3), and FusedMLP implementations

Minimum Qualifications

Expertise in systems programming (C++, CUDA)
Experience optimizing ML inference on GPUs
Proficient with PyTorch and tools like TensorRT
Deep understanding of NVIDIA GPU architecture
Familiar with model serving, compilation, and quantization

Benefits

Competitive SF salary and equity

Salary: $160,000 - $200,000
