What are the responsibilities and job description for the Research Scientist / Engineer — Foundation Model (Image / Video) position at Luma?

About Luma AI

Luma’s mission is to build multimodal AGI. Our research on video, 3D, and now multimodal models has led us to believe that AI needs to be jointly trained over all signal modalities (text, video, audio, images), analogous to the human brain.

To advance our mission, we build and operate the full stack end-to-end, spanning foundation models, inference systems, and products. This integrated approach powers technologies like Ray3, which is seeing rapidly growing adoption among Fortune 500 companies across media, entertainment, and advertising. Backed by a recent $900M Series C and our partnership with Humain to build a 2 GW compute supercluster (Project Halo), our models and the Dream Machine platform are now enabling creatives worldwide to tell some of the most impactful stories of our time.

Where You Come In

This is a rare and foundational opportunity to define the future of multimodal AI. You will be at the forefront of building and training large-scale multimodal models, directly impacting how users interact with pixels. This role offers the chance to bridge cutting-edge research with magical, shipped products, working end-to-end on novel problems with no existing playbook.

What You'll Do

This opportunity involves both the “science” and “engineering” parts of research, which are equally important.

This is a multi-stack opportunity where you will work at the intersection of modeling, data, systems, and evaluation.

Modeling: Architect large-scale multimodal models with a focus on pixel data. Solve challenges in both understanding and generating pixels.
Data: Hill-climb on existing tasks and formulate new tasks through data. Design, implement, and run robust data pipelines for constructing, enriching, and filtering massive pixel datasets.
Systems: Train large-scale multimodal models on massive datasets and GPU clusters.
Evaluation: Define and build novel evaluation frameworks to measure pixel intelligence, from understanding (recognition, grounding, spatial-temporal perception...) to generation (realism, consistency, controllability, and human-aligned creative quality).

Who You Are

Strong foundation in machine learning and foundation models, with experience in image, video, or multimodal domains.
Deep understanding of autoregressive, diffusion/flow-based, or hybrid/unified models.
Hands-on experience with PyTorch and large-scale training (distributed, mixed precision, large datasets).

What Sets You Apart (Bonus Points)

Experience with data, modeling, or evaluation for either of the following:

State-of-the-art foundation models in understanding pixels
State-of-the-art foundation models in generating pixels

Your application is reviewed by real people.
