Research Scientist / Engineer - Efficient Modeling

Gigascale Capital
Palo Alto, CA Full Time
POSTED ON 5/25/2026
AVAILABLE BEFORE 6/24/2026
Location: Palo Alto
Employment Type: Full time
Department: Research

At Rhoda AI, we’re building the next generation of generalist intelligent robots. We own the full robotics stack, from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design. We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.

We're looking for a Research Scientist or Research Engineer focused on model efficiency — making our foundation world models faster, smaller, and more deployable without sacrificing capability. This work is critical to closing the gap between research-scale models and real-time operation on robot hardware.

What You'll Do

  • Research and implement model compression techniques: quantization, pruning, structured sparsity, distillation, and low-rank approximation
  • Design efficient architectures and attention mechanisms suited to real-time inference on edge and robot hardware
  • Develop training strategies that produce better accuracy-efficiency tradeoffs from the start
  • Profile and benchmark models across hardware targets to identify and resolve efficiency bottlenecks
  • Build evaluation frameworks that measure capability retention after compression or architecture changes
  • Collaborate with training systems and deployment teams to ensure efficient models translate to faster real-world inference
  • Publish and present work at top-tier venues (especially valued for the Research Scientist track)

What We're Looking For

  • Strong understanding of model compression and efficient architectures for large models
  • Hands-on experience with quantization, distillation, or pruning applied to transformers or large neural networks
  • Deep knowledge of where efficiency gains are possible in modern architectures
  • Proficiency with PyTorch and familiarity with hardware-aware optimization (CUDA, TensorRT, or similar)
  • Ability to run principled experiments that characterize capability-efficiency tradeoffs

Nice To Have (But Not Required)

  • PhD in ML, CS, or a related field — or equivalent research/engineering experience
  • Publication record at NeurIPS, ICML, ICLR, MLSys, or related venues
  • Experience with efficient video or multimodal model architectures
  • Familiarity with edge deployment targets (Jetson, custom ASICs, or mobile hardware)
  • Prior work on speculative decoding, early exit, or adaptive compute
  • Experience deploying compressed models on physical robots or latency-constrained systems

Why This Role

  • Bridge the gap between large-scale research models and real-time robot deployments
  • Your work determines whether frontier capabilities actually run on our hardware
  • High leverage: efficiency improvements benefit every model the team trains and deploys
  • Work at a rare intersection of deep learning research and systems

Salary.com Estimation for Research Scientist / Engineer - Efficient Modeling in Palo Alto, CA
$137,993 to $172,582