
Research Scientist, RL Training

ChatGPT Jobs
Redwood, CA Full Time
POSTED ON 5/17/2026
AVAILABLE BEFORE 6/15/2026
Job Description

Research Scientist - Reinforcement Learning for LLMs

Location: Redwood City, CA

Work Model: On-site, Remote

About The Role

Snorkel is seeking a Research Scientist to focus on reinforcement learning for training and aligning large language models (LLMs). This foundational research role addresses an open data problem: how to generate reliable training data, reward signals, and procedures that steer LLM behavior.

Main Responsibilities

  • Research and implement reinforcement learning techniques (GRPO, RLHF, RLAIF, DPO, reward modeling) to create data products for LLM training and fine-tuning.
  • Design and build data pipelines for generating high-quality training signals and improving model generalization.
  • Prototype and iterate on end-to-end RL training recipes.
  • Collaborate with research, engineering, and delivery teams to translate RL research into customer-ready data products.
  • Stay current with advancements in LLM training, alignment research, and scalable RL methods.
  • Contribute to research publications and the internal knowledge base.

Preferred Qualifications

  • Deep expertise in reinforcement learning from human or AI feedback, reward modeling, and credit attribution.
  • Experience training or fine-tuning 30B-parameter LLMs at scale, including distributed training infrastructure.
  • Strong proficiency in Python and ML frameworks (PyTorch, HuggingFace) and RL frameworks (Verl, SkyRL).
  • Solid software engineering fundamentals for building reproducible research prototypes.
  • Familiarity with ML infrastructure and cloud platforms (AWS, GCP, Kubernetes, Slurm); experience with large-scale RL training pipelines is a plus.
  • Comfort with high-iteration, open-ended research environments and shifting constraints.
  • Ph.D. in machine learning, reinforcement learning, or a related field, or equivalent industry experience.

Salary Range

$200,000 - $275,000 USD






