Research Intern, Agent RL Training

NewsBreak
Mountain View, CA (Intern)
POSTED ON 5/28/2026
AVAILABLE BEFORE 7/28/2026

About NewsBreak

Founded in 2015, NewsBreak is the Content Intelligence platform shaping the future content economy. With over 40 million monthly active users, our flagship platform delivers highly personalized local news and information powered by advanced AI, recommendation systems, and adtech.

Recognized by Fast Company as #32 on the Top Workplaces for Innovators, we're proud to be Great Place to Work® certified and home to a dynamic team of technologists, product innovators, and business leaders who are passionate about solving meaningful challenges at scale.

Together, we reached unicorn status in 2021, and we remain committed to continuing this high-growth trajectory with the right team to fulfill our mission: building the infrastructure layer for content intelligence.

If you’re inspired to dream big, innovate fast, and make a difference, we’d love to hear from you! For more information, visit www.newsbreak.com/about

About the Role

We are looking for a Research Intern to join our Agent RL Training team. You will be paired with a full-time employee as your mentor, working together to explore, from zero to one, how to apply large language models to NewsBreak’s core business, including content understanding, recommendation, agentic web browsing, and autonomous multi-step task completion.

This is a hands-on research role. You are expected to independently drive experiments, propose novel ideas, and iterate quickly. We value self-starters with deep intellectual curiosity and the drive to push boundaries in LLM post-training and agent capabilities.

Location: Onsite at our Mountain View, CA office

What You’ll Work On

  • Collaborate with your full-time mentor to identify high-impact research directions for applying LLMs to NewsBreak’s products
  • Independently run end-to-end SFT experiments on LLM-based agents, and assist with RL-related exploration such as reward design and training iteration
  • Curate and build high-quality training datasets: instruction-following, preference pairs, agent trajectories, and synthetic data
  • Contribute to public publications; we encourage and support top-venue submissions during your internship

What We’re Looking For

Requirements

  • Highly motivated and committed: willing to put in extra hours when needed to push projects across the finish line
  • Genuine passion for research: you read papers for fun, tinker with models on weekends, and care deeply about advancing the field
  • Able to independently run end-to-end model SFT, with a basic understanding of RL-based post-training methods (RLHF, DPO, PPO, GRPO, etc.)
  • Excellent taste in model behavior: able to reason about what “good” looks like across user-facing domains and articulate why
  • Strong Python and PyTorch skills

Preferred Qualifications

  • Publication at a top-tier venue (NeurIPS, ICML, ICLR, ACL, EMNLP, or equivalent)
  • Experience with multi-node distributed training (FSDP, DeepSpeed, Megatron-LM)
  • Proficiency in writing custom GPU kernels with Triton or CUDA
  • Experience building synthetic data pipelines for agent training
  • Familiarity with open-source RL frameworks: TRL, OpenRLHF, veRL/vLLM

Hourly Pay: $35 - $50

The US base pay range for this full-time position is listed below. Pay may vary based on a number of factors, including job-related skills, level, experience, geographic location, and relevant education or training. At NewsBreak, we design our overall rewards package to attract top talent. Depending on the position, the role may also be eligible for a discretionary bonus and options. Your recruiter can share more details during the hiring process.

Hourly Base Pay Range

$35 - $50 USD per hour





