Applied Reinforcement Learning Engineer

Centific
Redmond, WA Full Time
POSTED ON 5/19/2026
AVAILABLE BEFORE 6/17/2026

Location: Palo Alto, CA or Seattle, WA (Hybrid/Remote)

Salary: $150K – $300K Annually


About Centific

Centific is a frontier AI data foundry that curates diverse, high-quality data, using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with 4,000 AI practitioners and engineers, and an integrated ecosystem of 1.8 million vertical domain experts across 230 markets. Our zero-distance innovation™ solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster.

About the Team

Centific AI Research advances foundational AI models and applications through reinforcement learning, alignment, and human-centered intelligence. We're building governed simulation environments that let enterprises safely iterate and improve AI agent workflows — bridging human-labeled signal creation with automated post-training for high-stakes operations.


The Role

You'll build simulation environments that mirror real enterprise workflows and post-train LLM agents inside them. Your environments, reward functions, and verifiers become the training ground for production agents handling document processing, compliance, customer operations, and multi-step reasoning across regulated industries.

This role sits at the intersection of LLM post-training research and production engineering. You'll translate customer workflows into bespoke environments, design reward signals that hold up under optimization pressure, and ship pipelines that turn human-labeled traces into measurable agent improvements.


What You'll Do

  • Design simulation environments and digital twins for enterprise workflows
  • Post-train LLM agents using the right method for the task — RLHF, DPO, GRPO, PPO, and whatever comes next
  • Build pipelines that turn human-labeled traces and verifiable signals into training data
  • Architect multi-turn, tool-using agents with closed learning loops
  • Design reward functions and verifiers that resist reward hacking and reflect real task outcomes
  • Translate research into production; contribute to publications


Required Qualifications

  • 3 years of experience fine-tuning LLMs, including hands-on RL post-training
  • Experience building or training LLM-based agents — tool use, multi-turn reasoning, trajectory evaluation
  • Strong Python and software engineering skills; comfortable building pipelines, not just notebooks
  • Working knowledge of modern post-training and rollout-serving libraries
  • MS/PhD in CS, ML, or related field, or equivalent industry experience


Preferred Qualifications

  • Publications at NeurIPS, ICML, ICLR, ACL, COLM, or similar venues
  • Open-source contributions to post-training or agent frameworks (TRL, veRL, OpenRLHF, SkyRL, or similar)
  • Background in classical RL
  • Domain experience in healthcare, finance, logistics, or compliance
  • Experience with synthetic data generation, simulation, or world models
  • Distributed training experience


Why Join Centific

  • Lead the frontier. Shape a new discipline at the intersection of post-training, simulation, and enterprise AI
  • Ship your science. See your research power real systems across healthcare, finance, and safety-critical operations
  • Collaborate with leaders. Work alongside NVIDIA, Microsoft, and the global AI community
  • Build what matters. Create governed, compliant AI systems enterprises can actually trust


Learn more about us at centific.com.

Centific is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, citizenship status, age, mental or physical disability, medical condition, sex (including pregnancy), gender identity or expression, sexual orientation, marital status, familial status, veteran status, or any other characteristic protected by applicable law. We consider qualified applicants regardless of criminal histories, consistent with legal requirements.
