Engineering Manager, Evaluation Platform

ChatGPT Jobs
Austin, TX Full Time
POSTED ON 6/2/2026
AVAILABLE BEFORE 7/1/2026
Job Description

Location: Austin, TX

  • Hybrid (2 days per week on-site in the Austin office)

Company: Procore (Construction Intelligence organization)

Reports to: Sr Director, Procore AI Engineering

Machine Learning & Artificial Intelligence

Job Summary

Build infrastructure and tooling to measure, benchmark, and improve the quality of AI agents (Search Agent, RFI Create Agent, Invoice Agent, etc.). Own the end-to-end evaluation lifecycle: defining quality metrics, building evaluation frameworks, and delivering interfaces that surface actionable insights.

What You'll Do

  • Lead and grow a team of engineers focused on evaluation infrastructure, quality measurement, and developer tooling for AI agents.
  • Define the technical vision and roadmap for the Evaluation Platform, covering both offline and online evaluations.
  • Partner with AI/ML, Product, and Agent teams to define quality metrics (relevance, accuracy, latency, safety, user satisfaction, token usage) and build automated pipelines.
  • Design and deliver user-facing evaluation tools for assessing agent output quality, comparing model versions, and identifying regressions.
  • Build frameworks for human-in-the-loop evaluation (annotation workflows, rating interfaces, inter-rater reliability).
  • Establish CI/CD quality gates for agent version releases.
  • Drive engineering excellence (code quality, system reliability, test coverage, on-call health, technical debt management).
  • Recruit, mentor, and develop engineers, fostering a culture of ownership and rigorous experimentation.

What We're Looking For

  • 5 years of experience managing engineering teams or serving as a technical lead, with 7 years total in software engineering.
  • Experience building evaluation, quality measurement, or observability platforms for LLM-based or agentic systems (RAG pipelines, multi-step agents, tool-use agents).
  • Strong understanding of evaluation methodologies (precision/recall, LLM-as-judge, human annotation, A/B testing, statistical significance).
  • Proven ability to translate ambiguous problem spaces into clear technical strategies and executable roadmaps.
  • Hands-on technical depth in backend systems, data pipelines, or distributed infrastructure (Python, Go, or similar).
  • Familiarity with evaluation frameworks such as RAGAS, DeepEval, LangFuse, or custom eval harnesses.
  • Background in search relevance (NDCG, MRR) or information retrieval quality systems.
  • Experience with construction-tech, procurement, or enterprise B2B SaaS domains (preferred).

Compensation & Benefits

Base Pay Range: $168,560.00 - $231,770.00 USD Annual

Eligible for Equity Compensation and/or Bonus Incentive Compensation. Actual compensation based on job-related skills, experience, education/training, and location.

For Los Angeles County (unincorporated) Candidates: Procore will consider for employment all qualified applicants, including those with arrest or conviction records, in accordance with applicable laws.
