
Tech Lead Data Scientist, AI Evaluation & Monitoring

Geisinger
Danville, PA | Full Time
POSTED ON 4/30/2026
AVAILABLE BEFORE 6/30/2026

What You Will Own: 

  • The technical evaluation methodology applied to AI programs across the enterprise: pre-production validation, production monitoring, and ongoing optimization
  • Hands-on guidance to program teams as they design validation studies, equity audits, monitoring plans, and escalation playbooks for their AI systems 
  • Instrumentation of production monitoring: translating program-specific failure modes into concrete, measurable metrics 
  • The evaluation toolkit: LLM-as-Judge frameworks, golden sets, simulation harnesses, experimental study designs, drift detection, subgroup fairness analysis 
  • Reusable evaluation playbooks and templates that let each new program move faster than the last 
  • Technical direction, design review, and mentorship for a team of data analysts supporting the evaluation function 

What You Will Not Own: 

  • People management, HR administration, or formal performance evaluations for the analyst team (those sit with the analysts' line manager; the Tech Lead provides technical input) 
  • Program-level product strategy or go/no-go decisions 
  • Final clinical validation judgment on whether a given AI is safe for a given clinical use 
  • The software infrastructure behind the evaluation and monitoring tooling (built by the AI Platform team — the Tech Lead defines what's measured and how; Platform builds the backend) 

Shape of the Work:

This role operates at three altitudes at once:

With program teams (hands-on advisory). Partner with program owners early, before evaluations are designed, to shape study approach, sample size, stratification, gold-standard definition, and decision thresholds. Translate ambiguous failure modes into concrete, defensible evaluation designs. Coach teams through the technical work so that what arrives at governance review is rigorous, not performative. 

With the evaluation toolkit (hands-on build). Design and operate the reusable assets that let evaluation scale: LLM-as-Judge rubrics and calibration methods, golden sets, simulation harnesses, A/B and shadow-mode study templates, subgroup fairness analyses, and drift monitors. Keep a pragmatic eye on what actually works in a clinical environment versus what works in a paper. 

With the analyst team (technical leadership). Set technical direction, assign work across active evaluations, review analysis code and study designs, and raise the technical bar. Mentor analysts on methodology, statistical rigor, and the domain knowledge that makes evaluation credible. Grow them from execution into independent evaluation design. 

Methods You'll Use: 

  • Experimental and quasi-experimental design for production AI systems 
  • LLM and generative AI evaluation: golden sets, judge-based evaluation, hallucination and grounding checks 
  • Fairness and equity evaluation across patient and stakeholder subgroups 
  • Production monitoring design: drift detection, performance decay, adoption, and outcome metrics 
  • Causal inference methods appropriate to healthcare settings where full RCTs are often impractical 
  • Simulation and adversarial testing for pre-production stress testing 
  • Python, SQL, modern ML and evaluation tooling, cloud-native data platforms 

Work is typically performed in an office or remote environment. The role is accountable for satisfying all job-specific obligations and complying with all organization policies and procedures. The specific statements in this profile are not intended to be all-inclusive. They represent typical elements considered necessary to successfully perform the job.

*Relevant experience may be a combination of related work experience and degree obtained (Master's Degree = 2 years; PhD = 4 years).

Qualifications:

Required Skills & Qualifications: 

  • 6 years of experience in data science, statistics, ML engineering, or applied quantitative research, with demonstrated experience as the senior technical voice on cross-functional projects

  • Strong foundation in experimental design and causal inference — and judgment about which method fits which situation 

  • Hands-on experience designing and running model evaluation studies in real production settings 

  • Experience evaluating LLM or generative AI systems, or comparable experience evaluating complex ML systems where ground truth is messy 

  • Proven ability to translate ambiguous failure modes into concrete, defensible evaluation designs and monitoring metrics 

  • Strong fluency in Python and SQL; working comfort with modern ML tooling and cloud-native data environments 

  • Experience with fairness and equity evaluation for ML systems 

  • Track record of providing technical leadership and mentorship without formal people-management authority 

  • Clear written communication — the role produces evaluation memos and specifications that non-technical decision-makers rely on 

  • Healthcare, clinical, or regulated-industry experience strongly preferred 

  • MS or PhD in a quantitative field preferred; equivalent experience accepted

Salary.com Estimation for Tech Lead Data Scientist, AI Evaluation & Monitoring in Danville, PA
$91,390 to $111,487


