QA Automation

Jobs via Dice
San Francisco, CA Full Time
POSTED ON 5/25/2026
AVAILABLE BEFORE 6/23/2026
Description

## **LLM Evaluation Analyst**

### About the Role

Must have 3-4 years of Playwright experience and a strong understanding of LLMs.

We are seeking **3 Evaluation Analysts** to assess the performance of AI models tasked with implementing web features. Your work directly informs whether AI-generated code is correct, whether the instructions given to the models are clear, and whether the testing frameworks used to evaluate them are fair and reliable.

### Core Responsibilities

You will analyze the quality of the entire evaluation pipeline, from the instructions given to the AI through to the final score, across the following categories:

  • **Model Capability** - Assess how well the AI performed each task by reviewing generated transcripts and results.
  • **Bug Discovery Value** - Identify patterns in AI code failures to understand why the model is making specific mistakes.
  • **Score Health** - Ensure tests are properly configured to apply correct scores during evaluation.
  • **Task Specification Quality** - Verify that prompts given to the AI are clear, correct, and technically precise (e.g., consistent variable names).
  • **Test-by-Test Analysis** - Evaluate the quality of automated tests using metrics like precision and recall to ensure they accurately measure AI performance.
  • **Platform Issues** - Report bugs or problems within the evaluation system itself, particularly with automated browser testing services.
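
As an illustration of the precision and recall metrics mentioned under Test-by-Test Analysis, here is a minimal TypeScript sketch. It is not taken from the employer's tooling: the `Verdict` shape, the `scoreTest` function, and the convention that "test passed" is the positive prediction while a human reviewer's judgment is the ground truth are all assumptions made for the example.

```typescript
// Hypothetical record: one AI submission, judged by a human reviewer
// and by a single automated test. Field names are illustrative only.
interface Verdict {
  humanSaysCorrect: boolean; // assumed ground-truth label from a reviewer
  testPassed: boolean;       // outcome of the automated test
}

interface TestQuality {
  precision: number | null; // null when the test never passed anything
  recall: number | null;    // null when no submission was actually correct
}

// Assumed convention: "test passed" is the positive prediction,
// "human says correct" is the positive label.
function scoreTest(verdicts: Verdict[]): TestQuality {
  let tp = 0; // test passed and the submission was correct
  let fp = 0; // test passed but the submission was wrong
  let fn = 0; // test failed but the submission was correct
  for (const v of verdicts) {
    if (v.testPassed && v.humanSaysCorrect) tp++;
    else if (v.testPassed && !v.humanSaysCorrect) fp++;
    else if (!v.testPassed && v.humanSaysCorrect) fn++;
  }
  return {
    precision: tp + fp === 0 ? null : tp / (tp + fp),
    recall: tp + fn === 0 ? null : tp / (tp + fn),
  };
}

// Made-up sample: 2 true positives, 1 false positive, 1 false negative.
const sample: Verdict[] = [
  { humanSaysCorrect: true, testPassed: true },
  { humanSaysCorrect: true, testPassed: true },
  { humanSaysCorrect: false, testPassed: true },
  { humanSaysCorrect: true, testPassed: false },
  { humanSaysCorrect: false, testPassed: false },
];

console.log(scoreTest(sample));
```

Under this convention, low precision suggests a test that is too lenient (it passes incorrect code), while low recall suggests a test that is too strict or brittle (it fails correct code).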

### Required Qualifications

  • Expertise in finding patterns and issues in generative AI/LLM outputs
  • Direct experience with labeling and scoring frameworks
  • Experience writing Playwright tests
  • Strong analytical and problem-solving skills
  • Ability to interpret model behavior and articulate failure modes clearly

### Preferred Qualifications

  • Advanced analytical credentials - highly relevant for interpreting model behavior
  • Familiarity with web development concepts (HTML, CSS, JavaScript)
  • Experience with automated testing and grading systems
  • Background in quality assurance or evaluation methodology

Skills

Playwright

Top Skills Details

Playwright

Experience Level

Intermediate Level

Job Type & Location

This is a Contract position based out of San Francisco, CA.

Pay And Benefits

The pay range for this position is $35.00 - $50.00/hr.

Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:

  • Medical, dental & vision
  • Critical Illness, Accident, and Hospital
  • 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
  • Life Insurance (Voluntary Life & AD&D for the employee and dependents)
  • Short and long-term disability
  • Health Spending Account (HSA)
  • Transportation benefits
  • Employee Assistance Program
  • Time Off/Leave (PTO, Vacation or Sick Leave)

Workplace Type

This is a fully remote position.

Application Deadline

This position is anticipated to close on May 29, 2026.

About TEKsystems

We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.

About TEKsystems And TEKsystems Global Services

We're a leading provider of business and technology services. We accelerate business transformation for our customers. Our expertise in strategy, design, execution and operations unlocks business value through a range of solutions. We're a team of 80,000 strong, working with over 6,000 customers, including 80% of the Fortune 500 across North America, Europe and Asia, who partner with us for our scale, full-stack capabilities and speed. We're strategic thinkers, hands-on collaborators, helping customers capitalize on change and master the momentum of technology. We're building tomorrow by delivering business outcomes and making positive impacts in our global communities. TEKsystems and TEKsystems Global Services are Allegis Group companies. Learn more at TEKsystems.com.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.

San Francisco Fair Chance Ordinance: Pursuant to the San Francisco Fair Chance Ordinance, for all positions located in the city and county of San Francisco, we will consider for employment qualified applicants with arrest and conviction records.

Massachusetts Lie Detector: It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

Use of Artificial Intelligence (AI): We may use Artificial Intelligence (AI) to support parts of our hiring process, including sourcing, screening, and evaluating candidates. AI helps assess applications and qualifications, but final decisions are made by our hiring team. By applying, you acknowledge and agree that your application may be reviewed using AI tools.

Salary: $35 - $50/hr


