What are the responsibilities and job description for the Lead AI QA position at JSG (Johnson Service Group, Inc.)?
Our client is seeking a senior-level AI Quality Assurance Leader who specializes in Generative AI, LLM systems, and AI agents. The role is responsible for defining and driving the end-to-end quality strategy for scalable and responsible AI deployments. (Remote, USA)
Must Have Skills
- Generative AI (GenAI) system testing – 7 years of overall QA experience, including hands-on GenAI validation
- Large Language Models (LLMs) – 5 years of experience validating LLM outputs for accuracy, safety, and bias
- Retrieval-Augmented Generation (RAG) systems – 5 years of experience testing pipeline performance and retrieval quality
- Azure AI Foundry – 3 years of experience testing Azure-based AI solutions
- LangGraph – 3 years of experience validating orchestration and multi-agent workflows
- Test automation frameworks for AI systems – 5 years of experience building automated validation and evaluation pipelines
- LangSmith evaluation workflows – 3 years of experience in LLM evaluation and monitoring
- Multi-agent AI architectures – 3 years of experience testing agent coordination and decision logic
- Responsible AI practices – 5 years of experience in bias detection, safety validation, and compliance testing
- Performance and load testing for AI systems – 5 years of experience validating the scalability of AI services
- CI/CD integration for AI pipelines – 5 years of experience embedding QA into deployment workflows
- AI evaluation metrics design (BLEU, ROUGE, custom scoring, etc.) – 3 years of experience defining quality benchmarks
Responsibilities
- Define and lead QA strategy for Generative AI pipelines, RAG systems, and multi-agent workflows
- Validate LLM outputs for accuracy, safety, bias, and performance across environments
- Oversee quality assurance of Azure AI Foundry-based AI solutions
- Ensure quality across LangGraph orchestration and LangSmith evaluation workflows
- Establish automated testing frameworks and AI-specific evaluation metrics
- Lead QA teams to ensure scalable, reliable, and responsible AI deployments