Senior LLM Evaluation Engineer (Claude / GenAI Platforms), Axiom Global Technologies
Position Overview
We are seeking a Senior LLM Evaluation Engineer to assess, validate, and optimize Claude (Anthropic) and other Generative AI platforms within an enterprise environment. The ideal candidate will have strong software engineering expertise, hands-on experience with Claude APIs and developer tools, and a proven track record of evaluating AI/LLM solutions for scalability, security, and business adoption.
Key Responsibilities
- Evaluate and benchmark Claude for Desktop capabilities, including Cowork, Claude Code, the CLI, APIs, and related developer tools.
- Perform hands-on testing, prototyping, and technical validation of AI/LLM platforms and workflows.
- Analyze integration architectures, API connectivity patterns, security controls, and scalability considerations.
- Identify strengths, limitations, risks, and optimization opportunities across GenAI solutions.
- Develop detailed technical findings, recommendations, and evaluation reports for stakeholders.
- Collaborate with engineering, architecture, and product teams to assess enterprise readiness and implementation strategies.
- Validate performance, reliability, governance, and compliance requirements for AI deployments.
Required Qualifications
- Senior-level software engineering, platform engineering, or application development experience.
- Demonstrated hands-on expertise with Claude (Anthropic) tools, APIs, and related developer workflows.
- Strong experience with API integrations, developer tooling, automation, and CLI-based environments.
- Proven experience evaluating, testing, and implementing AI/LLM platforms in enterprise settings.
- Excellent analytical, troubleshooting, and technical documentation skills.
- Ability to provide actionable recommendations based on technical assessments and proof-of-concept results.
Preferred Qualifications
- Experience with AI developer platforms, coding copilots, agentic AI frameworks, or orchestration tools.
- Familiarity with desktop application integrations and enterprise security/governance controls.
- Knowledge of cloud platforms, enterprise architecture, and AI deployment best practices.
- Experience working with multiple Generative AI models and ecosystems.
Skills
- Claude (Anthropic) APIs & Tooling
- Large Language Models (LLMs)
- Generative AI Platforms
- API Integration
- CLI & Developer Tools
- Enterprise Architecture
- Security & Compliance
- Technical Evaluation & Benchmarking
- Proof of Concept (POC) Development
- AI Platform Assessment
If you are passionate about evaluating cutting-edge AI technologies and driving enterprise AI adoption through technical excellence, we encourage you to apply.