Job Description: Senior Agentic AI Engineer
Job Title: Senior Agentic AI Engineer
Company Information
Company: Worth AI
Location: Miami, FL (Remote option available)
Work Model: Remote / Hybrid (Note: All remote hires must travel to Orlando, Florida at least twice per year for Town Halls and orientation).
Job Description
Worth AI is hiring a Senior Agentic AI Engineer to design and ship production agent systems that automate KYB (Know Your Business), underwriting, and risk decisions on regulated financial data. You will own agents end to end, from architecture, retrieval, and tools through evals and production deployment, partnering closely with the Chief AI Officer, applied scientists, and platform teams.
Key Responsibilities
- Design and ship multi-step agentic systems (planner/executor, tool-using, multi-agent, human-in-the-loop) for onboarding, underwriting, case review, and continuous monitoring.
- Architect agent graphs in LangGraph (or comparable frameworks like CrewAI, AutoGen, Claude Agent SDK) with explicit state, durable execution, retries, and safe fallbacks.
- Build the retrieval layer powering agents, including chunking, hybrid search, reranking, and grounded citation.
- Own the evaluation stack: golden sets, offline regression suites, LLM-as-judge, online A/B and shadow evals, and red-teaming for jailbreaks, prompt injection, and PII leakage.
- Expose agents to production systems via well-typed tools and MCP servers; treat tool surface area as a product.
- Drive production MLOps: deployment, versioning, traffic shaping, cost/latency budgets, tracing, and on-call playbooks for agent incidents.
- Partner with security and compliance so that agents operate within SOC 2, GDPR, CCPA, and fair-lending requirements, with auditability and explainability built in.
- Mentor engineers on agent patterns, prompt hygiene, eval discipline, and LLM failure modes.
Tech Stack
- Languages: Python, Node.js, TypeScript
- Agent/LLM Frameworks: LangGraph, LangChain, Claude Agent SDK, MCP, OpenAI SDK
- Models: Anthropic Claude, OpenAI, open-weight models
- Retrieval & Data: PostgreSQL, pgvector, OpenSearch, Kafka, Redshift, Redis
- Infrastructure: AWS, Kubernetes (EKS), ArgoCD, Terraform
- Evals & Observability: LangSmith, Langfuse, Braintrust-style tooling, DataDog
Requirements
- 5 years of software engineering experience, including 2 years building production LLM or agentic systems (notebooks and demos do not count).
- Hands-on experience with modern agent frameworks; LangGraph strongly preferred. Proven track record of shipping agents that run, fail gracefully, and recover.
- Strong RAG fundamentals: chunking, embeddings, hybrid retrieval, reranking, grounding, and the judgment to know when RAG is not the right answer.
- Real evaluation experience: using golden sets and offline/online evals to make ship/no-ship decisions.
- Production MLOps fluency: experience deploying LLM workloads under real latency, cost, and reliability constraints.
- Strong Python skills; comfortable with TypeScript/Node.js.
- Solid systems engineering instincts: APIs, async patterns, queues, databases, and distributed system failure modes.
- Strong communication skills; ability to thrive in ambiguous, fast-moving environments.
Nice to Have
- Experience in fintech, lending, payments, KYB/KYC, fraud, or AML.
- Experience building MCP servers or other structured tool interfaces for LLMs.
- Background in classical ML (ranking, scoring, calibration).
- Experience designing explainable/auditable AI workflows for regulated environments.
- Open-source contributions to agent frameworks, eval tooling, or retrieval libraries.
- AWS depth (EKS, MSK, RDS, S3, Lambda) and Infrastructure as Code (IaC) with Terraform.
Success Metrics
- Agent Quality: Improvements in task success rate, grounding accuracy, and hallucination reduction.
- Production Reliability: Meeting SLOs for latency (P90/P99), tool-call success, and cost per task.
- Velocity: Rapid transition from prototype to production while maintaining evals and guardrails.
- Risk Posture: Zero material incidents related to prompt injection, PII leakage, or unsafe tool use.
- Force Multiplier: Creation of patterns, tools, and scaffolding adopted by the wider engineering team.
Benefits
- Health Care Plan (Medical, Dental & Vision)
- Retirement Plan (401k, IRA)
- Life Insurance
- Flexible Paid Time Off
- 9 Paid Holidays
- Family Leave
- Remote/Hybrid options (Orlando office perks: Free Food & Snacks, Wellness Resources)