What are the responsibilities and job description for the AI Engineer – Voice & Conversational Systems (W2 Role) position at GBIT (Global Bridge InfoTech Inc)?
Role: AI Engineer – Voice & Conversational Systems
Location: Plano, TX - Hybrid
Position Overview:
We are seeking an experienced AI Engineer to design, build, and deploy next-generation conversational AI and real-time voice agents. In this role, you will bridge the gap between advanced Large Language Models (LLMs) and real-world telecommunication systems. You will be responsible for building ultra-low-latency voice pipelines, integrating interactive voice response (IVR) systems, implementing robust agent tool-calling frameworks via MCP, and ensuring system safety through rigorous evaluation and guardrails.
Key Responsibilities:
Voice Agent Development: Design, optimize, and deploy end-to-end voice agents and real-time conversational pipelines, ensuring minimal latency and high contextual accuracy.
IVR & Telephony Integration: Connect AI voice agents seamlessly with Contact Center IVR systems to automate customer interactions.
Context & Tool Orchestration: Utilize MCP (Model Context Protocol) and FastMCP frameworks to give AI models structured access to secure data sources and enterprise tools.
Model Selection & Optimization: Architect solutions leveraging state-of-the-art LLMs, including OpenAI GPT models and AWS Nova models via AWS Bedrock.
Speech Processing Pipelines: Implement and fine-tune Speech-to-Text (STT) and Text-to-Speech (TTS) pipelines using Deepgram and ElevenLabs.
System Evaluation & Safety: Establish evaluation frameworks (Evals) to measure agent performance and implement Guardrails to ensure deterministic, safe, and compliant model outputs.
Cloud Infrastructure: Build and operate scalable AI microservices using Python, AWS API Gateway, and AWS S3 storage.
Required Technical Skills:
Core AI & Frameworks:
- Strong proficiency in Python and standard AI/ML frameworks.
- Hands-on experience with MCP (Model Context Protocol) and FastMCP for standardizing how context and tools are exposed to models.
Large Language Models (LLMs):
- Experience deploying and prompting *OpenAI GPT models* and AWS Nova models.
- Deep understanding of *AWS Bedrock* and experience orchestrating multi-step workflows with *AWS Strands*.
Voice & Audio Tech:
- Speech-to-Text (STT): Production experience with *Deepgram* or similar real-time streaming audio tools.
- Text-to-Speech (TTS): Experience generating natural, low-latency speech via *ElevenLabs*.
Production & Infrastructure:
- Familiarity with integrating AI pipelines into traditional *Contact Center IVR systems*.
- Experience building robust REST/WebSocket APIs using AWS *API Gateway* and managing data persistence in *S3 Buckets*.
AI Quality & Safety:
- Proven experience building *Evals* to benchmark model accuracy, latency, and hallucination rates.
- Experience configuring *Guardrails* (e.g., input/output filtering, PII masking, safety alignment).
Preferred Qualifications:
- Background in computational linguistics, audio signal processing, or real-time streaming protocols (WebSockets, WebRTC).
- Experience tuning prompts specifically for voice/conversational contexts (where brevity and conversational pacing matter).
- Familiarity with agile software development and CI/CD pipelines for AI workloads.