Senior Conversational AI / Voice Integration Engineer
OViiE AI Inc. | Orlando, FL (Hybrid/Remote)
About OViiE AI
OViiE AI is building the world’s first language infrastructure layer for global communication. Our flagship platform, STELLA, enables real-time, voice-to-voice translation across hospitality, travel, and enterprise environments, removing language barriers at the moment they matter most.
We are not building a feature.
We are building infrastructure.
With upcoming deployments aligned to FIFA World Cup 2026 host cities, OViiE AI is scaling rapidly and seeking engineers who want to build real-world, high-impact AI systems used by millions globally.
The Role
We are looking for a Senior Conversational AI / Voice Integration Engineer to lead the development of our real-time voice orchestration layer, integrating:
- ElevenLabs conversational agents
- OViiE’s proprietary Virtual Translation Rooms (VTR)
- STELLA’s enterprise SaaS platform
This role is critical in enabling live, multilingual, voice-to-voice communication across multiple participants, systems, and environments.
You will own the end-to-end conversational pipeline—from audio input to translated output to workflow automation.
What You’ll Do
- Architect and build real-time voice pipelines for conversational AI systems
- Integrate ElevenLabs agents (WebSocket-based streaming) with STELLA and VTR
- Design and implement tool-calling / function orchestration to trigger translation and workflows
- Build APIs and services that handle:
  - audio streaming
  - transcript processing
  - translation routing
  - session state management
- Develop multi-participant conversation logic (speaker routing, language detection, turn-taking)
- Connect conversational flows to enterprise systems (PMS, CRM, service workflows)
- Optimize for low latency, high concurrency, and reliability at scale
- Collaborate closely with product, ML, and design teams to deliver seamless user experiences
Qualifications
- 5 years of software engineering experience, with a focus on real-time systems or conversational AI
- Strong experience with:
  - WebSockets / WebRTC / streaming architectures
  - Node.js or Python backend development
  - REST APIs and event-driven systems
- Experience integrating or working with:
  - LLMs or conversational AI frameworks
  - speech systems (ASR/TTS)
  - voice APIs (e.g., ElevenLabs, Twilio)
- Understanding of:
  - multi-turn dialogue systems
  - tool/function calling in LLM environments
  - latency optimization in real-time applications
- Experience with multilingual systems or translation pipelines
- Familiarity with NVIDIA Riva / NeMo / speech AI frameworks
- Experience building multi-user or session-based platforms
- Background in hospitality, travel tech, or real-time communication platforms