What are the responsibilities and job description for the AI Infrastructure & Experience Engineer position at OSI Engineering, Inc.?
A global consumer device company based in Mountain View, CA is looking for an AI Infrastructure & Experience Engineer to join their team!
Key Responsibilities:
- Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.
- Leverage deep knowledge of the CUDA environment to build custom kernels that maximize utilization of low-cost GPU compute.
- Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.
- Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.
- Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.
Qualifications:
- 3 years of relevant industry experience required
- Recent experience in model optimization required
- Proven experience with NVIDIA ecosystems and ARM64 architecture.
- Advanced proficiency in C++, Python, and Rust. Deep familiarity with CUDA and the ability to author/debug custom CUDA kernels for compute-intensive tasks.
- Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).
- Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.
- Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.
- Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.
- Degree in Computer Science, or a specialization in Machine Learning or Artificial Intelligence, preferred but not required.
Type: Contract
Duration: 4 months with extension
Work Location: Mountain View, CA (onsite)
Pay range: $64.00 - $79.00 (DOE)