What are the responsibilities and job description for the Sr. AI Runtime Engineer position at I Machines, Inc.?
About Our Company
We’re a fast-paced, fabless semiconductor startup redefining the boundaries of AI through cutting-edge, AI-infused compute architecture. Our mission is to deliver scalable, efficient, and intelligent silicon solutions for the next generation of edge AI, robotics, autonomous systems, and mobile devices. Our leadership team brings together decades of experience in semiconductor innovation, spanning chip architecture, system design, and global business operations. The team includes pioneers behind several generations of groundbreaking compute architectures and experts in software-hardware co-design, SoC, and AI.
Job Description:
We are seeking experienced candidates for the position of Senior AI Runtime Engineer to ensure efficient, fast, and reliable execution of AI models in production environments on modern RISC-V processors with advanced AI features.
Key Responsibilities:
- Runtime Development: Architect, develop, and maintain the core runtime infrastructure that powers distributed inference for large-scale AI workloads
- Model Optimization: Implement advanced inference techniques such as quantization, pruning, parallelism (data/tensor/expert), and KV cache management
- High-Performance Inference Engines: Architect and build inference runtimes that execute models efficiently on specific hardware, including NPUs and CPUs
- Hardware-Software Co-design: Work closely with AI researchers and hardware teams to co-design and optimize model architectures for specific AI hardware
- Production Deployment: Build robust pipelines for deploying models into production environments, and develop tools for performance profiling, benchmarking, and monitoring of production-grade systems
Required Qualifications & Skills:
- BS in Computer Science (MS or PhD preferred)
- At least 5 years of experience in AI modeling, deploying production-ready AI applications, and managing data pipelines
- Deep understanding of the entire inference workflow and of AI frameworks and their underlying architectures (PyTorch, ONNX, JAX, TensorFlow, llama.cpp, IREE)
- Expert-level coding skills in Python and C/C++
- Experience with AI compilers, especially an in-depth understanding of end-to-end deployment using the IREE compiler and IREE runtime
- Deep understanding of how to build, scale, and evaluate AI systems, including handling probabilistic outputs and quantifying model success
- Understanding of CPU architecture and/or parallel computer architecture, plus a basic understanding of the RISC-V ISA
- Strong critical thinking, communication, and collaboration skills
- Demonstrated ability to analyze and troubleshoot issues in production testing and to drive improvements that increase test suite and engineering efficiency and effectiveness
- Ability to learn new technologies and apply the knowledge quickly
- Ability to meet project milestones and deadlines
Benefits and Perks
At I Machines, Inc., we offer competitive salaries and a comprehensive benefits package, including:
- Health, dental, and vision insurance
- Retirement savings plans
- Paid time off and holidays
- Professional development opportunities
- Flexible schedule