What are the responsibilities and job description for the Sr. AI Runtime Engineer position at I Machines, Inc.?

About Our Company

We’re a fast-paced, fabless semiconductor startup redefining the boundaries of AI through cutting-edge, scalable AI-infused compute architecture. Our mission is to deliver scalable, efficient, and intelligent silicon solutions for the next generation of edge AI, robotics, autonomous systems, and mobile devices. Our leadership team brings together decades of experience in semiconductor innovation, spanning chip architecture, system design, and global business operations. The team includes pioneers behind several generations of groundbreaking compute architectures, experts in software-hardware co-design, SoC and AI.

Job description:

We are seeking experienced candidates for the position of Senior AI Runtime Engineer to ensure efficient, fast, and reliable execution of AI models in production environments on modern RISC-V processors with advanced AI features.

Key Responsibilities:

Runtime Development: Architect, develop, and maintain the core runtime infrastructure that powers distributed inference for large-scale AI workloads
Model Optimization: Implement advanced inference techniques such as quantization, pruning, model parallelism (data/tensor/expert), and KV cache management
High-Performance Inference Engines: Architect and build inference runtimes that execute models efficiently on specific hardware, including NPUs and CPUs
Hardware-Software Co-design: Work closely with AI researchers and hardware teams to co-design and optimize model architectures for specific AI hardware
Production Deployment: Build robust pipelines for deploying models into production environments, including tools for performance profiling, benchmarking, and monitoring of production-grade systems

Required Qualifications & Skills:

BS in Computer Science (MS or PhD preferred)

At least 5 years of experience in AI modeling, deploying production-ready AI applications, and managing data pipelines
Deep understanding of the entire inference workflow and of AI frameworks and their underlying architecture (PyTorch, ONNX, JAX, TensorFlow, llama.cpp, IREE)
Expert-level coding skills in Python and C/C++
Experience with AI compilers, especially an in-depth understanding of end-to-end deployment using the IREE compiler and IREE runtime
Deep understanding of how to build, scale, and evaluate AI systems, including handling probabilistic outputs and quantifying model success
Understanding of CPU architecture and/or parallel computer architecture, plus a basic understanding of the RISC-V ISA
Strong critical thinking, communication, and collaboration skills
Demonstrated ability to analyze and troubleshoot issues in production testing and to drive improvements that increase test suite and engineering efficiency and effectiveness
Ability to learn new technologies and apply the knowledge quickly
Ability to meet project milestones and deadlines

Benefits and Perks

At I Machines, Inc., we offer competitive salaries and a comprehensive benefits package, including:

Health, dental, and vision insurance
Retirement savings plans
Paid time off and holidays
Professional development opportunities
Flexible schedule
