What are the responsibilities and job description for the Senior Compiler Backend Engineer position at OXMIQ Labs?
About OXMIQ Labs
OXMIQ Labs is re‑architecting the GPU stack “from atoms to agents”—building a licensable GPU hardware and software platform for next‑generation AI, graphics, and multimodal workloads. Founded by GPU architect Raja Koduri, OXMIQ develops GPU IP cores and software, not consumer chips, with a software‑first model and IP licensing business.
At the heart of the hardware roadmap is OxCore™, a RISC‑V–based GPU IP core that integrates scalar, vector, and tensor engines in a modular architecture, designed to scale from tiny edge devices to zettascale data‑center deployments via the OxQuilt™ chiplet/SoC builder. OxCore targets near‑ and in‑memory compute, supports nano‑agents, and aims for SIMT/CUDA compatibility and native Python acceleration.
On the software side, OXMIQ is building:
- OXPython – runs Python‑based CUDA applications unmodified on non‑NVIDIA hardware
- Capsule – a GPU container / deployment layer for heterogeneous systems
This role sits right at that hardware–software boundary.
The role
We’re looking for a Senior Compiler Backend Engineer to own the compiler backend for OxCore hardware IP.
You’ll design and implement the lowering pipeline that maps high‑level IR and Python/CUDA‑style workloads—flowing through systems like OXPython and Capsule—onto OxCore’s scalar, vector, tensor, and near‑/in‑memory engines across many OxQuilt configurations.
This is a foundational role: your work will directly shape how developers target OxCore from Python and other high‑level languages and how customers squeeze performance out of their OxCore‑based SoCs.
What you’ll do
Own the OxCore compiler backend
- Design and implement the OxCore codegen backend (likely on top of LLVM/MLIR or similar) from high‑level IR down to OxCore’s instruction set / micro‑ops.
- Define and evolve OxCore‑specific IR dialects, calling conventions, and ABI details across scalar, vector, and tensor engines.
- Implement lowering passes that map Python/CUDA‑like kernels and ML operators to OxCore execution units and memory hierarchy.
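To make the lowering idea concrete, here is a toy sketch of a table‑driven lowering pass. Every op name, engine label, and table entry below is invented purely for illustration; the real backend would be built on LLVM/MLIR rather than a dictionary lookup.

```python
# Illustrative sketch only: a toy lowering pass mapping high-level IR ops
# to hypothetical OxCore engine ops. All names here are invented.
from dataclasses import dataclass

@dataclass
class HighLevelOp:
    name: str          # e.g. "matmul", "relu"
    operands: list

@dataclass
class OxCoreOp:
    engine: str        # "scalar", "vector", or "tensor" (hypothetical)
    opcode: str        # hypothetical target opcode
    operands: list

# Hypothetical mapping from high-level ops to target engines/opcodes.
LOWERING_TABLE = {
    "matmul": ("tensor", "ox.tensor.mma"),
    "add":    ("vector", "ox.vec.add"),
    "relu":   ("vector", "ox.vec.max0"),
    "branch": ("scalar", "ox.scalar.br"),
}

def lower(ops):
    """Lower high-level ops to per-engine OxCore ops (toy version)."""
    return [OxCoreOp(*LOWERING_TABLE[op.name], op.operands) for op in ops]

kernel = [HighLevelOp("matmul", ["A", "B"]), HighLevelOp("relu", ["C"])]
print([(o.engine, o.opcode) for o in lower(kernel)])
# → [('tensor', 'ox.tensor.mma'), ('vector', 'ox.vec.max0')]
```

In a production backend this table becomes pattern‑based instruction selection with legality checks and cost‑guided choices between engines.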
Architect performance‑critical optimizations for OxCore
- Build OxCore‑aware optimization passes:
  - instruction selection and scheduling across heterogeneous units
  - register allocation tuned for OxCore’s register files
  - memory‑access shaping for near‑/in‑memory compute: coalescing, tiling, and locality
  - warp/SIMD‑style utilization given OxCore’s SIMT/CUDA‑compatible execution model
- Develop cost models and auto‑tuning hooks that understand different OxCore/OxQuilt configurations (ratios of compute, memory, and interconnect).
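A configuration‑aware cost model might look like the following roofline‑style toy. The `QuiltConfig` fields and every number are hypothetical, chosen only to show how tile selection can depend on a configuration’s compute/memory ratio.

```python
# Toy cost model for tile-size selection. Config fields and numbers are
# invented to illustrate configuration-aware tuning, not real OxCore data.
from dataclasses import dataclass

@dataclass
class QuiltConfig:
    flops_per_cycle: float   # tensor-engine throughput (hypothetical)
    bytes_per_cycle: float   # memory bandwidth (hypothetical)
    scratchpad_bytes: int    # near-memory scratchpad capacity (hypothetical)

def tile_cost(n, cfg, elem_bytes=4):
    """Estimated cycles for an n x n x n matmul tile (roofline-style)."""
    flops = 2 * n**3
    traffic = 3 * n**2 * elem_bytes   # load A and B, store C, once per tile
    return max(flops / cfg.flops_per_cycle, traffic / cfg.bytes_per_cycle)

def pick_tile(cfg, candidates=(16, 32, 64, 128), elem_bytes=4):
    """Cheapest-per-flop tile among those that fit in the scratchpad."""
    feasible = [n for n in candidates
                if 3 * n**2 * elem_bytes <= cfg.scratchpad_bytes]
    return min(feasible, key=lambda n: tile_cost(n, cfg) / (2 * n**3))

compute_rich = QuiltConfig(flops_per_cycle=512, bytes_per_cycle=64,
                           scratchpad_bytes=256 * 1024)
print(pick_tile(compute_rich))  # → 64 (smallest tile that is compute-bound)
```

The real cost model would fold in scheduling, interconnect, and occupancy effects; the point is that the same pass must re‑tune itself per OxQuilt configuration.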
Integrate tightly with Capsule, OXPython, and runtime
- Collaborate with the OXPython team to ensure Python‑based CUDA workloads lower efficiently onto OxCore, preserving semantics while exploiting OxCore features.
- Work with Capsule/runtime engineers on:
  - kernel launch strategies and stream/queue design
  - heterogeneous dispatch across OxCore and other accelerators
  - profiling hooks and debug interfaces
Partner with hardware and tools teams
- Work closely with OxCore architecture and OxQuilt teams to:
  - capture hardware capabilities and constraints in compiler models
  - co‑design micro‑architectural features that unlock compiler‑driven performance
- Use pre‑silicon models, simulators, and FPGA/emulation platforms to validate correctness and drive performance prior to customer silicon.
Mentor & lead
- Provide technical leadership for OxCore backend architecture, coding standards, and design reviews.
- Mentor other engineers on compiler/backend internals, GPU/accelerator performance, and RISC‑V nuances.
You might be a fit if you
- Have 7+ years of experience in compiler backend / codegen / low‑level performance engineering (title flexible for exceptional candidates).
- Have shipped or led substantial work on compiler backends (LLVM, GCC, MLIR, custom) targeting GPUs or accelerators.
- Understand deeply:
  - SSA/IR design, CFGs, and dataflow analysis
  - instruction selection and scheduling
  - register allocation strategies
  - loop transforms, tiling, and vectorization
- Have strong GPU/accelerator architecture intuition:
  - SIMD/SIMT, warps/wavefronts, occupancy
  - memory hierarchies (local/shared, HBM/DRAM, scratchpad)
  - throughput vs latency trade‑offs
- Are fluent in modern C++ (and/or Rust) for large systems codebases.
- Have meaningful exposure to RISC‑V or other ISA‑level work (writing backends, intrinsics, or hand‑tuned assembly is a plus).
- Know how to profile and optimize: you’ve used tools like Nsight, perf, VTune, ROCm tools, or custom profilers to chase down performance wins.
- Are comfortable operating in pre‑silicon environments (simulators, emulators, performance modeling).
- Enjoy working across boundaries: hardware, compilers, runtimes, and ML/graphics workloads.
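As a flavor of the dataflow fundamentals listed above, here is a minimal backward liveness analysis over a toy three‑block CFG; the block names and variable sets are invented for illustration.

```python
# Minimal backward liveness analysis over a toy CFG.
# Each block maps to (defs, uses, successor names).
def liveness(cfg):
    live_in = {b: set() for b in cfg}
    live_out = {b: set() for b in cfg}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for b, (defs, uses, succs) in cfg.items():
            out = set().union(*(live_in[s] for s in succs)) if succs else set()
            inn = uses | (out - defs)
            if inn != live_in[b] or out != live_out[b]:
                live_in[b], live_out[b] = inn, out
                changed = True
    return live_in, live_out

# entry: x = ...   loop: y = x + y   exit: use y
cfg = {
    "entry": ({"x"}, set(), ["loop"]),
    "loop":  ({"y"}, {"x", "y"}, ["loop", "exit"]),
    "exit":  (set(), {"y"}, []),
}
live_in, live_out = liveness(cfg)
print(sorted(live_in["loop"]))  # → ['x', 'y']
```

This is the textbook fixed‑point iteration; production liveness for register allocation adds SSA form, live ranges, and interference construction on top.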
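The occupancy intuition above can be made concrete with back‑of‑envelope math; the numbers below are generic GPU‑style figures, not OxCore specs.

```python
# Back-of-envelope occupancy math (illustrative generic-GPU numbers).
def max_warps(regs_per_sm, regs_per_thread, threads_per_warp=32,
              warp_limit=64):
    """Warps resident per SM when limited by register pressure."""
    by_regs = regs_per_sm // (regs_per_thread * threads_per_warp)
    return min(by_regs, warp_limit)

# With 65,536 registers per SM, a kernel using 32 regs/thread sustains
# 64 warps, while one using 128 regs/thread drops to 16 warps: a classic
# occupancy cliff that register allocation has to reason about.
print(max_warps(65536, 32), max_warps(65536, 128))  # → 64 16
```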
Nice to have
These are bonuses, not hard requirements:
- Experience with ML compilers / DSLs (e.g., MLIR, TVM, XLA, Triton, Halide, IREE).
- Background in GPU IP or licensable core design flows (ARM, IP providers, or custom accelerators).
- Familiarity with Python‑first or CUDA‑centric toolchains, and porting CUDA workloads to new backends.
- Experience with chiplet / heterogeneous SoC design constraints or HW/SW co‑design.
- Contributions to open‑source compilers or runtimes.
- Prior work in AI, graphics, or multimodal workloads (rendering, path tracing, transformer models, etc.).
What success looks like (first 6–12 months)
- A robust OxCore backend capable of compiling and optimizing a core set of workloads (e.g., representative AI/graphics kernels) against current OxCore models.
- Demonstrated end‑to‑end speedups versus generic GPU backends for selected workloads, using OxCore features (near‑/in‑memory compute, scalar/vector/tensor orchestration).
- Tight integration with OXPython and Capsule, with real pipelines running on partner hardware/accelerators.
- A clear roadmap for expanding OxCore ISA coverage, optimization passes, and OxQuilt‑aware codegen.
How we work
- Small, senior team with high ownership.
- Software‑first, but deeply aligned with hardware IP and customer silicon.
- Pragmatic: we care about performance, reliability, and developer experience over buzzwords.