What are the responsibilities and job description for the CUDA Kernel Engineer position at Pragmatike?
Location: Cambridge, MA (Eastern Time / UTC-4). Relocation package available.
Start date: ASAP
Languages: English (required)
About The Role
Pragmatike is hiring on behalf of a fast-growing AI startup recognized as a Top 10 GenAI company by GTM Capital, founded by MIT CSAIL researchers.
We are searching for a CUDA Kernel Engineer who has hands-on experience developing and optimizing NVIDIA CUDA kernels from scratch. You will work on the GPU performance layer powering large-scale, high-throughput AI systems used by Fortune 500 customers.
This role is ideal for someone who deeply understands NVIDIA GPU architecture, memory hierarchy, warp-level execution, and profiling workflows, not someone coming from generic hardware, FPGA, or non-NVIDIA compute backgrounds. You will directly influence the GPU efficiency, throughput, and scalability of mission-critical AI systems.
What You'll Do
- Design, implement, and optimize custom CUDA kernels for NVIDIA GPUs, with a focus on maximizing occupancy, memory throughput, and warp efficiency.
- Profile GPU workloads using tools such as Nsight Compute, Nsight Systems, nvprof, and CUDA-MEMCHECK.
- Analyze and eliminate performance bottlenecks including warp divergence, uncoalesced memory access, register pressure, and PCIe transfer overhead.
- Improve GPU memory pipelines (global, shared, L2, texture memory) and ensure proper memory coalescing.
- Collaborate closely with AI systems, model acceleration, and backend distributed systems teams.
- Contribute to GPU architecture decisions, kernel libraries, and internal performance-engineering best practices.
- Proven track record building NVIDIA CUDA kernels from scratch, not just calling existing libraries.
- Strong ability to optimize kernels (tiling strategies, occupancy tuning, shared memory design, warp scheduling).
- Deep understanding of CUDA threads, warps, blocks, and grids; the GPU memory hierarchy and memory coalescing; and warp divergence (how to detect, analyze, and mitigate it).
- Experience diagnosing PCIe bottlenecks and optimizing host-device transfers (pinned memory, streams, batching, overlap).
- Familiarity with C++, CUDA runtime APIs, and GPU debugging/profiling tooling.
- Experience with multi-GPU or distributed GPU systems (NCCL, NVLink, MIG).
- Background in GPU acceleration for ML frameworks or HPC workloads.
- Knowledge of model inference optimization (TensorRT, CUDA Graphs, CUTLASS).
- Exposure to compiler-level optimization or PTX/SASS analysis.
- Startup experience or comfort working in fast-moving, ambiguous environments.
- Research pedigree: MIT CSAIL founders recognized for breakthrough AI and systems contributions.
- Customer impact: Deploy AI solutions powering Fortune 500 clients.
- Industry momentum: Lab alumni have led high-value acquisitions (MosaicML to Databricks, Run:AI to NVIDIA, W&B to CoreWeave).
- Funding & growth: Oversubscribed seed round, next funding in 2026.
- Career growth & influence: Lead AI initiatives, optimize pipelines, and directly impact production AI systems at scale.
- Culture & autonomy: Own critical systems while collaborating with world-class engineers.
- Aspirational impact: Solve GPU/AI performance challenges few engineers ever face.
- Competitive salary & equity options
- Sign-on bonus
- Health, Dental, and Vision
- 401k