What are the responsibilities and job description for the GPU Software Engineer/GPU Architect position at Triune Infomatics Inc?
Role: GPU Software Engineer/GPU Architect
Location: San Jose, CA
Duration: Long-term, ongoing contract
Overview: We're looking for a strong GPU Software Engineer/GPU Architect to join a high-impact engineering team working on next-generation AI, GPU, and semiconductor technologies. This role focuses on GPU kernel development, memory architecture, and integration with modern inference systems such as vLLM and SGLang. You'll work onsite in San Jose, collaborating closely with a team of engineers building high-performance GPU-accelerated systems.
Responsibilities:
- Develop and optimize CUDA/ROCm kernels for AI workloads
- Work with HBM, memory hierarchy, thread scheduling, and P2P communication
- Integrate GPU kernels with vLLM, SGLang, and other inference servers
- Build high-performance components in C and Python
- Support AI frameworks such as PyTorch and TensorFlow
- Optimize multi-GPU scaling, KV-cache, and attention kernels
- Profile and debug GPU workloads using Nsight, rocprof, etc.
- Collaborate with cross-functional GPU, AI, and semiconductor teams
Requirements:
- Strong experience with CUDA, ROCm/HIP, OpenCL, or MPI
- Deep understanding of GPU architecture, HBM, memory models, and thread hierarchies
- Hands-on experience with AMD/NVIDIA GPU software stacks
- Expert-level C and Python
- Experience with PyTorch or TensorFlow
- Experience with vLLM, SGLang, or similar inference systems
Preferred:
- RDMA, RoCE, InfiniBand, or Infinity Fabric
- Distributed inference/training or HPC experience
- Semiconductor or hardware-adjacent experience
Salary : $100