What are the responsibilities and job description for the Software Engineer in Test (SDET), AI Infrastructure position at SB Telecom America Corp.?
About SoftBank: SoftBank is making significant investments in infrastructure for AI. SoftBank Corp. has recently established a new US center in Silicon Valley, focused on infrastructure software for AI and AI foundations for mobile networks. Our goal is to challenge norms and create products that use our state-of-the-art infrastructure (such as Nvidia GB200, MGX, and DGX Grace & Hopper platforms) and cloud-native software. These products are geared toward centralized AI data centers as well as distributed AI Radio Access Network (AI RAN) data centers. We are looking for experienced practitioners who are inspired to innovate and build transformative products.
Minimum Qualifications:
- Bachelor's degree in Computer Science, Electrical Engineering, or related field.
- 3 years of experience in software or hardware engineering, including platforms and distributed systems.
- Experience working with systems and systems software, cloud, and Kubernetes.
- Experience with production testing and automation of Kubernetes deployments.
Preferred Qualifications:
- Master's or similar qualification in a relevant field.
- Experience with scalable test and automation infrastructure to productionize workloads.
- Experience with GPU platforms (e.g., Nvidia DGX, H100) and high-performance computing environments.
- Experience triaging customer bugs, prioritizing, and resolving issues in production.
- Familiarity with AI developer frameworks, tools, and automation systems.
Role: Be a key member of the infrastructure team responsible for building foundational software on top of GPU systems that support AI workloads (training, fine-tuning, and serving). Contribute to developing the test-automation infrastructure for Kubernetes and GPU systems. Innovate in end-to-end systems software testing and automation to increase productionization velocity. As a Software Engineer in Test (SDET), AI Infrastructure, you will work with Staff Engineers, product management, and program management to drive execution toward commercialization.
Responsibilities:
- Contribute to building test-automation infrastructure for Kubernetes on large-scale GPU clusters.
- Help develop detailed test plans for different milestones and operationalize them in test-automation infrastructure.
- Own and conduct end-to-end system, scale, and stress testing.
- Qualify releases together with software leads and the Technical Program Manager.
- Attract and help build downstream production engineering talent.
- Model and foster a culture of humility and innovation in product delivery.
Salary: The base salary for this position ranges from $120,000 to $180,000, with an additional attractive biannual bonus, benefits, and the opportunity to work with a great team in downtown Sunnyvale, CA.