Sr. Technical Account Manager (TAM) at GMI Cloud
About Us
GMI Cloud is a fast-growing AI infrastructure company backed by Headline VC and one of only six cloud providers worldwide to earn NVIDIA’s Reference Platform Cloud Partner designation. We operate eight of our own GPU clusters across the U.S. and Asia, delivering a full spectrum of services from GPU compute to AI model inference APIs. As an NVIDIA Reference Platform Cloud Partner, our infrastructure meets the highest standards for performance, security, and scalability in AI deployments. We empower AI startups and enterprises to “build AI without limits,” providing everything they need to prototype, train, and deploy AI models quickly and reliably.
About this role
We’re seeking a Sr. Technical Account Manager (TAM) with a strong customer-first approach, technical expertise, and a passion for solving complex challenges. You will play a critical role in ensuring customers have an outstanding experience with GPU Cloud by addressing their needs proactively, resolving technical challenges promptly, and advocating for their success. If you thrive in fast-paced environments, excel in building strong customer relationships, and are driven to deliver exceptional service, we’d love to hear from you.
Key Responsibilities
Building Strong Customer Relationships
• Serve as the primary technical contact for customers, addressing inquiries and issues promptly and effectively.
• Advocate for customers within GMI Cloud, ensuring their needs influence product roadmaps and service enhancements.
• Conduct workshops, training sessions, and tailored consultations to help customers maximize GPU Cloud utilization.
Proactive Problem-Solving & Technical Guidance
• Monitor customer environments to identify potential risks and performance bottlenecks, implementing preventative measures.
• Guide customers in designing and optimizing GPU-based system architectures, ensuring performance, scalability, and stability.
• Support cloud migrations by leveraging expertise in high-performance computing, AI/ML workloads, and data processing.
Cloud Optimization & Operational Excellence
• Conduct operational reviews to assess resource utilization, performance improvements, and cost optimization opportunities.
• Collaborate with customers to enhance business continuity, disaster recovery, and system monitoring capabilities.
• Drive continuous improvements, empowering customers to independently maintain and scale their cloud environments.
Required Skills
- AI Infrastructure: Understanding of GPU servers, storage (Ceph, NVMe, NFS), and high-speed networking (InfiniBand, RoCE).
- Kubernetes (K8s): Understanding of container orchestration, scheduling, and networking.
- AI/LLM: Familiarity with large language model training and inference workflows.
- Frameworks: Working knowledge of SGLang, vLLM, Slurm, and Ray (Anyscale) or equivalent distributed computing tools.
- Communication: Clear and confident in technical discussions with customers and internal teams.
Preferred Qualifications
- Certified Kubernetes Administrator (CKA) certification.
- Hands-on experience in HPC, MLOps, or large-scale AI infrastructure environments.
- Experience managing or scaling Ray clusters for distributed inference or data processing.
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
- Prior experience supporting enterprise or hyperscale AI workloads is a plus.