What are the responsibilities and job description for the Senior Solution Architect – AI / GPU Cloud position at GMI Cloud?
About GMI Cloud
GMI Cloud is a fast-growing AI infrastructure company backed by Headline VC and one of only six cloud providers worldwide to earn NVIDIA’s prestigious Reference Platform Cloud Partner designation.
We operate hundreds of megawatts of AI-ready data center capacity across North America and a growing AI Factory footprint in Asia, delivering a full spectrum of services from GPU compute to AI model inference APIs. As one of NVIDIA’s six global Reference Platform Cloud Partners, we build infrastructure that meets the highest standards for performance, security, and scalability in AI deployments.
We empower AI startups and enterprises to “build AI without limits,” providing everything they need to prototype, train, and deploy AI models quickly and reliably.
Role Overview
As a Senior Solution Architect, you will be the primary technical interface for our enterprise and hyperscaler accounts.
You will design GPU-cloud and AI infrastructure solutions, lead PoCs and benchmarks, guide customers through deployment, and partner closely with internal engineering, infra, and operations teams to ensure successful delivery.
This role is ideal for someone who understands large-scale AI/ML/HPC workloads, enjoys working directly with customers, and wants to shape the future of AI infrastructure.
Key Responsibilities
Customer Engagement & Technical Leadership
- Serve as the primary technical point-of-contact for enterprise and hyperscaler customers.
- Deeply understand customer AI/ML/HPC workloads, scaling requirements, and deployment models.
- Architect GPU clusters, storage, networking, and orchestration solutions tailored to customer needs.
Solution Design & PoC Execution
- Lead Proof-of-Concepts, benchmarks, and workshops demonstrating performance, reliability, and scalability.
- Produce technical proposals, architecture diagrams, capacity plans, and cost/performance recommendations.
- Translate complex technical issues into clear actions for both engineering and business stakeholders.
Deployment & Enablement
- Guide customers through onboarding, cluster setup, performance tuning, and scaling.
- Partner with internal Infra, DC Ops, and Engineering teams to ensure smooth delivery and implementation.
- Identify optimization opportunities in customer workloads (GPU utilization, networking, scheduling, cost).
Ongoing Support & Relationship Building
- Act as a trusted advisor on GPU/AI infrastructure best practices, roadmap, and long-term planning.
- Maintain regular technical check-ins, capacity reviews, and performance reviews with customers.
- Gather customer feedback and collaborate with product/engineering to improve our platform.
Required Qualifications
Technical Background
- 5–10 years in cloud infrastructure, GPU cloud, HPC, AI/ML infrastructure, or data center engineering.
- Strong understanding of:
  - Distributed training & inference architectures
  - Kubernetes, Slurm, or other cluster/orchestration systems
  - NVIDIA GPU stack (H100/H200/B200/GB200 or similar)
  - InfiniBand / high-speed networking
  - Storage architectures for AI workloads
Customer-Facing Skills
- Experience working directly with enterprise or hyperscaler technical teams.
- Ability to simplify complex infra concepts for both technical and non-technical audiences.
- Strong communication, solution-design, and project coordination skills.
Soft Skills
- Self-starter, ownership mindset, excellent follow-through.
- Comfortable working in a fast-moving, high-growth environment.
- Strong problem-solving and “architect advisor” mentality.
Preferred Qualifications (Nice to Have)
- Hands-on experience with large-scale GPU deployments (multi-node, multi-cluster).
- Exposure to hyperscaler capacity planning or AI infrastructure procurement teams.
- Experience with multi-region or global GPU deployments (US, APAC/Taiwan).
Why Join GMI Cloud
- Work directly with some of the world’s most advanced AI organizations.
- Architect and deliver multi-MW GPU clusters at global scale.
- Influence product roadmap and partner closely with NVIDIA and top-tier data center providers.
- High-impact role with significant ownership and career growth.