What are the responsibilities and job description for the HPC Solutions Architect position at GTN Technical Staffing?
HPC Solutions Architect
Location: Dallas, TX (Hybrid)
Type: Direct Hire
• Competitive base salary + performance bonus
• 100% company-paid benefits
• Relocation available
Overview
We are seeking an HPC Solutions Architect to lead the design, integration, and delivery of advanced compute platforms supporting HPC, AI/ML workloads, and next-generation CaaS / GPUaaS environments.
This is a highly technical, customer-facing role focused on building scalable, high-performance architectures across compute, storage, networking, Kubernetes, and security domains. You will operate across the full solution lifecycle—from early-stage discovery and workload analysis through architecture design, proof-of-concept, deployment, and ongoing optimization.
The ideal candidate brings deep expertise in HPC and GPU-accelerated environments, with the ability to translate complex customer requirements into production-ready, scalable platform solutions in rapidly evolving AI infrastructure ecosystems.
Key Responsibilities
Customer Engagement & Technical Strategy
- Partner directly with customers to understand HPC, AI/ML, and GPUaaS / CaaS platform requirements, performance goals, and scaling challenges
- Lead technical discovery sessions to evaluate workloads, bottlenecks, and architecture trade-offs
- Serve as a trusted advisor across the full solution lifecycle
Solution Architecture & Platform Design
- Design end-to-end architectures across compute (CPU/GPU), storage, networking, orchestration, and security
- Architect solutions supporting GPU-as-a-Service (GPUaaS) and Container-as-a-Service (CaaS) delivery models
- Develop reference architectures, design blueprints, and integration frameworks
Performance Optimization & Workload Engineering
- Lead proof-of-concept, benchmarking, and validation efforts for HPC and AI workloads
- Perform workload profiling, system tuning, and performance optimization across distributed environments
- Optimize for scalability, efficiency, and reliability in GPU-accelerated platforms
Implementation & Solution Delivery
- Provide hands-on guidance during deployment to ensure successful integration into customer environments
- Partner with engineering, product, and operations teams to deliver end-to-end solutions
- Support deployment through production readiness and ongoing optimization
Cross-Functional Collaboration
- Collaborate with internal engineering and product teams to influence platform capabilities and roadmap
- Build relationships across GPU, networking, and storage vendor ecosystems
- Contribute to the evolution of HPC and AI platform offerings
Innovation & Technical Leadership
- Stay current on emerging technologies across GPUs, accelerators, interconnects, and orchestration frameworks
- Represent the organization in technical workshops, architecture reviews, and customer engagements
- Contribute to reusable frameworks, best practices, and reference architectures
Required Experience
- Proven experience in HPC solution architecture, distributed systems design, or large-scale AI infrastructure environments
- Strong expertise across:
  - GPU and CPU architectures (NVIDIA ecosystem, CUDA)
  - HPC and container-based schedulers (Slurm, Kubernetes)
  - High-performance networking (InfiniBand, RDMA, RoCE)
  - Distributed storage systems (Lustre, GPFS, Ceph, VAST)
  - Kubernetes and container orchestration for HPC / AI workloads
  - Security integration (identity, encryption, compliance)
- Experience designing or supporting CaaS and/or GPUaaS platforms or similar multi-tenant compute environments
- Strong Linux systems expertise including performance tuning and system-level optimization
- Ability to translate complex customer requirements into scalable architecture and implementation plans
- Strong communication skills with experience leading workshops, technical reviews, and customer engagements
- Experience working cross-functionally with engineering, product, and operations teams
- Ability to present complex technical concepts to both technical and executive audiences
Preferred Experience
- Experience delivering HPC or AI/ML workloads from design through deployment and optimization
- Familiarity with containerized HPC environments (Kubernetes, Singularity, etc.)
- Experience with automation, infrastructure-as-code, and platform engineering practices
- Background in proof-of-concept delivery, benchmarking, and workload migration
- Exposure to next-generation GPU architectures and high-speed interconnects
- Bachelor’s or Master’s degree in Computer Science, Engineering, Physics, or related field
- Relevant certifications (AWS, Azure, GCP, Kubernetes, Linux, networking)
Additional Requirements
- This position requires applicants to be currently authorized to work in the U.S. without employer sponsorship.
- We are unable to sponsor or take over sponsorship of employment visas at this time.
Salary: $175,000 - $275,000