HPC Cluster & Scheduler Management Consultant, Fremont, CA / Tualatin, OR, at Noblesoft Technologies: responsibilities and job description

Role: HPC Consultant

Location: Fremont, CA / Tualatin, OR

HPC Cluster & Scheduler Management

Design, configure, tune, and optimize SLURM partitions, queues, QoS, and scheduling policies to maximize cluster utilization and workload efficiency.
Perform in-depth analysis of job scheduling behavior, bottlenecks, and resource contention.
Troubleshoot job failures, performance degradation, and scheduler-related issues in production HPC environments.
Implement fair-share, backfill, reservations, and policy-driven scheduling as required.

Storage Benchmarking & Procurement Support

Lead HPC storage performance benchmarking using industry-standard tools (e.g., IOR, fio, mdtest, IOzone).
Analyze I/O patterns of HPC workloads and map them to appropriate storage architectures (parallel file systems such as Lustre and Spectrum Scale, NVMe storage, etc.).
Provide technical input for storage selection and procurement, including performance expectations, sizing, and cost-performance tradeoffs.
Collaborate with vendors and internal teams during POCs and performance validation exercises.

HPC Application Build & Optimization

Build, install, configure, and maintain HPC applications, compilers, libraries, and scientific software stacks.
Optimize application performance using MPI, OpenMP, GPU acceleration (where applicable), and tuned math libraries.
Support multiple compiler toolchains (GCC, Intel, LLVM, NVIDIA HPC SDK, etc.).
Implement and manage environment modules (Lmod) or similar software management frameworks.

System Performance & Operations

Conduct system-level performance tuning across compute, memory, network, and storage layers.
Diagnose node-level issues involving CPU, GPU, interconnects (InfiniBand/Ethernet), and OS configurations.
Create operational runbooks, performance baselines, and troubleshooting documentation.
Support cluster upgrades, expansions, and hardware refresh activities.

Collaboration & Delivery

Work closely with application owners, researchers, and infrastructure teams to meet aggressive delivery timelines.
Translate workload requirements into practical HPC configurations and optimizations.
Provide clear technical guidance and recommendations to leadership and stakeholders.

Required Skills & Experience

Core HPC Skills

Technical Proficiency

Nice to Have
