What are the responsibilities and job description for the Senior Windows & SQL Server Administrator HPC / Analytics Platform position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, SLK America Inc., is seeking the following. Apply via Dice today!
Job Title
Senior Windows & SQL Server Administrator HPC / Analytics Platform
Job Summary
The role is responsible for owning and independently executing end-to-end operations of a Windows-based analytics and High-Performance Computing (HPC) platform with minimal supervision. The position focuses on maintaining platform stability, performance, security, and supporting migrations across onpremises and cloud environments.
Key Responsibilities
Platform Operations & HPC Administration
Job Title
Senior Windows & SQL Server Administrator HPC / Analytics Platform
Job Summary
The role is responsible for owning and independently executing end-to-end operations of a Windows-based analytics and High-Performance Computing (HPC) platform with minimal supervision. The position focuses on maintaining platform stability, performance, security, and supporting migrations across onpremises and cloud environments.
Key Responsibilities
Platform Operations & HPC Administration
- Operate and support a Windows-based analytics platform running on Microsoft SQL Server and Microsoft HPC.
- Administer Microsoft HPC environments, including head nodes, compute nodes, and scheduling services.
- Monitor and manage compute-intensive workloads, ensuring stable and timely execution.
- Troubleshoot job execution issues such as:
- Queue backlogs
- Stuck or failed jobs
- Abnormal CPU utilization on compute nodes
- Maintain overall HPC cluster health, including:
- Node availability and state management
- Network connectivity and communication
- Perform controlled maintenance activities such as:
- OS patching
- Rolling reboots
- Platform upgrades
- Execute routine Windows Server operations, including:
- OS patching and security updates
- Service monitoring
- Disk, CPU, memory, and network health checks
- Configure performance-sensitive OS settings to support high-demand workloads.
- Administer Microsoft SQL Server instances, covering:
- Backup and restore validation
- Job monitoring and scheduling
- Index and statistics maintenance
- Transaction log and capacity monitoring
- Troubleshoot cross-stack issues spanning:
- HPC scheduler and compute nodes
- Windows operating system and services
- SQL Server performance and availability
- Manage service accounts, permissions, and security configurations across platform components.
- Support incident, change, and problem management activities in line with IT service management processes.
- Maintain and update:
- Operational runbooks
- Health check procedures
- Disaster recovery documentation
- Coordinate with application, infrastructure, and database teams during critical processing windows.
- Plan and execute Windows and SQL Server workload migrations across datacenters, including:
- Dependency analysis
- Cutover coordination
- Post-migration validation
- Assess SQL Server workloads for migration to:
- Azure SQL Database
- Azure SQL Managed Instance
- Azure Virtual Machines
- Identify and remediate compatibility gaps during migration exercises.
- Support data migration, cutover, rollback planning, and post-migration performance stabilization across:
- Onpremises environments
- Cloud platforms
- Coordinate decommissioning of legacy servers following migrations.
- Update runbooks, monitoring, and support procedures after platform changes.
- Strong experience with Windows Server administration
- Hands-on expertise in Microsoft SQL Server
- Experience supporting Microsoft HPC environments
- Solid understanding of performance tuning, monitoring, and troubleshooting
- Experience with data center and cloud (Azure) migrations
- Ability to work independently and own end-to-end responsibilities