What are the responsibilities and job description for the Infrastructure Architect with HashiCorp & Nomad OSS position at Saransh Inc?
Role: Infrastructure Architect with HashiCorp & Nomad OSS
Location: Boise, ID (Onsite)
Job Type: Contract
Role Overview
- We are looking for a Senior Infrastructure Architect to lead the design and implementation of our global orchestration platform using HashiCorp Nomad OSS.
- Unlike traditional static environments, you will architect "nomadic" clusters: highly portable, scalable, and ephemeral environments that can run anywhere. You will create and classify these clusters, test them, and guide the team in operating them.
- You will be responsible for defining how we classify these clusters (e.g., by sensitivity, workload type, or geographic region) and establishing the testing protocols that ensure our "orchestrator of orchestrators" remains rock solid.
Key Responsibilities
- Architectural Design: Build and maintain production-grade Nomad OSS clusters across hybrid-cloud and edge environments.
- Cluster Classification: Design a taxonomy for cluster types (e.g., batch-heavy, service-mesh integrated, edge-compute) using Nomad namespaces, node attributes, and meta stanzas to ensure workloads land on the right infrastructure.
- Infrastructure as Code (IaC): Lead the automation of cluster provisioning using Terraform and Packer, ensuring clusters are "nomadic" and can be recreated in minutes.
- Testing & Validation: Establish "Chaos Engineering" and performance benchmarking suites for Nomad. You will guide the team in testing leader election, client heartbeats, and bin-packing efficiency.
- Security & Governance: Implement zero-trust security using HashiCorp Vault for secret injection and Consul for secure service discovery.
- Team Leadership: Act as the primary SME, conducting code reviews and guiding DevOps engineers on Nomad-specific patterns like Job Lifecycle management and Task Driver selection.
Required Skills
- Orchestration: Expert-level knowledge of Nomad OSS (job stanzas, scheduling algorithms, and federation).
- Classification Tools: Deep understanding of how to use node metadata and constraints to classify hardware for specific workloads (GPU vs. CPU, high-IOPS vs. standard).
- HashiCorp Stack: Proven experience integrating Nomad with Consul (service discovery/mesh) and Vault (dynamic secrets).
- Networking: Strong grasp of gossip protocols (Serf), Raft consensus, and CNI (Container Network Interface).
- Testing Frameworks: Experience with automated testing for infrastructure (e.g., Terratest or Python-based cluster validation scripts).
- Design for Failure: You believe that everything fails eventually and architect clusters to be resilient to regional outages.
- Classified Resource Management: Ability to manage multi-tenant environments where different teams or data classifications safely share the same physical cluster.
- Education: A passion for teaching. You don't just build the cluster; you enable the whole team to master the Nomad CLI and UI.