What are the responsibilities and job description for the Staff Software Infrastructure Engineer position at Shimento, Inc.?
Staff Infrastructure Engineer
Location - Sunnyvale CA / Onsite Role
Contract - 6 months plus
Job description
About the Role:
We are seeking a Staff Software Infrastructure Engineer to play a critical role in managing the client’s fleet operations, focusing on foundational tools for provisioning and reprovisioning servers, with a strong emphasis on Infrastructure as Code. The role includes building automation tools, troubleshooting hardware, and scaling operations to support high growth. The candidate will be integral to transitioning to Kubernetes and optimizing the client’s infrastructure.
This position offers the opportunity to work on cutting-edge technologies within a world-class team and contribute directly to the success of a rapidly growing company while making a significant impact on the global energy landscape.
What You’ll Be Doing:
- Manage and maintain day-to-day operations of the client’s cloud infrastructure.
- Develop automation tools to streamline server provisioning and reduce SLA times.
- Scale infrastructure to support mass deployments (80-100 servers simultaneously).
- Troubleshoot hardware issues, especially with GPUs, and liaise with vendors.
- Transition the client’s environment to Kubernetes and containerized workflows.
What You’ll Bring to the Team:
- Solid hardware experience and GPU troubleshooting expertise.
- Strong Linux background.
- Knowledge of PXE booting and bare-metal server provisioning.
- Experience with BMC/IPMI, BIOS, and enterprise-grade server management.
- Kubernetes proficiency (admin or developer).
- Familiarity with containerization technologies (Docker preferred).
- Experience with version control systems (GitLab).
- Problem-solving skills: able to analyze complex technical issues and develop effective solutions.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- Values: Embody the Company values.
- Experience with MAAS (nice to have).
- Proficiency in Python or Golang (preferred language) (nice to have).
- Kubernetes administration and deployment experience (nice to have).
- Experience with Ansible and Terraform (nice to have).