What are the responsibilities and job description for the Sr. DevOps Engineer (Storage Platform) position at Prudent Technologies and Consulting?
Job Summary
We are seeking a highly experienced Sr. DevOps Engineer (Storage Platform) to build, automate, and operate large-scale Software Defined Storage (SDS) and Kubernetes platforms in a private cloud environment. This role focuses on storage engineering with Infrastructure-as-Code and GitOps practices while ensuring scalability, resilience, and performance.
This is a deeply technical role requiring expert-level understanding of Software Defined Storage and Kubernetes, and extensive working knowledge of Linux operating systems. You will also collaborate with platform and SRE teams to maintain secure, performant, and multitenant-isolated services that serve high-throughput, mission-critical applications.
Key Responsibilities
- Deploy, automate, and operate large-scale Software Defined Storage architectures across private and public cloud regions, following ITIL methodology.
- Deploy and support enterprise storage platforms (Pure Storage, HPE, NetApp) and SDS solutions (Ceph, Longhorn).
- Integrate self-service storage workflows for Kubernetes CSI and OpenStack consumers (VM and bare metal).
- Implement and manage backup solutions (preferably Rubrik).
- Build and maintain Infrastructure-as-Code for storage platforms using Ansible, Terraform, Helm, and Git, with Python/Bash automation.
- Implement CI/CD pipelines for infrastructure updates, patching, upgrades, testing, and rollback.
- Implement and improve monitoring, alerting, and observability for storage systems (capacity, latency, IOPS, recovery health) using GitOps and tools such as Prometheus, Loki, and Grafana.
- Perform deep troubleshooting across storage, Kubernetes, hypervisors, networking, and Linux systems.
- Develop and maintain technical documentation, architecture diagrams, operational procedures, and runbooks.
- Participate in on-call rotations, incident response, and root cause analysis.
- Collaborate globally on change management, documentation, and operational best practices.
Must Have
- 6 years of experience managing enterprise storage and Kubernetes platforms on Linux.
- Strong hands-on experience with SDS solutions (Ceph, Longhorn) and storage migrations from legacy systems.
- Expertise with block, file, and object storage, including Fibre Channel (Cisco MDS) and IP-based protocols (NVMe-oF or iSCSI fabrics).
- Expert knowledge of Kubernetes and Linux systems (Ubuntu, RHEL/CentOS).
- Proficiency with Infrastructure-as-Code (IaC) tools (Ansible, Terraform) for provisioning storage and backup schedules.
- Expertise in backup technologies (preferably Rubrik).
- Strong scripting skills in Python and Bash (Go a plus).
- Experience operating 24x7 mission-critical production environments.
- Hands-on experience with KVM hypervisors (SUSE Harvester, OpenStack).
- Strong written and verbal communication skills.
- Proficiency with Git, CI/CD pipelines, and automated testing frameworks.
- Ability to write technical documentation and contribute to community wikis or knowledge bases.
- Bachelor's degree in computer science or equivalent professional experience.
Nice to Have
- OpenStack Cinder multi-backend administration.
- Backup platforms (Rubrik).
- Understanding of CIS/NIST security and infrastructure lifecycle management.
- ITIL Foundation or advanced certifications in support of ITSM standard methodology.
- Background in telco, edge cloud, or large enterprise environments.
- CNCF Certified Kubernetes Administrator (CKA), Certified Kubernetes Security Specialist (CKS), or Red Hat Certified Specialist in Ceph Storage Administration (EX125) certifications.
- Master's degree in computer science, IT, engineering, or a related field preferred; equivalent experience and relevant industry certifications will also be considered.