What are the responsibilities and job description for the DevOps & Site Reliability Lead position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, VCS Digital LLC, is seeking the following. Apply via Dice today!
Role: DevOps & Site Reliability Lead - Retail DevOps
Location: Deerfield, IL
Type: Full-time role with TCS (TATA Consultancy Services)
Job Description:
Technology and Programming (Expert Level)
- Strong proficiency in Java full-stack development
- Object-Oriented programming principles and concepts
- Hands-on experience with Spring Framework (Spring Boot, Spring MVC, Spring Security)
- Knowledge of RESTful API development
- Experience with databases such as Oracle, DB2, and MySQL
- Proficiency in Payment Switch BASE24 EPS, C, AS400, and Python is an added advantage
- Must have domain experience in the Retail Point of Sale/Payment Systems/Merchandising/Inventory/Logistics areas
- Expertise in Microsoft Azure, including:
  - Compute (VMs, App Services, Azure Container Apps)
  - Containers & Orchestration (AKS, Docker)
  - Networking (VNETs, Private Endpoints, Application Gateway, Load Balancers)
  - Storage, Azure Key Vault, Azure Monitor, Log Analytics
- Proven experience designing enterprise‑grade, highly available cloud platforms
- Advanced experience with Azure DevOps and CI/CD pipeline architecture
- Strong scripting skills (PowerShell, Bash)
- GitOps concepts, branching strategies, release orchestration
- Ownership of platform reliability, resiliency, and performance
- Definition and governance of:
  - SLIs, SLOs, SLAs
  - Error budgets and reliability metrics
- Advanced observability strategy, design, and implementation:
  - Metrics, logs, traces, alerts, and dashboards using Dynatrace
- Incident response leadership, RCA facilitation, and long‑term remediation planning
- Experience operating systems with 99.9%–99.99% availability
- Secure cloud design using Key Vault, managed identities, RBAC
- Cost optimization (FinOps mindset) across cloud infrastructure
- Act as Lead SRE for the client's Retail platforms, owning reliability and stability outcomes
- Define and enforce SRE standards, best practices, and operating models
- Architect and govern highly available, scalable cloud platforms
- Lead the design and implementation of CI/CD and IaC strategies
- Establish proactive monitoring, alerting, and incident prevention mechanisms
- Own major incident leadership, RCA execution, and corrective action tracking
- Partner with application, security, and architecture teams to build reliability by design
- Drive automation to reduce toil and improve operational efficiency
- Mentor and coach SRE and DevOps engineers across teams
- Influence roadmap decisions with a reliability, scalability, and cost lens
Thanks & Regards
Himanshu Shahi
VCS Digital LLC
Cell: 1
Email:
Web: