What are the responsibilities and job description for the SRE – Application/Platform position at Ampstek?
SRE – Application/Platform
Bloomfield, CT (Hybrid)
Contract
Independent visa only (USC/GC)
Required: experience in monitoring, troubleshooting, performance tuning, capacity planning, and automation, along with strong exposure to distributed data processing frameworks such as Spark, Flink, and Kafka.
Hadoop Cluster Administration & Operations
• Ensure 24x7 system reliability, incident response, and operational readiness for global applications.
• Lead troubleshooting efforts during outages/performance incidents; perform root cause analysis (RCA) and implement preventive actions.
• Define and maintain operational metrics and reliability goals (availability, latency, throughput, resource utilization).
• Improve system stability through proactive monitoring, alerting, and capacity planning.
Big Data & Streaming Support
• Support deployments and operations across AWS Cloud, Kubernetes, and containerized environments.
• Implement and maintain cluster reliability in Kubernetes environments, including resource quotas, access control, permissions, and namespace management.
Contact:
Snehil Mishra
📧 snehil@ampstek.com
📞 Desk: 609-360-2673 Ext. 125
🌐 www.ampstek.com