What are the responsibilities and job description for the Incident Manager position at MSR Technology Group?
Incident Manager
Mount Laurel, NJ or West Chester, PA
Full-time
Must Have Technical/Functional Skills
Incident management, SRE and operations engineering, reliability architecture, automation and observability, executive communication
Roles & Responsibilities
The Incident Manager provides technical leadership for enterprise-wide, high-severity incidents, problem investigations, and high-risk changes, while shaping reliability strategy, governance, and operational standards across complex, distributed platforms.
- Drive incident resolution by directing cross-functional teams through high-impact outages, systemic problem resolution, and large-scale change events.
- Create scripts in ELK, Grafana, AppDynamics, and COP
- Auto-execute predefined queries in ELK, Grafana, AppDynamics, and COP for real-time issues
- Attach live query outputs (metrics, logs, traces) directly to alerts and incidents
- Eliminate manual tool navigation for the IM and Alert teams
- Enhance alert systems with contextual intelligence: metric deviations and anomaly trends, relevant log snippets and patterns, and affected CIs and downstream impacts