What are the responsibilities and job description for the Project Manager - Incident Management position at OKAYA INFOCOM?
Project Manager - Incident Management - Mount Laurel, NJ or West Chester, PA (5 days on-site) - Full Time
Job Description
Incident Manager
Must Have Technical/Functional Skills
Incident management, SRE and operations engineering, reliability architecture, automation and observability, executive communication
Roles & Responsibilities
Incident Manager - Provides technical leadership for enterprise-wide, high-severity incidents, problem investigations, and high-risk changes, while shaping reliability strategy, governance, and operational standards across complex, distributed platforms.
• Drive incident resolution by directing cross-functional teams through high-impact outages, systemic problem resolution, and large-scale change events.
• Create scripts in ELK, Grafana, AppDynamics, and COP
• Auto-execute predefined queries in ELK, Grafana, AppDynamics, and COP for real-time issues
• Attach live query outputs (metrics, logs, traces) directly to alerts and incidents
• Eliminate manual tool navigation for IM and alert teams
• Enhance alert systems with contextual intelligence, including metric deviations and anomaly trends, relevant log snippets and patterns, and affected CIs and downstream impacts