What are the responsibilities and job description for the Observability & Network Monitoring Engineer position at Enterprise Solutions Inc.?
Job Title: Observability & Network Monitoring Engineer
Work Location: Denver, CO 80211
Contract duration: 6 months
Job Details:
Responsible for monitoring, triaging, troubleshooting, and analyzing large-scale network and customer premises equipment (CPE) data using observability and analytics platforms such as Splunk, Kibana (Elastic Stack), Instana, and Grafana. The role focuses on proactive issue detection, fallout report analysis, root cause analysis (RCA), and ensuring service reliability across wireless networks, routers, and CPE devices.
Key Responsibilities
Observability & Monitoring
• Configure, develop, and maintain dashboards and alerts using Splunk, Kibana, Grafana, and Instana to monitor system, application, and network health.
• Perform real-time monitoring of KPIs, SLAs, latency, packet loss, device availability, and performance trends.
• Correlate metrics, logs, traces, and events to proactively identify anomalies and service degradation.
Incident Management & Triaging
• Perform L1/L2/L3 triaging of incidents related to CPE, routers, and wireless infrastructure.
• Analyze alerts, logs, and telemetry data to quickly isolate issues and minimize MTTR.
• Support major incident bridges, provide regular status updates, and assist in service restoration.
Troubleshooting & Root Cause Analysis
• Troubleshoot network, device, and service-level issues using log analysis and performance metrics.
• Conduct Root Cause Analysis (RCA) and document findings with corrective and preventive actions (CAPA).
• Work closely with Network, Platform, and DevOps teams to resolve complex issues.
Fallout Report & Data Analysis
• Analyze fallout reports related to:
o CPE provisioning failures
o Wireless connectivity drops
o Router/device configuration errors
• Identify recurring failure patterns, data trends, and systemic issues.
• Recommend process, configuration, or automation improvements based on insights.
Wireless, Router & CPE Data Analysis
• Monitor and analyze data from wireless access points, routers, gateways, and CPE devices.
• Validate device onboarding, firmware upgrades, configurations, and health status.
• Identify performance bottlenecks related to RF, bandwidth, signal strength, and device capacity.
Reporting & Automation
• Create operational and executive reports using Splunk and Grafana.
• Automate recurring analysis and reporting using queries, alerts, and scheduled jobs.
• Maintain documentation, runbooks, SOPs, and knowledge bases.
Required Skills & Experience
Technical Skills
• Hands-on experience with:
o Splunk (Search Processing Language (SPL), dashboards, alerts)
o Elastic Stack / Kibana
o Instana (APM, infrastructure monitoring)
o Grafana (dashboards, data sources, alerting)
• Strong experience in log analysis, metrics correlation, and observability best practices.
• Solid understanding of:
o Networking fundamentals (TCP/IP, DNS, DHCP, routing, switching)
o Wireless technologies (Wi-Fi, RF basics, signal metrics)
o Routers, gateways, and CPE devices
Analytical & Operational Skills
• Experience with incident triage, troubleshooting, and RCA.
• Strong data analysis skills to interpret large datasets and fallout reports.
• Ability to identify trends, patterns, and anomalies in network/device data.
Soft Skills
• Strong communication and documentation skills.
• Ability to work under pressure in a 24x7 operational environment.
• Collaborative mindset with cross-functional teams.
Salary: $45 - $50