What are the responsibilities and job description for the Lead Performance Reliability Engineer position at Insight Global?
Insight Global is hiring a Lead Performance Engineer to support a large enterprise client with complex, mission‑critical applications. This role was created at the request of senior production support leadership to proactively identify and resolve performance issues before they impact the business.
We are seeking a senior, hands‑on performance expert who thrives in production environments, lives in performance data, and can diagnose issues across applications, integrations, infrastructure, and networks.
What You’ll Do
- Proactively monitor production systems to identify performance degradation before users report issues
- Analyze application performance KPIs including response times, percentiles (P90/P95), throughput, and error rates
- Investigate performance issues across multiple layers:
  - Application layer (ERP and non‑ERP systems)
  - APIs and integrations
  - Network
  - Infrastructure / hardware
- Translate non‑functional requirements into clear, actionable performance test scenarios
- Partner closely with:
  - Production Support teams for ongoing performance issues
  - Project teams during go‑lives and hypercare
- Provide clear, data‑driven insights to engineering and leadership during high‑impact incidents
- Support upcoming enterprise go‑lives and major releases in a fast‑moving production environment
This role sits in a high‑visibility “hot seat” during critical incidents: leadership relies on this person for real‑time performance insight and direction.
What We’re Looking For
Must‑Have (Core to Success)
- Strong performance monitoring and analysis experience in production environments
- Deep understanding of application performance KPIs, baselines, percentiles, averages, and trends
- Ability to read and interpret performance reports and turn data into actionable recommendations
- Solid understanding of:
  - HTTP fundamentals
  - Network basics
  - Request/response behavior
- Experience with modern application architectures:
  - Microservices
  - APIs
- Hands‑on experience with observability and monitoring tools, such as:
  - Dynatrace
  - OpenTelemetry
  - Prometheus
  - Grafana
  - Elastic
  - Loki / Tempo
- Strong analytical mindset with the ability to think beyond a single platform and diagnose issues holistically
ERP / Platform Experience (Flexible by Design)
- Experience supporting large ERP platforms is highly valuable
- Oracle Fusion Cloud experience is preferred, particularly in Finance, Supply Chain, or integration environments
- PL/SQL knowledge is a plus
- However, candidates with experience on SAP, PeopleSoft, or other enterprise ERP platforms will be strongly considered if performance monitoring and reliability are their core strengths
👉 When tradeoffs are required, performance engineering and reliability experience outweigh platform‑specific expertise.
Backgrounds That Tend to Do Well
We actively encourage candidates from the following backgrounds:
- Site Reliability Engineers (SREs) with observability ownership
- DevOps Engineers focused on monitoring, reliability, and system performance
- Senior / L3 Production Support Engineers with heavy performance troubleshooting exposure
- Performance Engineers who have supported large, complex enterprise systems
What This Role Is Not
- Not a people management position
- Not a testing‑only role
- Not remote (5 days onsite)
Location & Work Model
- Cleveland, OH – 100% onsite
- Relocation support available
- Remote work considered only as a last resort if no qualified local or relocating candidates are identified
Why This Role Matters
- Enterprise production performance issues are ongoing and business‑impacting
- This role plays a critical part in stabilizing complex systems
- High executive visibility and immediate impact
- Opportunity to shape how performance is monitored and addressed at scale
Salary: $120,000 - $154,000