What are the responsibilities and job description for the Observability & Monitoring Engineer position at Capgemini?
We are seeking a highly skilled Observability & Monitoring Engineer with deep hands-on experience in Dynatrace, Grafana, and modern telemetry best practices. This role is responsible for designing, implementing, and optimizing end-to-end observability solutions that enhance reliability, reduce MTTD/MTTR, and provide actionable insights into application and infrastructure performance.
The ideal candidate has strong technical expertise across metrics, logs, traces, and synthetics, along with the ability to collaborate with engineering, SRE, DevOps, and architecture teams to elevate monitoring maturity.
Key Responsibilities
Monitoring, Observability & Telemetry:
Design, implement, and maintain full-stack observability using Dynatrace, including distributed tracing, Real User Monitoring (RUM), synthetics, custom events, and dashboards.
Build high-quality Grafana dashboards using Prometheus, Loki, Tempo, or other data sources to visualize service health and business KPIs.
Define and enforce telemetry standards across metrics, logs, traces, and events to ensure a high signal-to-noise ratio and consistent instrumentation.
Develop and maintain SLOs, SLIs, error budgets, and reliability scorecards for critical business services.
Configure and optimize alert thresholds, alert routing, and auto-remediation workflows to minimize noise and improve MTTD/MTTR.
Engineering & Automation:
Automate monitoring setup and configuration using scripts, APIs, IaC (Terraform), or Dynatrace Configuration as Code.
Create synthetic monitoring scripts and custom metric ingestion.
Build reusable monitoring templates for services, APIs, user journeys, and infrastructure components.
Implement correlation ID frameworks and end-to-end transaction tracing across microservices and cloud environments.
Performance Engineering:
Conduct performance analysis using Dynatrace PurePath, flame graphs, and execution traces.
Diagnose memory leaks, thread contention, CPU anomalies, network issues, and dependency bottlenecks.
Partner with AppDev teams to embed observability during the SDLC.
Cross-Team Collaboration:
Work closely with SRE, Platform Engineering, AppDev, and Cloud teams to enhance reliability and availability.
Provide technical guidance on telemetry design, instrumentation patterns, and observability adoption.
Present insights, trends, and recommendations to senior leadership.
Required Technical Skills
Dynatrace Expertise (Hands-On):
Dynatrace OneAgent deployment/configuration (Linux/Windows/Kubernetes).
Deep experience with:
Distributed tracing (PurePath)
RUM & Synthetic Monitoring
Custom metrics ingestion (via API/StatsD/OpenTelemetry)
Problem detection, Davis AI, anomalies, baselining
DQL (Dynatrace Query Language)
Dashboards & Notebooks
SLO configuration
Experience with Dynatrace Managed or SaaS environments.
Grafana Expertise:
Building advanced dashboards using:
Grafana, Dynatrace, ELK
CloudWatch
Strong query skills (DQL, LogQL, SQL, or Elasticsearch Query DSL).
Experience configuring alert rules, contact points, and alerting pipelines.
Telemetry & Observability Best Practices:
Strong understanding of the OpenTelemetry (OTel) specification and instrumentation.
Knowledge of telemetry pipelines: collectors, processors, exporters.
Expertise in:
Metrics cardinality management
Log enrichment & structured logging
Distributed tracing design
Business transaction tracing
Sampling & retention strategies
Experience standardizing observability across microservices and hybrid environments.
Monitoring & Reliability:
Hands-on experience with:
Cloud-native monitoring stacks (EKS, AKS, GKE)
Logging systems: Splunk, ELK, Loki, Datadog Logs (any)
Ability to create SLO/SLI models aligned with business objectives.
Strong understanding of SRE principles and operational excellence KPIs.
API and webhook automation for alerts and dashboard provisioning.
Preferred Qualifications
5 years of experience in observability, SRE, performance engineering, or monitoring roles.
Dynatrace certification or Grafana Observability Stack certification.
Experience building correlation ID standards and E2E trace stitching.
Strong communication skills for leadership reporting and technical documentation.
Soft Skills
Strong analytical and troubleshooting mindset.
Ability to influence teams and drive observability adoption.
Clear communicator who can translate technical insights into business impact.
Ownership mindset and commitment to reliability and performance excellence.
The pay range that the employer in good faith reasonably expects to pay for this position is $32.36/hour - $50.56/hour. Our benefits include medical, dental, vision and retirement benefits. Applications will be accepted on an ongoing basis.
Tundra Technical Solutions is among North America’s leading providers of Staffing and Consulting Services. Our success and our clients’ success are built on a foundation of service excellence. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Unincorporated LA County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect client-provided property, including hardware (both of which may include data), entrusted to you from theft, loss or damage; return all portable client computer hardware in your possession (including the data contained therein) upon completion of the assignment; and maintain the confidentiality of client proprietary, confidential, or non-public information. In addition, job duties require access to secure and protected client information technology systems and related data security obligations.