What are the responsibilities and job description for the Datadog Tester / Observability QA Engineer position at OVA.Work?
Job Title
Datadog Tester / Observability QA Engineer
Job Summary
We are looking for a Datadog Tester responsible for validating monitoring, logging, tracing, and alerting implementations using Datadog. The role ensures accurate observability, correct alerting, and reliable dashboards across applications, APIs, and infrastructure.
Key Responsibilities
Monitoring & Metrics Testing
Datadog Tester / Observability QA Engineer
Job Summary
We are looking for a Datadog Tester responsible for validating monitoring, logging, tracing, and alerting implementations using Datadog. The role ensures accurate observability, correct alerting, and reliable dashboards across applications, APIs, and infrastructure.
Key Responsibilities
Monitoring & Metrics Testing
- Validate application, API, and infrastructure metrics collected in Datadog
- Ensure correct host, container, and service-level metrics
- Verify metric thresholds, anomalies, and aggregation accuracy
- Validate log ingestion from applications, APIs, servers, and cloud services
- Ensure log parsing, indexing, tagging, and retention policies
- Verify log search, filters, and correlation with metrics
- Test distributed tracing for APIs and microservices
- Validate service maps, latency, error rates, and throughput
- Ensure trace-to-log and trace-to-metric correlation
- Test alerts, monitors, and notifications (Email, Slack, PagerDuty, etc.)
- Validate alert conditions, severity levels, and escalation rules
- Perform negative testing to avoid false positives/negatives
- Validate Datadog dashboards for accuracy and usability
- Ensure correct widgets, queries, and time-series data
- Verify role-based access and dashboard sharing
- Test Datadog integrations with cloud providers (AWS/Azure/GCP), Kubernetes, Docker, databases, and CI/CD tools
- Validate environment separation (Dev, QA, Stage, Prod)
- Work with Dev, DevOps, SRE, and Product teams
- Report observability gaps and monitoring defects
- Assist in production issue root cause analysis (RCA)