Senior Platform Engineer (Observability & Telemetry)

OneMain Financial
Baltimore, WA Full Time
POSTED ON 5/25/2026
AVAILABLE BEFORE 6/22/2026
We’re seeking a Senior Platform Engineer (Observability & Telemetry) to join a high‑performing Monitoring Engineering team within a fast‑paced financial technology organization. In this role, you will apply SRE principles to design, build, and evolve monitoring and observability capabilities that ensure the reliability, performance, and operability of core applications and infrastructure.

You will partner closely with application, platform, and development teams to implement data‑driven alerting, SLO/SLA-based monitoring, telemetry pipelines, dashboards, correlations, and automated remediation. Your work will directly improve system reliability, reduce MTTR, and enhance enterprise‑wide operational insight.

This role requires strong analytical thinking, systems engineering discipline, and a proactive approach to identifying risks, preventing incidents, and driving continuous improvement across the production ecosystem.

Key Responsibilities

Design, Build, and Maintain Monitoring & Observability Solutions

  • Architect, deploy, and operate OpenTelemetry‑based telemetry pipelines, including instrumentation standards, collector configurations, sampling strategies, and routing to Elastic and other backends.
  • Develop and maintain instrumentation, telemetry, and alerting for the Enterprise Monitoring Center using industry‑leading tools, such as:
    • Grafana, OpsRamp, Elastic Stack, BigPanda
    • AWS CloudWatch, Azure Monitor
  • Drive observability standards and best practices across multiple engineering teams through influence, documentation, and partnership rather than direct authority.
  • Apply SRE best practices to ensure measurable SLIs/SLOs, reliability dashboards, and health indicators for critical systems.
  • Integrate and manage OpenTelemetry for distributed tracing and telemetry data collection, enabling end‑to‑end visibility of business‑critical transactions.

Collaboration & Project Participation

  • Collaborate with application development teams to define and document observability requirements for each project or release, ensuring accurate and actionable monitoring and tracing are in place for every step of business‑critical workflows.
  • Embed reliability considerations early in the SDLC, including SLO definitions, instrumentation needs, and failure‑mode awareness.
  • Partner with product and engineering teams to use SLOs and error budgets to guide release decisions, prioritization, and toil reduction.

Alerting & Escalation Process

  • Define and maintain standardized alert payloads per engineering guidelines, ensuring alerts are actionable.
  • Partner with Level 2 and Level 3 support teams to reflect process changes in monitoring dashboards.
  • Maintain and optimize thresholds, ensuring seamless escalations via BigPanda as the central alert hub.

Dashboard Creation & Maintenance

  • Create and maintain intuitive, actionable dashboards for the Enterprise Monitoring Center and other finance teams.
  • Ensure dashboards are effectively monitored by Level 1 teams, presenting clear, actionable data that reduces MTTR.

Documentation, Governance & Reliability Standards

  • Develop and maintain technical documentation, runbooks, diagnostic guides, and observability standards across the enterprise.
  • Evaluate and refine release, deployment, and monitoring processes to support consistent, reliable delivery pipelines.
  • Mentor junior engineers and promote a culture focused on reliability, automation, and operational excellence.

Reliability Engineering, Automation & Continuous Improvement

  • Build automation frameworks for monitoring, alerting, self‑healing workflows, and incident response to reduce toil and improve MTTR.
  • Drive system optimization through capacity analysis, performance tuning, and proactive detection of reliability risks.
  • Contribute to the automation of routine operational tasks to improve system reliability and engineer quality of life.
  • Advocate for and implement observability best practices across engineering teams.
  • Define, implement, and operationalize SLIs, SLOs, and error budgets for critical services.
  • Participate in and improve incident response processes, including detection, triage, escalation, and recovery.

Qualifications

Education

  • Bachelor’s degree in computer science, IT, or a related field.

Experience

  • 5 years of experience in software, systems, or reliability engineering roles, with multiple years of hands‑on experience owning production observability, monitoring, and SLOs in distributed systems.

Required Skills

  • Deep experience building scalable, reliable monitoring and observability solutions, including instrumentation, alerting, dashboarding, and configuration across large, complex environments.
  • Hands‑on expertise and proficiency with modern monitoring and observability tools (e.g., OpsRamp, Grafana, Elastic, CloudWatch, Azure Monitor, BigPanda (AIOps)), and strong knowledge of metrics, logs, traces, and OpenTelemetry.
  • Strong scripting and programming capability (Bash, PowerShell, and one or more languages such as Python, C-family, or JavaScript) to automate telemetry, alerting, and platform workflows.
  • Strong expertise with cloud platforms (AWS and/or Azure) and container orchestration systems (Kubernetes, Docker).
  • Deep hands‑on experience with Elastic Observability (APM, Logs, Metrics, Traces).
  • Understanding of distributed systems fundamentals, including networking, security, databases, DevSecOps principles, and performance/capacity engineering.
  • Strong communication skills, with the ability to clearly explain complex technical topics to both technical and non‑technical audiences.
  • Exceptional problem‑solving and troubleshooting abilities, especially in high‑pressure or time‑sensitive environments.
  • Effective prioritization and multitasking, able to manage competing deadlines while maintaining quality and focus.
  • Proven cross‑functional collaboration, working seamlessly with diverse teams in large, complex IT environments and driving continuous improvement across systems.

Preferred Qualifications

  • Experience with CI/CD pipelines and tools like Jenkins, GitHub, GitLab CI, or CircleCI.
  • Experience querying, manipulating, and visualizing time‑series data.
  • Familiarity with Infrastructure as Code tools (e.g., Ansible, Terraform).
  • Knowledge of microservices architecture and event-driven systems.
  • Working knowledge of REST APIs, JSON, and ServiceNow.
  • Experience with cloud monitoring—particularly AWS or Azure.

Salary.com Estimation for Senior Platform Engineer (Observability & Telemetry) in Baltimore, WA
$132,161 to $173,258



