
Lead Software Engineer (Observability & Telemetry)

griddable.io
Bellevue, WA Full Time
POSTED ON 4/28/2026
AVAILABLE BEFORE 5/15/2026
Description

Join the team responsible for innovating and maintaining the massive-scale, distributed systems that monitor Salesforce’s infrastructure.

This position is located in the Bellevue office and requires onsite presence.

The Network Visibility and Telemetry team is responsible for designing, building, and operating a set of systems and services that deliver metrics, telemetry, and alerting for data center infrastructure (network, storage, etc.). We are part of the Infrastructure Strategy Datacenter Operations organization, a dynamic, global team that delivers and supports technology infrastructure to meet the substantial growth needs of the business.

In this role, you will leverage your experience in building and deploying large-scale systems to automate systems services across all types of infrastructure (storage, network, server), enable the collection of infrastructure telemetry, make the infrastructure visible and accessible, and ensure that alerts are generated where action is needed.

Responsibilities

  • Design, build, and operate large-scale observability systems that deliver metrics, telemetry, and alerting across data center infrastructure including network and storage environments
  • Develop and maintain distributed services in Java and/or Python to enable automated collection of infrastructure telemetry at scale, ensuring full visibility into critical systems
  • Build and deploy automation solutions using tools such as Ansible, Puppet, or Chef to streamline infrastructure services across storage, network, and server environments
  • Publish and consume REST APIs to integrate telemetry pipelines and expose infrastructure data to downstream systems and stakeholders
  • Drive alerting frameworks that surface actionable signals from infrastructure telemetry, reducing noise and ensuring the right teams are notified when intervention is needed
  • Partner with a global, cross-functional Infrastructure Strategy Datacenter Operations team to support rapid growth, leveraging CI/CD practices (Jenkins), source control (Git), and Linux (Red Hat) expertise to deliver reliable, scalable observability tooling
  • Build and ship production-grade software using modern engineering practices, with AI as a core part of your development workflow, pushing the boundaries of AI development tools to deliver secure, optimized, high-quality code
  • Design and orchestrate complex systems where AI agents integrate seamlessly into human workflows, driving efficiency and innovation at scale.
  • Contribute to building and maintaining the shared system context, an explicit repository of system designs, constraints, and standards that enables AI to operate accurately and reliably.
  • Critically evaluate code (human-written or AI-generated) for correctness, quality, security, and performance

Required Skills

  • A related technical degree
  • 8 years of proven experience supporting a codebase for distributed services implemented in Java and/or Python
  • Experience with automation of systems services and processes.
  • Excellent analytical and problem-solving skills
  • A long-standing practice of using source control (e.g., Git) and unit testing
  • Experience in publishing and consuming REST APIs
  • CI/CD experience with Jenkins
  • Knowledge of Linux (Red Hat), including configuration, packages, services, daemons, shells, and troubleshooting
  • Experience with configuration automation tools such as Ansible, Puppet, and/or Chef.
  • Experience in fast-paced, technical environments experiencing rapid growth and change
  • Ability to adapt, to be flexible, and to learn quickly in a dynamic environment
  • Excellent organizational skills, including the ability to prioritize tasks efficiently with a high level of attention to detail
  • Ability to work under tight deadlines while coordinating several projects at a time and responding to changing business and technical conditions
  • A demonstrated, genuine AI-first approach to engineering: using AI to move faster, build fluency across the stack, and contribute well beyond your core specialty
  • Experience using AI tools (e.g., Claude Code, GitHub Copilot, Codex, Cursor, etc.) in development workflows
  • Advanced prompt engineering skills, including the ability to write precise, structured prompts and to cultivate the system context that makes AI outputs reliable, secure, and production-ready

Desired Skills

  • Experience with the monitoring and alerting of network infrastructure (routers, switches, load balancers, etc.) in a high-availability, always-on data center environment
  • Experience with the monitoring and alerting of storage infrastructure (switches, arrays, etc.) in a high-availability, always-on data center environment
  • Experience with containers and container orchestration systems, e.g., Docker and Kubernetes
  • Experience with Terraform, Helm, and Spinnaker.
  • Strong network engineering skills: SNMP, BGP, OSPF or IS-IS, LAN switching technologies, backbone, load balancers, IPv4/IPv6 addressing and subnetting
  • Experience with network and application protocols and with troubleshooting them (e.g., HTTP, HTTPS, TCP/UDP)
  • Experience with application databases and document stores, e.g. Elasticsearch, Cassandra
  • Experience writing systems automation in a high-level language such as Python
  • Experience building AI agents or LLM-powered tools for operational or infrastructure use cases (e.g., triage agents, anomaly detection, AI-assisted diagnosis)
  • Hands-on experience with AI agent infrastructure, such as agent runtimes, tool/function calling frameworks (e.g., MCP), secure execution environments, or context management for LLM-based systems

Salary.com Estimation for Lead Software Engineer (Observability & Telemetry) in Bellevue, WA
$111,706 to $134,641




