Senior Site Reliability Engineer

Jobs via Dice
Fairfax, VA Full Time
POSTED ON 4/22/2026
AVAILABLE BEFORE 5/21/2026
Job Description

ECS is seeking a Senior Site Reliability Engineer to work in our Fairfax, VA office.

ECS is seeking talented professionals to join our successful and growing team in building the next-generation Continuous Diagnostics and Mitigation (CDM) Cyber data solution. The CDM Program is the Cybersecurity and Infrastructure Security Agency's (CISA) dynamic approach to strengthening the cybersecurity of Federal networks and systems through better awareness and visibility into their security posture and cyber threats. ECS is responsible for designing, building, deploying, operating, and maintaining a complete 'Data Services' solution which includes the collection, normalization, visualization, and sharing of cyber data from more than 100 Federal agencies. The CDM Data Services product is an integrated suite of multiple Commercial Off the Shelf (COTS) products, software configuration packages, and custom code which work together to operate as an integrated solution tailored to meet Department of Homeland Security (DHS) requirements.

We are seeking professionals who thrive in a dynamic, fast-paced, and highly collaborative environment where problem-solving, critical thinking, and a holistic approach to serving the mission are key. Our program operates within the Scaled Agile Framework (SAFe). An aptitude and enthusiasm for continuous learning, improvement, and cybersecurity are a must!

Role & Responsibilities:

ECS is seeking a talented Senior Site Reliability Engineer (SRE) to play a key role in defining, implementing, and growing our SRE practice to ensure the reliability, availability, and performance of our critical production environments.

The Senior SRE will contribute to a culture of continuous improvement by identifying areas for enhancement and driving initiatives to improve system reliability, scalability, and efficiency.

The successful candidate will have demonstrated hands-on experience designing, implementing, and maintaining solutions to ensure that systems, including infrastructure and applications, are resilient, highly available, and performant. The Senior SRE will also play a critical role in defining and measuring the Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for our solution.

The Senior SRE will be responsible for setting up comprehensive logging, monitoring, and alerting solutions using the Elastic Stack and other tools as necessary to ensure the continuous performance of services. Additionally, they will respond to incidents, perform root cause analyses, and implement solutions to prevent reoccurrences. The Senior SRE will work in close collaboration with other SRE team members, developers, testers, infrastructure engineers, DevOps engineers, and other stakeholders to integrate reliability and observability into the software development lifecycle.

Required Skills

  • 6 years of experience as a Site Reliability Engineer (SRE) or equivalent
  • 6 years of demonstrated experience designing, implementing, and maintaining observability solutions to include logging, monitoring, and alerting
  • 6 years of hands-on experience with SRE tools (e.g., Elastic, Prometheus, Grafana, Splunk, etc.)
  • 3 years defining and measuring SLOs and SLIs
  • 3 years of relevant experience using cloud platforms (AWS GovCloud preferred)
  • 3 years of hands-on programming or scripting (e.g., Python, Bash, etc.)
  • Strong knowledge of microservices, containerization, and orchestration tools (Docker, Kubernetes)
  • Proven ability to collaborate with cross-functional teams (development, testing, and product) to integrate reliability and observability into the software development lifecycle
  • Strong problem-solving and analytical skills
  • Proactive, detail-oriented approach to identifying inefficiencies and implementing improvements
  • Proficiency in developing synthetic monitoring scripts using TypeScript

Desired Skills

  • Bachelor's degree in Computer Science, Engineering, or a related field (or 4 additional years of related experience)
  • Experience working in an Agile/SAFe environment using ALM tools (Jira, Confluence, or similar)
  • Strong understanding of CI/CD principles and platforms (Jenkins, CircleCI, GitLab, GitHub Actions, Argo, Travis CI, etc.)
  • Expertise in configuration management tools (Ansible, Puppet, Chef)
  • Experience with infrastructure as code (Terraform, CloudFormation)
  • In-depth understanding of networking, security, and system administration of Linux operating systems
  • Knowledge of version control platforms and branching strategies
  • Knowledge of disaster recovery planning, backup strategies, and data replication
  • Experience supporting large Federal programs ($200M)

#ECS1

ECS is an equal opportunity employer and does not discriminate or allow discrimination on the basis of any characteristic protected by law. All qualified applicants will receive consideration for employment without regard to disability, status as a protected veteran, or any other status protected by applicable federal, state, or local jurisdiction law.

ECS is a leading mid-sized provider of technology services to the United States Federal Government. We are focused on people, values and purpose. Every day, our 3300 employees focus on providing their technical talent to support the Federal Agencies and Departments of the US Government to serve, protect and defend the American People.

Salary.com Estimation for Senior Site Reliability Engineer in Fairfax, VA
$111,584 to $131,220