JOB TITLE: Reliability Engineer (Mid-Level)
JOB LOCATION: Occasional travel to Hanscom Air Force Base, MA or Huntsville, AL
WAGE RANGE*: $120k to $125k
JOB NUMBER: 26-00598
REQUIRED EXPERIENCE:
Active Secret security clearance required
DoD 8570 / 8140 compliant certification (IAT Level II required)
One or more cloud certifications (AWS, Azure, Google Cloud Platform, or OCI)
U.S. Citizenship required
Bachelor's degree in Computer Science, Engineering, Information Technology, or related field
Minimum 8 years of experience in cloud engineering, systems engineering, or reliability engineering
Experience supporting cloud-based systems and distributed environments
Strong understanding of system monitoring, performance tuning, and incident response
Job Description
We are seeking a Reliability Engineer to support cloud platform stability, performance, and operational resilience within a federal program.
This role focuses on ensuring the availability and reliability of cloud-based systems through proactive monitoring, incident response, and performance optimization. The Reliability Engineer will support production environments, improve system resilience, and help drive operational excellence across distributed cloud services.
Key Responsibilities
- Ensure availability, performance, and reliability of cloud platforms and services
- Monitor systems and respond to incidents, outages, and performance degradation
- Develop and maintain monitoring, logging, and alerting strategies across cloud environments
- Support implementation of high availability, backup, and disaster recovery solutions
- Analyze system performance and identify areas for optimization and improvement
- Troubleshoot issues including hardware degradation, network latency, and resource constraints
- Support production readiness by validating system requirements including dependencies, diagrams, and monitoring plans
- Utilize operational metrics such as MTTR (Mean Time to Recovery) and MTTF (Mean Time to Failure) to improve system performance
- Collaborate with engineering and DevOps teams to support system integration and deployment activities
- Develop technical solutions to complex system reliability challenges
*While an hourly range is posted for this position, an eventual hourly rate is determined by a comprehensive salary analysis which considers multiple factors including but not limited to: job-related knowledge, skills and qualifications, education and experience as compared to others in the organization doing substantially similar work, if applicable, and market and business considerations. Benefits offered include medical, dental and vision benefits; dependent care flexible spending account; 401(k) plan; voluntary life/short term disability/whole life/term life/accident and critical illness coverage; employee assistance program; sick leave in accordance with regulation. Benefits may be subject to generally applicable eligibility, waiting period, contribution, and other requirements and conditions. Benefits offered are in accordance with applicable federal, state, and local laws and subject to change at TCM's discretion.
Salary: $120,000 - $125,000