What are the responsibilities and job description for the Sr Advanced Software Engineer - (DevOps, SRE & AI) position at Honeywell?
We are seeking a highly skilled and experienced Senior Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in DevOps and SRE practices, with at least 5 years of hands-on experience in designing, implementing, and maintaining scalable, reliable, and secure infrastructure for cloud-native applications.
You will report directly to the Sr Software Engineering Manager and work out of our Atlanta, GA location on a hybrid work schedule. For the first 90 days, new hires must be prepared to work fully onsite, Monday through Friday.
KEY RESPONSIBILITIES
- Design, build, and maintain scalable infrastructure on cloud platforms (GCP, AWS, Azure).
- Develop and implement CI/CD pipelines for automated deployment and testing.
- Monitor, troubleshoot, and optimize system performance, reliability, and availability.
- Lead incident response, root cause analysis, and post-mortem reviews.
- Implement and manage infrastructure as code (IaC) using tools such as Terraform, Ansible, or CloudFormation.
- Develop and maintain observability solutions (monitoring, logging, alerting) using tools such as Prometheus, Grafana, ELK, and Datadog.
- Collaborate with development teams to ensure best practices in application reliability, scalability, and security.
- Automate operational tasks and improve system efficiency through scripting and tooling.
- Mentor and guide junior engineers in SRE and DevOps practices.
- Ensure compliance with security standards and participate in audits.