What are the responsibilities and job description for the Lead Engineer/SRE, KMS - AdTech Leader position at Andiamo?

Lead Site Reliability Engineer

This position offers the opportunity to guide the reliability and performance of large-scale, customer-facing systems. You will help create the services, automation, and architectural patterns that allow engineering teams to move quickly with confidence. The work focuses on treating operations as a software problem, building systems that are resilient by design, and partnering with product teams to ensure they can deliver reliable features at speed.

About The Role

You will take ownership of key reliability initiatives, shaping the technical vision for the systems under your care. Your work will support the continuous evolution of backend services and development workflows, helping teams release and operate their software smoothly. This role is ideal for someone who enjoys complex distributed systems, performance engineering, and building tools that empower large engineering groups.

How You Will Make a Difference

Deliver foundational services that support rapid and predictable software delivery across the engineering organization.
Create systems and operational processes that support reliable and scalable applications.
Identify upstream solutions that prevent recurring issues and promote long-term stability.
Develop the technical roadmap for your area, collaborating with stakeholders to solve meaningful engineering challenges.
Improve throughput and system performance by analyzing and eliminating architectural bottlenecks.
Work with tools and technologies such as Python, AWS, Django, Kubernetes, Bash, Terraform, MySQL, Redis, and Postgres.
Help foster a culture of strong engineering practices through thoughtful design discussions and collaborative whiteboarding sessions.
Support and mentor engineers across the company, helping raise the standard of engineering quality and operational excellence.
Write and maintain software that improves the reliability, performance, and efficiency of platform services.
Participate in on-call rotations with a focus on resolving issues at the source and reducing alert fatigue.
Introduce architectural changes that significantly improve the scalability and resilience of critical systems.
Work closely with product-oriented engineers and other SREs to deliver improvements that have real customer impact.
Use data-driven analysis to understand system behavior, predict scaling needs, and guide strategic improvements.
Promote site reliability principles across the engineering organization.

Who You Are

Ten or more years of experience in site reliability engineering, DevOps, or related fields.
Degree in computer science or a related field, or equivalent hands-on experience.
Calm and focused during outages, with the ability to drive investigations to a clear root cause and long-term corrective measures.
Strong understanding of Linux systems and the full networking stack.
Experience collaborating with engineering teams to build and operate production software.
Proficiency writing code using best practices in languages such as Python, Ruby, or Go.
Genuine interest in exploring emerging AI tools and responsibly experimenting with techniques that improve engineering workflows.

This role is well suited for someone who enjoys solving reliability challenges at scale, improving platform performance, and building systems that help engineers ship better software with greater confidence.

About Andiamo

Talent Partners for the AI Revolution. As a globally recognized staffing and consulting firm, we specialize in placing the top 2% of technology and go-to-market professionals with the world’s largest and most well-known companies.

For over 20 years, we've maintained the status of tier-one vendor for firms such as Palantir, Amazon, Fluidstack, Bloomberg, Relativity Space, Firefly, MasterCard, Visa, Two Sigma, Citadel, as well as other major financial services firms, elite hedge funds, Google-backed tech start-ups, and major software firms.

Our talent solutions include Permanent Placement, Contract Staffing, Executive Search, and Dedicated Recruiting Services (RPO). Find out more at www.andiamogo.com.
