
Lead Site Reliability Engineer

EPAM Systems
New York, NY Full Time
POSTED ON 3/12/2026
AVAILABLE BEFORE 5/12/2026
Lead Site Reliability Engineer
Remote in United States of America: New York
Site Reliability Engineering

Join our team as a Lead Site Reliability Engineer to drive system reliability, observability, and performance monitoring for mission-critical digital trading products. You will lead monitoring initiatives in a high-availability trading environment, ensuring stable connectivity to external partners while proactively identifying opportunities for continuous improvement.

At EPAM, you'll work on cutting-edge technologies, solve complex challenges, and shape the future of digital innovation. With access to continuous learning, mentorship, and global projects, your expertise will drive meaningful change.

Req# 968473077

Responsibilities

  • Define and implement a strategic reliability vision for the trading portfolio, covering infrastructure, network connectivity, application performance, and throughput
  • Lead and oversee a team of SRE engineers, providing technical direction, mentorship, and performance guidance
  • Own and evolve the SLA/SLO/SLI framework, including error budgets and service health reporting
  • Configure and optimize comprehensive monitoring and alerting systems across infrastructure and applications
  • Drive observability best practices using APM and monitoring platforms (e.g., Dynatrace)
  • Analyze application and infrastructure performance to isolate fault domains and determine root causes of critical incidents
  • Lead major incident management, coordinate resolution efforts, and conduct blameless postmortems
  • Participate in 24x7x365 support rotation and ensure operational excellence across the team
  • Identify automation opportunities to improve reliability, scalability, and operational efficiency

Requirements

  • 8+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering
  • Proven leadership experience (technical lead or team lead), with ability to oversee and mentor engineers
  • Strong hands-on experience with SLA/SLO/SLI definition, governance, and reporting
  • Solid experience working in Microsoft Azure environments (IaaS, PaaS, networking, monitoring)
  • Hands-on experience with Dynatrace (configuration, alerting, dashboards, performance analysis)
  • Experience with observability, monitoring, and APM tools in production environments
  • Ability to operate effectively under pressure in time-sensitive, high-impact environments

We offer/Benefits

  • Medical, Dental and Vision Insurance (Subsidized)
  • Health Savings Account
  • Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
  • Short-Term and Long-Term Disability (Company Provided)
  • Life and AD&D Insurance (Company Provided)
  • Employee Assistance Program
  • Unlimited access to LinkedIn learning solutions
  • Matched 401(k) Retirement Savings Plan
  • Paid Time Off – the employee will be eligible to accrue 15-25 paid days, depending on specific level and tenure with EPAM (accrual eligibility may change over time)
  • Paid Holidays - nine (9) total per year
  • Legal Plan and Identity Theft Protection
  • Accident Insurance
  • Employee Discounts
  • Pet Insurance
  • Employee Stock Purchase Program
  • If otherwise eligible, participation in the discretionary annual bonus program
  • If otherwise eligible and hired into a qualifying level, participation in the discretionary Long-Term Incentive (LTI) Program

For remote work in New York City only.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our clients, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Engineer the Future with a Career at EPAM

This posting includes a good faith range of the salary EPAM would reasonably expect to pay the selected candidate. The range provided reflects base salary only. Individual compensation offers within the range are based on a variety of factors, including, but not limited to: geographic location, experience, credentials, education, training; the demand for the role; and overall business and labor market considerations. Most candidates are hired at a salary within the range disclosed. Salary range: $140,000 - $155,000. In addition, the details highlighted in this job posting above are a general description of all other expected benefits and compensation for the position.

Applications will be accepted on a rolling basis.

It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

EPAM will not provide new H-1B visa sponsorship for this position. Candidates with existing transferable H-1B status may be considered.

Salary: $140,000 - $155,000



