
Manager, Site Reliability Engineering

Massachusetts Medical Society
Waltham, MA Full Time
Posted on 9/19/2025. Closed on 1/4/2026.

What are the responsibilities and job description for the Manager, Site Reliability Engineering position at Massachusetts Medical Society?

Category: Information Technology
Job Location: 860 Winter St, Waltham, Massachusetts
Tracking Code: 8320
Position Type: Full-Time/Regular

The Massachusetts Medical Society (MMS) is the statewide professional association for physicians and medical students, supporting 25,000 members. We are dedicated to educating and advocating for the physicians of Massachusetts and patients locally and nationally. A leadership voice in health care, the MMS contributes physician and patient perspectives to influence health-related legislation at the state and federal levels, works in support of public health, provides expert advice on physician practice management, and addresses issues of physician well-being. Under the auspices of NEJM Group, the MMS extends our mission globally by advancing medical knowledge from research to patient care through the New England Journal of Medicine, NEJM Evidence, NEJM AI, NEJM Catalyst, NEJM Journal Watch, and through our accredited and comprehensive continuing medical education programs.

The world has changed, and so has the way we work. The MMS has adopted a flexible work model that allows most employees to choose where they work – at home, onsite in our Waltham office, or a combination of the two – based on their preferences and our business needs. Because what matters is the work we do, not where we do it.

The Manager, Site Reliability Engineering leads a team of approximately seven SREs accountable for ensuring the reliability, scalability, and cost-effectiveness of MMS platforms across a hybrid physical data center and AWS environment. This role combines people leadership with hands-on technical decision-making.

Key outcomes include clear prioritization in a high-volume, ambiguous environment; disciplined incident response and learning; and continuous reduction of toil through automation and self-service. The manager owns intake management, capacity planning, and delivery of reliability roadmaps while collaborating closely with product and other technical teams.

The successful candidate thrives in a fast-paced, dynamic environment where priorities shift quickly and complex challenges require creative solutions. They are agile, resourceful, and comfortable alternating between guiding technical problem-solving, shaping long-term architecture, and addressing urgent operational needs. Above all, they provide clarity and direction—helping the team stay focused on the right priorities and ensuring initiatives advance efficiently without unnecessary process overhead.

Responsibilities

Strategic Responsibilities:

  • Drive execution of roadmaps for observability, CI/CD, security hardening, and platform upgrades, and communicate status and risks.
  • Collaborate with Cloud Architects to ensure the adoption of resilient, self-healing, scalable design patterns to support delivery and testing of our highly available multi-tier applications.
  • Establish and evolve frameworks for reliability, incident response, and continuous learning to drive operational excellence.
  • Partner with security teams to strengthen practices and embed robust guardrails aligned with industry standards.
  • Own incident response and communications; ensure postmortems are completed and action items are tracked to closure against measurable KPIs.
  • Define SLIs/SLOs and manage error budgets with partner teams; align SLAs with business impact.

People and Project Management:

  • Manage, coach, and develop ~7 SREs; run 1:1s, growth plans, and performance reviews.
  • Own intake triage, backlog hygiene, and capacity planning to balance reliability, feature enablement, and operational demands.
  • Collaborate with stakeholders to define and prioritize objectives, ensuring alignment with business goals.
  • Guide the team in developing self-service frameworks that empower development teams while maintaining operational standards.
  • Manage on-call program health for 24/7 support of our global sites and services: rotation design, coverage, runbooks, escalation paths, paging policy, and after-hours expectations.
  • Oversee release planning, define service level agreements, and foster the migration of legacy applications to modern CI/CD pipelines.
  • Foster a culture of collaboration, accountability, and continuous improvement.
  • Other responsibilities as assigned.

Qualifications

  • Bachelor's degree in a related field with 6 years of experience in software development or DevOps, or equivalent education and experience, is required.
  • 2 years directly managing SRE/DevOps teams of 5–10 engineers in a dynamic environment.
  • Hands-on expertise with hybrid cloud architectures, Linux systems (Amazon Linux), and Windows systems.
  • Strong experience in CI/CD pipeline design (GitHub Actions or Jenkins) and IaC (Terraform or CloudFormation).
  • Hands-on proficiency with observability tools (Datadog, New Relic, or Prometheus).
  • Experience implementing security best practices across compliance, vulnerability management, and identity/access management.
  • Excellent communication and project management skills; proficiency with tools such as Jira and Confluence.
  • Proven problem-solving skills, with the ability to learn quickly and adapt solutions creatively.
  • Demonstrated ability to work cooperatively and communicate effectively in an Agile team environment.
  • Self-motivated with the ability to operate independently and set priorities in ambiguous situations.
  • Experience with containerization and orchestration (Docker and Kubernetes).
  • Previous exposure to an API management tool (MuleSoft preferred).
  • Experience with self-healing system design and automated failure recovery strategies.
  • Scripting proficiency in Python, Bash, or PowerShell.

Benefits

Our generous benefits offerings include: 3 weeks of paid vacation, 6 personal days, 12 sick days, 13 paid holidays, medical and dental plans, 401(k) plans with company match, backup childcare assistance, tuition assistance and more!

The MMS has earned praise as one of the Top Places to Work in Massachusetts by The Boston Globe for the past 15 years in a row! The Globe surveys employees regarding their opinions about company leadership, benefits, ethics, values, and culture, and recognizes those companies that receive high marks from their employees.

The MMS is an Equal Opportunity Employer, committed to providing opportunities to veterans and people with disabilities and a work environment that is welcoming to all.

Salary.com Estimation for Manager, Site Reliability Engineering in Waltham, MA
$156,532 to $189,363
This job has expired.

