What are the responsibilities and job description for the Site Reliability Engineer position at Broadridge Financial Solutions?
At Broadridge, we've built a culture where the highest goal is to empower others to accomplish more. If you're passionate about developing your career, while helping others along the way, come and help us achieve our goals.
Broadridge is growing, and we are seeking a Site Reliability Engineer to join our team. We are looking for someone to take responsibility for designing, implementing, and operating the technical infrastructure for strategic Broadridge applications and services. We expect you to craft technical solutions and automated operational processes for next-generation Broadridge platforms, balancing automation, availability, performance, scale, and cost. You will ensure the continuous health and reliability of the technical infrastructure and the applications/services it supports through Service Level Objectives (SLOs), metrics such as Service Level Indicators (SLIs), and continuous operational process improvements.
Responsibilities:
- Works within and across teams to craft, develop, test, implement, and support technical solutions across a full stack of development tools and technologies.
- Translates business requirements into technical designs, considering automation, availability, performance, scale, and cost.
- Ensures that technical and security practices, along with Broadridge standards, are followed in the design of technical infrastructure.
- Participates in technical design sessions and works closely with multiple teams, including application development teams, infrastructure teams, vendors, and clients, if needed, to review the infrastructure designs for new projects.
- Delivers high-quality technical infrastructure on time, following Broadridge processes.
- Provides estimates for all priority and non-priority projects, along with recommended scope or schedule changes based on capacity and unforeseen challenges.
- Participates in technical implementation to ensure the quality of the infrastructure, automation, and the overall efficiency of the SRE (Site Reliability Engineering) team.
- Tracks Service Level Indicators (SLIs) to ensure the health of technical infrastructure and Broadridge services.
- Fixes production issues affecting Broadridge services as needed, taking appropriate corrective actions.
- Conducts preventative maintenance to ensure capacity, scaling, security, and availability of Broadridge services.
- Understands dependencies between infrastructure components, vendor software, custom software, and other parts of the processing stack that support Broadridge Services.
- Collaborates with peers and other technical teams, such as development, architecture, database, storage, server, and security teams, to prevent and shorten production incidents.
- Defines Service Level Objectives (SLOs) for Broadridge Services.
- Implements additional operational improvements for automation, monitoring, and incident management to increase the reliability of Broadridge services.
- Guides more junior associates through established processes.