Fulcrum Digital is Hiring a Sr System Reliability Engineer (Application Support) Near Saint Louis, MO
Job Description

Who are we
Fulcrum Digital is an agile and next-generation digital accelerating company providing digital transformation and technology services right from ideation to implementation. These services have applicability across a variety of industries, including banking & financial services, insurance, retail, higher education, food, health care, and manufacturing.

The Role
Provide L2 support for production systems, including application, database, and middleware components as well as infrastructure and network components
Manage production incidents end-to-end within defined SLAs, with a focus on resolution rather than on who caused them.
Interact with various stakeholders such as release managers, program leads, service managers, and development and test leads
Review operational readiness requirements such as monitoring and alerting, log rotation, and component resilience, and report any gaps
Provide pre-implementation support with activities such as release notes review and implementation dry runs.
Protect production components by running health checks, monitoring latency and memory utilization.
Automate day-to-day activities and propose changes that improve reliability
Participate in CAB and provide feedback on change requests
Support the DevOps team in testing the promote pipelines and suggest automation of configuration items.
Follow incident management best practices and perform root cause analysis (RCA).
Participate in disaster recovery tests and operational acceptance tests
Analyze the technology stack that makes up the product and optimize the recovery time objective (RTO).
Work with team members spread across time zones
Share knowledge, document improvements, and mentor junior team members