Site Reliability Engineer

Bank of America
Addison, TX | Full Time
10 Days Ago

Job Description:

Come join an exciting team within Global Information Security (GIS). Cyber Security Technology (CST) is a globally distributed team responsible for cyber security innovation and architecture, engineering, solutions and capabilities development, cyber resiliency, access management engineering, data strategy, deployment maintenance, technical project management and information technology security control support.

The individual in this role will partner directly with Software Engineering, Core Tech Infrastructure (CTI) Engineering, and Production Services teams to improve reliability and observability for the services they support by planning and implementing the instrumentation, tooling, ticketing, alerting and on-call routines defined in observability designs. This individual typically supports services with less strenuous reliability requirements while learning Site Reliability Engineer (SRE) standards and practices. The role engages in production triage efforts and Problem Management routines, using those experiences to grow SRE knowledge and to start identifying potential gaps in the observability design or implementation. This individual will also focus heavily on software development activities, delivering automated solutions that eliminate operational ‘toil’ and suggesting code enhancements to software engineering teams to improve the reliability or observability of the service.

Key Responsibilities:

  • Leverage guidance from SRE II and Sr. SRE resources to establish effective monitoring/observability solutions.
  • Work with monitoring tools and Application Development teams to enhance monitoring capabilities and modify monitoring dashboards for new observability plans created in support of initiatives or continuous improvement efforts.
  • Develop software or system scripts to simplify or eliminate the dependence on human intervention for recurring tasks.
  • Contribute to a catalog of extensible reliability tools and libraries that can be leveraged for common instrumentation, automation, and operational needs by both Application Production Services (APS) and Application Development teams.
  • Partner with solutions engineers and application teams to implement the necessary code changes to make use of common reliability libraries and tools and help Production Support and Application Development teammates understand how to use them.
  • Engage as a subject matter expert (SME) in incident triage efforts and failure scenario modelling, and work with the Problem Manager to diagnose root causes in incident/problem management investigations.
  • Work with Production Support teams to perform knowledge transfer, playbook updates and training for new monitoring capabilities.
  • Identify vulnerabilities and opportunities for reliability improvement, such as investigating low-level error rates and 'noise' in monitoring, and help define solutions to reduce manual support effort and/or improve system reliability.
  • Participate regularly in an on-call rotation with Production Support teammates to learn more about reliability issues affecting their portfolio.

Required Skills:

  • Previous Systems Engineering and Deployment experience
  • Previous Application Management and Support experience
  • Understanding of the software and/or application lifecycle and the implementation of security principles throughout
  • Understanding of complex environments, their sub-components, concepts, and interactions
  • Experience with databases and associated query languages
  • Experience in security vulnerability remediation
  • Deep understanding of large networks and systems and the interaction between applications, infrastructures, etc.
  • Experience in scripting/automation languages
  • Proficient in Windows and Linux server support and access systems
  • A broad knowledge of information security principles
  • Ability to work independently on initiatives with little oversight
  • Strong analytical skills/problem solving/conceptual thinking; out-of-the-box thinkers
  • Ability to identify, analyze, and address problems to resolve issues in a way that minimizes negative impact and risk to the organization
  • Comfortable delivering messages to a wide spectrum of individuals with varying degrees of technical understanding
  • Strong leadership skills and qualities which enable you to work with peers and various levels of management
  • Effective oral and written communication skills
  • Highly motivated, self-driven individual with a desire for growth through learning

Enterprise Role Overview:

The SRE will partner directly with Software Engineering, CTI Engineering, and Production Services teams to improve reliability and observability for the services they support by planning and implementing the instrumentation, tooling, ticketing, alerting and on-call routines defined in observability designs. SREs at this level typically support services with less strenuous reliability requirements while learning SRE standards and practices. They engage in production triage efforts and Problem Management routines, using those experiences to grow their SRE knowledge and to start identifying potential gaps in the observability design or implementation. The SRE will also focus heavily on software development activities, delivering automated solutions that eliminate operational ‘toil’ and suggesting code enhancements to software engineering teams to improve the reliability or observability of the service.

Shift:

1st shift (United States of America)

Hours Per Week: 

40

Skills for Site Reliability Engineer

The job skills required for a Site Reliability Engineer include Software Engineering, Problem Solving, Initiative, Software Development, Leadership, and Written Communication. Related skills and expertise give you an advantage when applying for a Site Reliability Engineer position and can affect the salary you are offered. Below are job openings related to the skills required of a Site Reliability Engineer. Select any job title you are interested in to view its job requirements.

Job Openings with Skill of Software Engineering
Job Openings with Skill of Problem Solving
Job Openings with Skill of Initiative
Job Openings with Skill of Software Development
Job Openings with Skill of Leadership

Career Path for Site Reliability Engineer

The following is the career advancement route for Site Reliability Engineer positions, which can be used as a reference in career path planning. A Site Reliability Engineer can be promoted into more senior positions that are expected to handle more key tasks and that pay a higher salary than a standard Site Reliability Engineer role. You can explore the career advancement options for a Site Reliability Engineer below and select a title you are interested in to get hiring information.

How to Become a Site Reliability Engineer

If you are interested in becoming a Site Reliability Engineer, you need to understand the job requirements and the detailed related responsibilities. Of course, a good educational background and an applicable major will also help in job hunting. Below are some tips on how to become a Site Reliability Engineer for your reference.

Step 1 Understand the job description and responsibilities of a Site Reliability Engineer

Step 2 Knowing the best tips for becoming a Site Reliability Engineer can help you explore the needs of the position and prepare for the job-related knowledge well ahead of time.

Step 3 View the best colleges and universities for Site Reliability Engineer

Butler University
Carroll College
Cooper Union
High Point University
Princeton University
Providence College