We are seeking an experienced Site Reliability Engineer (SRE) to join our team. The ideal
candidate will have a strong background in cloud technologies (specifically in Azure), and a
deep understanding of SRE practices. As a key team member, you will help ensure the
reliability, scalability, and performance of customer systems. This role will contribute to the
automation of SDLC processes to enable smooth operation.
This role will start as a part-time contract position with the possibility of converting to FTE in
the future.
Responsibilities:
- Play a crucial role in ensuring the availability, latency, performance, efficiency, and
stability of critical infrastructure, supporting a range of data platforms, applications, and
services.
- Collaborate closely with development teams to implement and maintain reliable and
scalable systems.
- Proactively monitor and identify potential issues that could impact the availability of our
systems.
- Implement and maintain automated alerting mechanisms to notify the appropriate
parties of potential outages or performance degradation.
- Collaborate with development teams to design and implement solutions that enhance
system resilience and reduce downtime.
- Optimize resource utilization and minimize unnecessary expenditure on IT
infrastructure.
- Collaborate with development teams to optimize resource allocation for new
applications and services.
- Participate in the release planning process to ensure that software releases are
conducted smoothly and without disruptions.
- Design, implement, and maintain a comprehensive monitoring infrastructure to track the
health and performance of our systems.
- Collaborate across broad groups within large IT organizations to deliver results.
- Expect project-based work with multiple external customers.
Qualifications:
- Experience architecting, designing and/or implementing solutions with Azure cloud
tooling
- Experience with cloud infrastructure and tooling such as Kubernetes (AKS), Docker,
CI/CD pipelines, Pulumi, Terraform
- Ability to read and write .NET (C#) code
- Experience with Cosmos DB and SQL Server
- Experience administering Linux operating systems
- 5 years of experience in site reliability, including debugging, diagnosing, and correcting
errors and resolving high-severity incidents
- Experience configuring and managing monitoring and alerting tools on Azure cloud
infrastructure
- Strong background in networking and configuration of cloud networks
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including health, an incentive and recognition program, and 401K contribution (all benefits are based on eligibility).
Applications are accepted on an ongoing basis until the position is filled.
Full Time
$113k-127k (estimate)
05/24/2024
05/26/2024
If you are interested in becoming a Site Reliability Engineer, you need to understand the job requirements and the detailed related responsibilities. Of course, a good educational background and an applicable major will also help in job hunting. Below are some tips on how to become a Site Reliability Engineer for your reference.
Step 1: Understand the job description and responsibilities of a Site Reliability Engineer.
Quotes from people on Site Reliability Engineer job description and responsibilities
Similarly to the point above, a site reliability engineer can expect to spend time fixing support escalation cases.
03/16/2022: Little Rock, AR
More times than not, site reliability engineers will need to take on-call responsibilities.
01/31/2022: Lexington, KY
Focuses on the reliability of behind-the-scenes systems that help make other teams' jobs more efficient.
02/24/2022: Tuscaloosa, AL
Site reliability engineers may have to spend a considerable amount of time fixing cases related to support escalation.
02/25/2022: Manchester, NH
Step 2: Learn the best tips for becoming a Site Reliability Engineer, so you can understand what the position requires and build the job-related knowledge well ahead of time.
Career tips from people on Site Reliability Engineer jobs
The objective was to ensure service reliability and availability within operations management.
12/28/2021: Lima, OH