What are the responsibilities and job description for the Site Reliability Engineer position at Brooksource?
Entry Level Site Reliability Engineer
Onsite: Columbus, OH - 4 days a week
Compensation: $28/hour during the 6-month Elevate Program
Conversion: $65,000-$75,000
All recent graduates are welcome to apply!
Brief Description:
The Site Reliability Engineer for the Commercial, Consumer, and Enterprise Payments Digital Platform teams will work as part of the larger team to support the reliability, availability, and scalability of critical systems and infrastructure.
Detailed Description:
This role is an entry-level position requiring the individual to be driven, well-organized, and curious. They must be self-motivated and thrive in a collaborative, fast-paced environment. The team will look to this individual to work closely with development, operations, and support teams to ensure seamless deployments, monitor system health, automate processes, and resolve production issues.
Primary Responsibilities:
- Support monitoring of the performance and health of production systems and services.
- Respond to incidents, resolve technical issues, and escalate when necessary.
- Participate in on-call rotations to handle urgent issues and ensure uptime.
- Assist in automating manual tasks and operational processes to improve system reliability and efficiency.
- Assist in developing scripts and tooling for system monitoring, alerting, and automation using tools like Bash, Python, or other languages.
- Assist in identifying and addressing performance bottlenecks.
- Collaborate with senior engineers to analyze system performance data and make recommendations for improvements.
- Support cloud infrastructure (e.g., AWS, GCP, Azure) or on-premises servers to ensure they are robust and secure.
- Assist in maintaining infrastructure as code using tools like Terraform, Ansible, or Kubernetes.
- Work closely with software development teams to deploy and support new features and products.
- Learn about design and infrastructure decisions and provide feedback on them to ensure systems are reliable and scalable.
- Learn to identify areas for process improvement and automation opportunities within the production environment.
- Help document best practices and maintain runbooks and technical documentation.
- Assist in capacity planning by monitoring system performance trends and helping to predict future needs.
- Assist in ensuring systems can scale appropriately to handle increasing loads and traffic.
- Participate in team meetings, stand-ups, and organized sessions.
- Build knowledge of the technology, tools, and methodologies used in site reliability engineering.
- Perform other duties as assigned.
Job Requirements:
Minimum Requirements:
- Bachelor’s degree and 0 to 2 years of experience in a software development, test automation development, or engineering role, or equivalent transferable experience through coursework, internships, or work experience.
Skills:
- Basic knowledge of or experience with Linux/Unix operating systems and system administration tasks.
- Experience with at least one scripting language (e.g., Python, Bash, Go, etc.).
- Basic understanding of or experience with cloud infrastructure and services (AWS, GCP, or Azure).
- Familiarity or experience with monitoring and observability tools such as Dynatrace, Prometheus, Grafana, or Datadog.
- Exposure to or understanding of CI/CD pipelines and related tools such as Jenkins, GitLab CI, etc.
- Basic understanding of or experience with networking fundamentals (DNS, TCP/IP, HTTP).
- Strong experience with MS PowerPoint, Excel, and Word.
- Strong aptitude with demonstrated curiosity and willingness to learn business and technology processes and solutions.
- Ability to manage tasks effectively, meet deadlines and stay organized with minimal supervision.
- Ability to communicate clearly and succinctly with team members and stakeholders when sharing updates and presenting work in team settings.
- Demonstrates basic active listening skills, paraphrases or summarizes key points from conversations, and asks clarifying questions to ensure understanding. Listens without interrupting and shows respect for others' viewpoints.
- Ability to build positive working relationships with colleagues and contribute to a collaborative team environment.
- Actively participates in team activities, supports shared goals, and contributes to a positive team dynamic.
- Able to communicate needs effectively and work toward mutually beneficial outcomes in team discussions and task planning.
- Able to embrace change, learn new tools and processes quickly, and respond positively to feedback.
- Able to make sound decisions within their scope of responsibility and seek input when faced with unfamiliar challenges.
- Able to apply logical thinking to resolve issues and seek support when needed to overcome challenges.
- Asks thoughtful questions, analyzes information, and approaches tasks with curiosity and a solution-oriented mindset.
Preferred Requirements:
- Familiarity with containerization and orchestration tools (Docker, Kubernetes).
- Basic knowledge of configuration management tools like Ansible, Chef, or Puppet.
- Exposure to DevOps methodologies and tools for infrastructure automation.
Salary: $28/hour