What are the responsibilities and job description for the Sr. Site Reliability Engineer position at Sriven Systems Inc.?
Job Title: Sr. Site Reliability Engineer
Location: Dallas, TX or Denver, CO or San Francisco, CA
Onsite: 3 days/week
Travel: Max 10%
Job Description
As a Site Reliability Engineer on the Core Services team, you will play a key role in ensuring the reliability, scalability, and performance of our client's backbone network, including the hardware, software, and toolset used to configure and monitor the environment, while adhering to DevOps best practices.
Responsibilities
- Ensure reliability, scalability, and performance of our application and networking platforms, APIs, and integrations.
- Collaborate with the development team to design resilient hosting architectures and streamline CI/CD pipelines.
- Implement observability and monitoring best practices (transaction tracing, logging, metrics, and alerting) for applications and network devices.
- Troubleshoot incidents (VM/hosting issues, API failures, authentication, monitoring/alerting failures).
- Promote DevOps culture to improve release velocity and reduce failures.
Minimum Qualifications
- 3 years of experience in Site Reliability Engineering, DevOps, Software Engineering, or a related field.
- Proficiency in scripting and programming (Python, Bash, Go, Java, or similar).
- Hands-on experience with CI/CD tools (Jenkins, GitHub Actions, GitLab, etc.).
- Hands-on experience with monitoring and logging tools (e.g., Splunk, AppDynamics, Datadog, Prometheus, Grafana).
Preferred Qualifications
- Proficiency in Linux and networking concepts (TLS/SSL, DNS, load balancers, etc.), with troubleshooting skills in large-scale environments.
- Hands-on experience with cloud platforms (AWS, Google Cloud Platform) and container technologies (Docker, Kubernetes).
- Experience designing, implementing, and maintaining CI/CD pipelines for Teamcenter environments using tools such as Deployment Center, Jenkins, or similar.
- Hands-on experience with one or more relational or NoSQL databases (e.g., Oracle, MongoDB).
- Exposure to event-driven architectures and messaging systems (Kafka, RabbitMQ).
- Experience with security and authentication (OAuth, SAML, SSO).
- Bachelor's or Master's degree in Computer Science, or equivalent years of work experience.