What are the responsibilities and job description for the Senior SRE/DevOps position at Nisum?
What You'll Do:
Production Ownership & Incident Leadership
- Act as a Primary On-Call Engineer in a 24/7 production environment
- Lead response for high-severity incidents (P1/P2), including:
  - Driving incident bridges
  - Coordinating cross-team response
  - Ensuring timely resolution and communication
- Serve as a point of accountability for production stability during assigned shifts
Infrastructure & System Reliability
- Manage and support production infrastructure across cloud and on-prem environments
- Monitor system health and proactively identify risks, bottlenecks, and failure points
- Ensure high availability, performance, and resilience of applications and services
DevOps & Automation
- Design and implement automation to improve reliability and reduce manual intervention
- Improve monitoring, alerting, and observability frameworks
- Drive initiatives to reduce incident frequency and improve recovery times
Stakeholder Engagement & Communication
- Partner directly with engineering leaders, infrastructure teams, and product stakeholders
- Provide clear, concise communication during incidents and escalations
- Translate technical issues into business impact for leadership visibility
- Build trust through reliability, responsiveness, and ownership
Operational Excellence & Continuous Improvement
- Lead or contribute to root cause analysis (RCA) and post-incident reviews
- Identify systemic issues and drive long-term fixes
- Establish and improve operational processes, runbooks, and standards
- Mentor junior engineers and support team development
What You Know:
Technical Expertise
- 7+ years of experience in SRE, DevOps, or Production Engineering roles
- Strong experience in DevOps, Site Reliability Engineering (SRE), or production support environments
- Experience with monitoring and incident management tools such as:
  - ServiceNow
  - Dynatrace
  - Grafana
  - Datadog, Observe, Splunk, or similar platforms
  - PagerDuty
- Solid hands-on experience with:
  - Cloud platforms (AWS, Azure, or Google Cloud Platform)
  - Kubernetes
  - Linux/Unix and Windows systems
  - Networking fundamentals
- Scripting experience using Python, Bash, or similar languages
- Programming experience in Java and/or .NET
Leadership & Behavioural Skills
- Strong ownership mindset and accountability
- Ability to lead under pressure and manage high-severity incidents
- Excellent communication skills, especially with non-technical stakeholders
- Comfortable operating in a high-visibility, leadership-facing environment
- Strong problem-solving and decision-making abilities
Work Model
- Remote work
- Participation in a rotating on-call schedule, including weekends and holidays as needed
- Ability to respond to critical incidents outside of standard working hours when required
What Success Looks Like
- High-severity incidents are handled efficiently with strong coordination and communication
- Production systems remain stable, performant, and resilient
- Reduction in recurring incidents through proactive improvements
- Strong trust established with client leadership and stakeholders
- Continuous improvement of operational maturity and reliability practices
Education:
- Bachelor's degree in Computer Science or a related field.
Benefits:
In addition to competitive salaries and benefits packages, Nisum US offers its employees some unique and fun extras:
- Professional Development - We offer in-house technical training and professional learning programs aimed at developing skills across a broad spectrum of topics such as technology, leadership, role-based training, and process expertise. We also offer an annual stipend for employees to attend external courses in order to maintain professional certifications.
- Health & Wellness Benefits - We believe that your health and welfare are important, and we strive to ensure that you have affordable options available to you, including some plans that are subsidized for employees and their families up to 90%. We also have dental and vision plans in the US where Nisum pays 100% of premiums for employees.
- Volunteerism Pay - We believe in giving back, and in the US our employees are eligible for up to 40 hours of paid time off each year to volunteer for the causes that they are most passionate about. This is in addition to personal PTO and paid holidays.
- Additional Benefits - We offer all the other important benefits to keep employees and their families healthy and financially secure, such as 401(k) retirement savings with a company match, pre-tax parking and transit programs, disability insurance, and Basic Life/AD&D, alongside exclusive employee discounts on a wide variety of products and services.
Compensation Band:
$120,000 - $130,000 per annum