What are the responsibilities and job description for the Architect, Site Reliability Engineering position at Vertafore Career Center?
$185,000 - $225,000 / year, plus bonus
Vertafore is a leading technology company whose innovative software solutions are advancing the insurance industry. Our suite of products provides solutions to our customers that help them better manage their business, boost their productivity and efficiencies, and lower costs while strengthening relationships.
Our mission is to move InsurTech forward by putting people at the heart of the industry. We are leading the way with product innovation, technology partnerships, and a focus on customer success.
Our fast-paced and collaborative environment inspires us to create, think, and challenge each other in ways that make our solutions and our teams better.
We are headquartered in Denver, Colorado, with offices across the U.S., Canada, and India.
The SRE Architect is responsible for the technical vision and long-term architectural strategy for reliability across Vertafore’s global product portfolio. You will design the cross-cutting systems, automation frameworks, and observability standards that allow our engineering teams to scale without a linear increase in operational overhead. This role bridges the gap between high-level business strategy and deep technical execution, ensuring that "Reliability by Design" is baked into every layer of our AWS and hybrid infrastructure.
Key Responsibilities
Strategic Reliability Architecture
- Design for Resilience: Lead the architectural review of new and existing services to ensure they are built for high availability, fault tolerance, and global scalability.
- Platform Standardization: Define the "Golden Paths" for infrastructure and deployment, ensuring that teams use standardized, pre-approved patterns for the Vertafore tech stack.
- The Four Golden Signals: Architect the global observability strategy, ensuring every product family has automated, consistent telemetry for Latency, Traffic, Errors, and Saturation.
SRE Frameworks & Governance
- Global SLO Framework: Design and oversee the organization-wide implementation of SLIs, SLOs, and Error Budgets.
- Error Budget Advocacy: Act as the ultimate technical arbiter for Error Budget policies, ensuring they are used as a mathematical contract to balance feature velocity and system stability.
- Toil Elimination at Scale: Identify systemic sources of Toil across the enterprise and architect software solutions to eliminate them globally, while maintaining a 50% ratio of Ops to Project work.
Advanced Automation & Self-Healing
- Autonomous Infrastructure: Lead the strategy for Infrastructure-as-Code (Terraform, CDK), AI tooling and technologies, and configuration management, moving the organization toward a fully immutable infrastructure model.
- Self-Healing Design: Architect and implement advanced self-healing and auto-remediation frameworks to reduce the need for manual incident intervention.
Technical Advocacy & Culture
- Blameless Culture Leadership: Set the standard for Blameless Postmortems and lead the analysis of the most complex, cross-functional system failures. Occasionally participate in High-Priority incidents to guide teams toward timely resolution.
- Mentorship: Mentor Tech Leads and Senior SREs, fostering a culture where operations are treated as a software engineering discipline.
- Cross-departmental Innovation and Collaboration: Collaborate with groups such as Product Development, Architecture, and Product Owners to align reliability goals with the business roadmap and to innovate in product software and infrastructure design.