What are the responsibilities and job description for the Staff Site Reliability Engineer position at Recruiting From Scratch?
Company Stage: Public (Digital-first, high-growth)
Office Type: Remote-first with periodic onsite collaboration
Salary Range: $163,600–$226,400 + bonus + equity
About the Company
Upstart is the leading AI lending marketplace partnering with banks and credit unions to expand access to affordable credit. Their AI-driven platform enables higher approval rates with lower loss rates across demographic groups, creating a modern, digital-first borrowing experience. Over 80% of borrowers are instantly approved with zero documentation.
A digital-first company, Upstart operates remotely across the U.S., with hub offices in San Mateo, Columbus, and Austin. Most teammates join because they believe in the mission: enabling effortless credit based on true risk.
About the Team
Upstart’s Site Reliability Engineering (SRE) team owns reliability, resiliency, and observability across all production systems. The team builds tooling, automation, and platforms that keep infrastructure healthy, accelerate engineering velocity, and support world-class uptime for millions of customers.
SRE at Upstart operates like a product function:
- Conduct customer interviews with internal engineering teams
- Use data to drive decisions
- Focus on eliminating toil and improving developer experience
- Influence company-wide infrastructure and reliability strategy
This role sits in the Site Reliability Tooling group, which builds the internal tools, observability frameworks, deployment automation, and reliability standards that power Upstart engineering.
What You’ll Do
- Champion and embody modern SRE principles across Upstart.
- Design and implement reliability tooling that improves visibility, reduces operational burden, and increases system uptime.
- Establish standards for monitoring microservices, mobile apps, ML platforms, databases, and Kubernetes clusters.
- Improve incident response and foster a culture of ownership, learning, and high-quality operations.
- Build internal tools and automation from scratch, leveraging strong software engineering fundamentals.
- Reduce toil through intelligent automation and elimination of manual processes.
- Drive architectural discussions, system design, and long-term reliability strategies.
- Collaborate with engineering teams to ensure reliability practices are easy to adopt and deliver measurable impact.
- Treat production issues and support requests as requirements gathering opportunities for more intuitive, automated tooling.
What We’re Looking For
- 6 years of combined experience across Software Engineering, SRE, or DevOps roles.
- Strong proficiency coding in Python, Go, or TypeScript/JavaScript.
- Experience designing and building internal tools from scratch.
- Expertise operating cloud-native, microservice environments in production.
- Strong knowledge of IaC tools (Terraform, CDK, CloudFormation).
- Hands-on Kubernetes experience (architecture, operations, debugging).
- Experience operating complex observability stacks (Datadog, Prometheus, etc.).
- Exposure to CI/CD tooling and TDD/agile engineering practices.
- Experience participating in on-call rotations and incident management.
- Excellent system architecture and design skills.
- Strong data-driven mindset (metrics, dashboards, alerting optimization).
- Full-stack software development experience.
- Experience building high-volume, high-uptime CI/CD systems.
- Linux systems engineering and automation expertise.
- Experience with ArgoCD, Artifactory, Backstage, or similar tooling.
- Exposure to service mesh architectures.
- Experience applying LLMs or GenAI to reliability engineering problems.
Why You’ll Love It
- Competitive compensation (base + bonus + equity)
- 100% matched 401(k) contributions up to $4,500
- Comprehensive medical, dental, and vision coverage, plus HSA contributions
- Employee Stock Purchase Plan (ESPP)
- Generous vacation, sick leave, and holidays
- Paid parental, family care, and military leave
- Quarterly onsite team collaboration (travel covered)
- Wellness, tech, and ergonomic reimbursements
- Strong culture of learning, ownership, and engineering excellence
Salary: $163,600 - $226,400