Senior Platform & Reliability Engineer (SRE)

Vizcom
San Francisco, CA (Full Time)
POSTED ON 6/5/2026
AVAILABLE BEFORE 7/13/2026
Agency Notice: We are not currently working with recruiting agencies for this role. Please do not contact Vizcom employees regarding this position. Any resumes submitted without a prior agreement will be considered unsolicited.

About Vizcom

Vizcom is a visual creation platform that combines modern web tooling with AI-powered workflows. Our stack includes a React/TypeScript frontend, Node/Koa PostGraphile API services, PostgreSQL, Redis, BullMQ queues, and Kubernetes-based production infrastructure.

We’re hiring a senior owner of stability and infrastructure to ensure the platform is reliable, fast, and resilient as we scale.

Role Mission

Own service reliability end-to-end: prevent incidents, reduce blast radius when failures happen, and lead fast, high-quality recovery when production degrades.

This is a hands-on technical leadership role with authority to set reliability standards and enforce production guardrails.

Compensation

$200,000 – $250,000 base salary, plus meaningful equity

What You’ll Own

  • Reliability bar: Set and enforce SLIs/SLOs/error budgets for critical user flows.
  • Production architecture resilience: Drive failure isolation across API, workers, queues, and dependencies so one subsystem cannot take down core access.
  • Kubernetes runtime reliability: Define probe contracts, rollout/rollback standards, graceful shutdown behavior, scaling/resource policies, and startup safety.
  • Queue job safety (BullMQ/Redis): Own poison pill containment and workload isolation.
  • Incident command quality: Lead Sev1/Sev2 response end-to-end (containment, communications, technical direction, RCA, corrective action execution).
  • Reliability operating system: Own observability quality (signals over noise), on-call effectiveness, runbooks, and postmortem discipline.
  • Release safety authority: Gate risky deploys and enforce reliability guardrails when production health is at risk.

Traits We’re Looking For

  • Calm, structured incident commander under pressure.
  • Thinks in failure modes and blast radius by default.
  • Pragmatic: can stabilize quickly, then implement durable fixes.
  • High ownership and strong written communication.

First 90 Days

  • Establish baseline reliability metrics and identify top platform risks.
  • Tighten incident response mechanics (roles, comms cadence, runbooks, status updates).
  • Deliver high-impact hardening fixes across probes/startup paths/queue safety.
  • Publish a prioritized 6–12 month reliability roadmap with clear ownership and milestones.

If possible, please describe one incident you personally led and send it to Jordan@vizcom.com, covering:

  • what failed,
  • how you contained it, and
  • what permanent fixes you shipped and measured.
