What are the responsibilities and job description for the Platform Engineer position at Shields Group Search?

Site Reliability Engineer / Platform Engineer

Location: San Francisco, CA (on-site)

Compensation: $200,000 to $250,000 base salary, plus bonus and equity

Overview

Shields Group Search is partnering with a fast-growing, Series A AI infrastructure company building the connective layer between AI agents and the tools people use every day, including GitHub, Gmail, Notion, Salesforce, and more.

The company is building core infrastructure that allows agents to safely and reliably communicate with external tools, execute workflows, manage authentication, run code, trigger actions, and operate across real-world software environments.

They recently raised a $25M Series A from top-tier investors and have seen rapid revenue growth, with customers ranging from early AI-native startups to major technology companies.

This role is for a hands-on Site Reliability Engineer / Platform Engineer who can help scale, harden, and own the company’s infrastructure as usage grows. The team is looking for someone with real production experience managing cloud infrastructure, reliability, observability, deployment systems, and high-availability backend services.

This is an individual contributor role. Management experience is not required.

The ideal candidate has hands-on experience across SRE, DevOps, backend engineering, infrastructure engineering, cloud platforms, distributed systems, and performance optimization. They should be comfortable owning infrastructure in a fast-moving startup environment and should be able to show that they build, experiment, and go deep outside of assigned work.

What You’ll Do

Own reliability, scalability, observability, and performance across core production infrastructure
Manage and improve infrastructure across cloud platforms such as AWS, Vercel, and related systems
Build and improve the platform infrastructure supporting AI agent workflows, tool execution, authentication, triggers, APIs, sandboxes, and runtime orchestration
Design and operate reliable backend systems that interact with many third-party tools and APIs
Improve infrastructure supporting high-throughput, distributed, cloud-native services
Work across cloud infrastructure, Linux systems, containers, deployment pipelines, service orchestration, CI/CD, and observability tooling
Build automation that reduces operational burden and improves incident response
Develop internal productivity tooling, runbooks, monitoring, alerting, dashboards, and reliability workflows
Debug complex production issues across application, infrastructure, network, database, deployment, and runtime layers
Improve system performance through tracing, CPU and heap profiling, database query optimization, workflow optimization, and deep root-cause analysis
Help manage and improve multiple execution environments, including serverless runtimes, sandboxed code execution, and related backend systems
Partner closely with product engineers and customers to support important workloads and improve the platform in the process
Write clear documentation that explains complex systems, operational patterns, and infrastructure decisions
Help define the reliability culture, infrastructure standards, and technical bar for a small, high-craft engineering team

What They’re Looking For

4 years of software engineering, site reliability engineering, infrastructure engineering, DevOps, platform engineering, or distributed systems experience preferred, but not a hard requirement for exceptional candidates
Hands-on experience managing production infrastructure across cloud environments
Experience with AWS, Vercel, Kubernetes, Linux, containers, deployment systems, observability tools, or similar infrastructure
Strong backend engineering fundamentals and ability to write production-quality code
Experience with monitoring, tracing, logging, alerting, incident response, and system performance
Experience scaling and operating distributed systems, microservices, APIs, databases, queues, or high-throughput backend services
Ability to debug hard production issues across many layers of the stack
Strong systems thinking and ability to understand how infrastructure, application code, databases, deployments, and customer-facing workflows interact
Ability to build automation and tooling that makes engineering teams faster and more reliable
Clear written communication and ability to explain technical decisions simply
High ownership, high urgency, and comfort operating without a playbook
Strong engineering taste: simple systems, clean abstractions, pragmatic architecture, and reliable execution

Strong Signals

Experience managing infrastructure for a fast-growing startup or high-scale technical product
Experience with AWS, Vercel, Kubernetes, Docker, Terraform, CI/CD, observability, deployment automation, or cloud-native infrastructure
Experience building or operating developer infrastructure, API platforms, automation platforms, workflow engines, internal tooling, or AI infrastructure
Experience with performance engineering, capacity planning, service decoupling, cloud migrations, or reliability improvement initiatives
Strong side projects, open-source work, technical writing, infrastructure tools, hardware experiments, embedded systems projects, or other evidence of building outside of work
Startup experience or experience in fast-moving, ambiguous technical environments
Evidence of being internet-native, builder-minded, and deeply curious about how systems work
Ability to move between SRE, platform, backend, infrastructure, and product engineering work without being overly rigid about title or scope

You May Be a Fit If

You have owned or managed production infrastructure directly
You are comfortable debugging cloud systems, backend services, databases, deployments, and networking issues
You can write code and do not see SRE as separate from engineering
You care about uptime, performance, observability, developer experience, and clean operational workflows
You build tools to eliminate repetitive operational work
You like fast-moving startup environments with high ownership
You have side projects, open-source contributions, technical writing, or other artifacts that show how you think
You are comfortable being the person who figures things out when there is no playbook
You want to work with a small team of intense, high-craft engineers building at the frontier of AI infrastructure
