What are the responsibilities and job description for the Infrastructure Engineer position at Venture Up?
Infrastructure Engineer – San Francisco - $185K - $235K + Equity
*Visa sponsorship not available, US Citizens only. Please do not apply if you are seeking sponsorship*
An Infrastructure Engineer is required to join a startup that is building training grounds for AI agents. Today, most AI models learn from huge amounts of human-created data. The company's idea is that AI should also learn by using computers and gaining experience, similar to how humans learn by doing. The company builds reinforcement learning (RL) environments specifically for computer use. The team consists of six people, primarily Princeton graduates, and is based in San Francisco. They raised a seed round under a year ago and are focused on growing the team.
What We're Looking For
Experience
- Computer Science degree from a top-30 program
- 5 - 12 years of experience operating production cloud infrastructure end-to-end
- Experience building production cloud infrastructure from scratch (GCP/AWS)
- Experience managing Kubernetes or container orchestration in production, and designing CI/CD pipelines and PostgreSQL schemas from scratch
- Strong Terraform or equivalent IaC experience at significant scale, plus proficiency with GCP or AWS production cloud infrastructure
- Container orchestration & delivery — Kubernetes, CI/CD pipelines, Dockerfile optimization, registry workflows
- Data, networking & operations — PostgreSQL, VPCs/DNS/TLS, IAM/secrets management, observability (Prometheus/Datadog/etc.)
- Experience at an early-stage or high-growth startup
What you'll do:
- Design and build production cloud infrastructure from the ground up — including networking, IAM, secrets management, and container orchestration
- Take ownership of Terraform modules and IaC workflows end-to-end, reasoning carefully about blast radius and failure modes
- Work closely with a small, highly technical founding team (reporting directly to the CTO) to architect infrastructure that supports rapidly scaling RL environments for agentic AI
- Build and maintain CI/CD pipelines and release processes that ship software reliably and stop broken changes before production
- Stand up and manage PostgreSQL databases from scratch — schema design, migration strategy, and production-scale operations
- Design and implement observability stacks (logs, metrics, traces) so the team can diagnose and resolve incidents quickly
- Future-proof the platform infrastructure as AI capabilities evolve on a 3-month cadence, building for more than today's demand
Tech stack
GCP, Terraform, Kubernetes, GKE, Docker, PostgreSQL, Cloud SQL, CI/CD, Prometheus, Grafana, Datadog, OpenTelemetry, Alembic, Atlantis, Cloud Build, Artifact Registry, Auth0, OIDC, AWS
Salary: $185,000 - $235,000