What are the responsibilities and job description for the Sr Software Engineer Storage, Search, & Data Platforms position at Uber?
About The Role
As a Senior Software Engineer in the Storage, Search, and Data (SSD) group, you will be at the heart of Uber's transition to a Cloud-Native Data Platform. We are moving away from traditional data processing toward a unified, elastic fabric that powers everything from exabyte-scale analytics to the "Agentic" AI that drives Uber's future.
In this role, you will take ownership of business-critical systems, whether that's scaling our Distributed MySQL footprint, optimizing Hudi-based Data Lakes, or building the storage layer. You are a "Full-Stack Infrastructure" engineer: someone who can write high-performance code, design resilient distributed systems, and ensure operational excellence for Tier-0 services that handle millions of concurrent trips.
What You Will Do
- Own & Execute: Lead the design and implementation of major features for Uber's storage and data platforms (e.g., Docstore, Pinot, or OpenSearch).
- Cloud-Native Modernization: Build and optimize services that leverage GCP and OCI Object Storage, focusing on high-throughput metadata management and S3-compatible API support.
- Storage Optimization: Drive efficiency across our HDFS and Blobstore layers, using table formats like Apache Hudi or Iceberg to improve data freshness and reduce cost.
- AI/ML Integration: Work with AI teams to design high-performance data pipelines, ensuring our storage layers can handle the intense IO demands of GPU-based model training.
- Operational Leadership: Ensure 99.99% availability for your services. You will lead root-cause analyses (RCAs), improve observability, and mentor L3/L4 engineers on best practices for distributed systems.
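For context, the 99.99% availability target above implies a very tight error budget. A quick back-of-the-envelope sketch (illustrative only, not part of the posting):

```python
# Error-budget arithmetic for a "four nines" (99.99%) availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_budget_minutes(availability: float) -> float:
    """Allowed downtime per year, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability)

# 99.99% availability leaves roughly 52.6 minutes of downtime per year,
# which is why RCAs and observability feature so prominently in the role.
budget = downtime_budget_minutes(0.9999)
```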
What You Will Need
- 6 Years of Engineering Experience: Proven track record of building and maintaining large-scale distributed systems.
- Deep Storage Knowledge: Practical, hands-on experience with:
  - Relational & NoSQL: Distributed MySQL, Cassandra, or Redis.
  - Batch & Object: HDFS, S3/GCS, and metadata services.
- Distributed Systems: If you've worked on systems like Google Spanner or TiDB, you'll be a great fit for our Transactional Storage (Docstore) team.
- Coding Mastery: Expert-level proficiency in Java, Go, or C, with a strong focus on concurrency, memory management, and performance tuning.
- Query Engines: Experience with large-scale analytical engines like Presto, Hive, or Trino.
- Lakehouse Innovation: Experience with Apache Hudi, Iceberg, or Delta Lake for optimizing "Big Data" storage.
- Cloud Infrastructure: Deep familiarity with OCI or GCP and strategies for resource efficiency (the "E40" initiative).
- AI/ML Awareness: Understanding of how data storage interacts with ML frameworks like Ray or PyTorch.
- Open Source Contribution: Active participation in community projects like Apache Pinot, Kafka, or Flink.
- Academic-Grade Engineering: Ability to apply research-level concepts (partnering with CMU, Berkeley, or MIT) to solve real-world distributed consensus or indexing challenges.
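To make the "distributed systems" qualifications concrete, consider consistent hashing, a data-placement primitive used by Cassandra-style stores like several of the systems named above. A minimal toy ring (all node and key names are hypothetical; this is an illustration, not Uber code):

```python
import bisect
import hashlib

def _h(key: str) -> int:
    """Hash a string key onto the ring's integer space."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Toy consistent-hashing ring with virtual nodes for smoother balance."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node gets `vnodes` positions on the ring.
        self._ring = sorted((_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._keys = [h for h, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's hash to the first virtual node."""
        i = bisect.bisect(self._keys, _h(key)) % len(self._ring)
        return self._ring[i][1]

ring = Ring(["db-a", "db-b", "db-c"])
owner = ring.node_for("trip:12345")  # deterministic placement on one node
```

Because placement depends only on the hash, adding or removing a node remaps only the keys adjacent to its ring positions, which is the property that makes elastic scaling of a storage fleet tractable.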
Salary: $202,000 - $224,000