Staff Software Engineer: Storage, Search, & Data Platforms

Uber · Consumer · Seattle, WA +2 · Engineering

Staff Engineer role focused on architecting and evolving Uber's data platforms to support a "Data-to-AI" future. This involves defining the technical vision for cloud-native data ecosystems, designing "Data-to-GPU" pipelines, and optimizing storage APIs for AI researchers. The role requires deep distributed systems expertise and leadership in open-source data technologies.

What you'd actually do

  1. Define and execute the multi-year roadmap to transition Uber from "Data Storage" to a Cloud-Native Data Provider, solving for cross-region latency, global metadata consistency, and exabyte-scale cost efficiency.
  2. Partner with Uber’s AI/ML leadership to architect the "Data-to-GPU" pipeline.
  3. Design the one-stop storage APIs that allow researchers to leverage high-performance data access across multi-cloud regions and vendors seamlessly.
  4. Drive the next generation of our core engines: Docstore (NoSQL), Vitess (Sharded MySQL), Apache Pinot (Real-time Analytics), and OpenSearch (Discovery).
  5. Represent Uber in the global community as a leader in key open source technologies, including Apache Hudi, Iceberg, and many others.

Skills

Required

  • 12+ years of software engineering experience
  • Designing and operating massive-scale distributed data systems
  • Elite engineering skills in Go, Java, C++, or Rust
  • Deep-diving into database internals
  • Kernel-level optimizations
  • Complex distributed consensus protocols
  • Leading technical strategy across multiple teams or organizations
  • Managing Tier-0, mission-critical systems with 99.99% availability and global blast-radius constraints

Nice to have

  • MS / PhD in Computer Science (or equivalent experience) with a focus on Distributed Systems, Database Internals, or Large-Scale AI Infra
  • Maintainer or PMC member of an industry-defining project (e.g., Apache Pinot, Ray, Iceberg, Lance, or Gravitino)
  • Deep understanding of modern AI hardware-software interfaces, including GPU memory management, high-bandwidth networking (RDMA/NCCL), and training data pipelines
  • Extensive experience leveraging S3/GCS/OCI to build disaggregated storage-compute architectures
  • Track record of growing all levels of engineering talent, fostering a culture of technical excellence and scale

What the JD emphasized

  • massive-scale distributed data systems
  • Elite engineering skills in Go, Java, C++, or Rust
  • complex distributed consensus protocols
  • managing Tier-0, mission-critical systems with 99.99% availability and global blast-radius constraints
  • Deep understanding of modern AI hardware-software interfaces, including GPU memory management, high-bandwidth networking (RDMA/NCCL), and training data pipelines

Other signals

  • architect the "Data-to-GPU" pipeline
  • design the one-stop storage APIs that allow researchers to leverage high-performance data access across multi-cloud regions and vendors seamlessly