Staff Software Engineer, Big Data Storage

Pinterest · Consumer · Palo Alto, CA · Data Engineering

Staff Software Engineer role building and optimizing Pinterest's exabyte-scale big data storage platform, with a focus on Apache Iceberg and on powering ML/AI innovations. The role involves technical leadership, designing scalable storage systems, and collaborating with data, ML/AI, and infrastructure teams.

What you'd actually do

  1. Design, implement, and optimize Pinterest’s exabyte-scale data lake storage platform.
  2. Lead complex technical projects and initiatives for data lake storage and metadata management, driving them from architecture through execution.
  3. Collaborate with stakeholders and partner teams across the organization to design storage and metadata layer technologies that unlock big data and ML/AI innovations.
  4. Build storage capabilities that efficiently support large-scale ML/AI workloads, including high-throughput data access, schema evolution, and large-scale column backfills.
  5. Shape the long-term technical direction for scalable, reliable, and efficient big data storage systems.

Skills

Required

  • 8+ years of relevant industry experience designing and building large-scale production distributed systems.
  • Strong experience designing and maintaining scalable storage, metadata, or data lake infrastructure.
  • Experience building storage capabilities for large-scale ML/AI or analytics workloads, including high-throughput data access, schema evolution, and large-scale column backfills.
  • Deep knowledge of building distributed systems, data storage systems, and production infrastructure.
  • Experience with big data technologies such as Apache Iceberg, Spark, Flink, Presto/Trino, Hive, or similar systems.
  • Proficiency in programming languages like Java, Scala, or Python.
  • Proven ability to lead complex technical initiatives and influence architecture across teams.
  • Strong collaboration, communication, and problem-solving skills, with a drive for technical excellence and innovation.
  • Bachelor’s degree in a relevant field such as Computer Science, or equivalent experience.

What the JD emphasized

  • exabyte-scale
  • large-scale ML/AI workloads
  • large-scale column backfills