Software Engineer, Data Orchestration

Stripe · Fintech · India · 8123 Data Infrastructure

Stripe's Big Data Infrastructure team is seeking a Software Engineer to build and maintain the time-based and event-based orchestration infrastructure that powers batch data pipelines. This role involves designing and building next-generation data platform products, ensuring operational excellence, and collaborating with internal teams to support their initiatives. The role also involves contributing to open-source projects like Apache Airflow and Iceberg.

What you'd actually do

  1. Design, build, and maintain innovative next-generation or first-generation versions of key Data Platform products, with an emphasis on usability, reliability, security, and efficiency.
  2. Design ergonomic APIs and abstractions that create a great customer experience for internal Stripes, which will in turn enhance the experience of millions of Stripe users.
  3. Ensure operational excellence and enable a highly available & reliable Data Orchestration platform across batch workloads.
  4. Collaborate nimbly with high-visibility teams and their stakeholders to support their key initiatives, while building a robust platform that benefits all of Stripe in the long term.
  5. Plan for the growth of Stripe’s infrastructure by unblocking, supporting, and communicating proactively with internal partners to achieve results.

Skills

Required

  • professional experience writing high-quality, production-level code or software programs
  • interest in Data Infrastructure
  • experience operating or enabling large-scale, high-availability data pipelines, from design to execution and safe change management
  • experience developing, maintaining, and debugging distributed systems built with open source tools
  • experience building infrastructure-as-a-product with a strong focus on user needs
  • strong collaboration and communication skills
  • curiosity to continuously learn about new technologies and business processes
  • energized by delivering effective, user-first solutions through creative problem-solving and collaboration

Nice to have

  • Spark
  • Flink
  • Airflow
  • Python
  • Java
  • SQL
  • API design
  • Scala
  • Iceberg
  • Trino
  • Pinot
  • Kafka
  • Hive Metastore
  • S3
  • designing APIs or building developer platforms
  • optimizing the end-to-end performance of distributed systems
  • scaling distributed systems in a rapidly moving environment
  • working with Airflow infrastructure

What the JD emphasized

  • 8+ years of professional experience writing high-quality, production-level code or software programs, and interest in Data Infrastructure
  • Has experience operating or enabling large-scale, high-availability data pipelines, from design to execution and safe change management
  • Has experience developing, maintaining, and debugging distributed systems built with open source tools
  • Has experience building infrastructure-as-a-product with a strong focus on user needs