Backend Engineer, Data

Stripe · Fintech · Canada · 8122 Data Foundations

This Backend Engineer, Data role at Stripe focuses on building and maintaining data pipelines, models, and products. The role involves using LLMs and agents to produce data, refining data marts for forecasting, and building data services for product metrics. It requires a strong software engineering and data background, experience with distributed data frameworks such as Spark, and knowledge of backend languages and SQL.

What you'd actually do

  1. Design, develop, and own data pipelines, models, and products that power the Product, Data Science, and GTM functions
  2. Develop strong subject matter expertise and manage the SLAs for both data pipelines and full-stack web applications that support these critical stakeholders
  3. Build and refine Stripe's data foundations (infrastructure, pipelines, and tools that enable various teams at Stripe), working with Scala, Spark, and Airflow
  4. Leverage LLMs and agents at scale to produce high-quality data on ambiguous problems
  5. Refine our existing data marts that help the GTM organization forecast the future potential performance of the business and reliably measure ongoing attainment toward targets

Skills

Required

  • Software Engineering
  • Data Engineering
  • Data Pipelines
  • Data Modeling
  • Distributed Data Frameworks (Spark, Hadoop, Pig)
  • Scala
  • Spark
  • Airflow
  • SQL
  • Backend Development Languages (Scala, Java, Go)

Nice to have

  • Data Marts
  • Product Teams
  • GTM Teams

What the JD emphasized

  • 6+ years of experience
  • strong background in software engineering and data
  • writing and debugging data pipelines using a distributed data framework (Spark, Hadoop, Pig, etc.)
  • inquisitive nature in diving into data inconsistencies to pinpoint issues and resolve deep-rooted data quality issues
  • backend development language (such as Scala, Java, or Go) and strong SQL experience
  • communicate cross-functionally, derive requirements and architect shared datasets

Other signals

  • Leverage LLMs and agents at scale to produce high-quality data on ambiguous problems
  • Refine our existing data marts that help the GTM organization forecast the future potential performance of the business and reliably measure ongoing attainment toward targets
  • Build data services that track key product metrics and measure the impact of different strategies employed by teams in the field