Data Engineer - Gen AI - Music

Spotify · Consumer · New York, NY · Music

Data Engineer for Spotify's Artist-First AI Music lab, focusing on building and maintaining large-scale data pipelines, including ML pipelines, with data processing frameworks like Scio and Python-based tools on Google Cloud Platform, in support of generative music products.

What you'd actually do

  1. Build and maintain large-scale data pipelines, including ML pipelines, with data processing frameworks like Scio and Python-based tools on Google Cloud Platform.
  2. Leverage data engineering best practices in continuous integration and delivery.
  3. Help drive optimization, testing and tooling to improve data quality and reliability.
  4. Collaborate with engineers, product managers, subject matter experts, and stakeholders while taking on learning and leadership opportunities that arise every day.
  5. Work in cross-functional, agile teams to continuously experiment, iterate, and deliver on new product objectives.

Skills

Required

  • 3+ years of professional experience in a product-driven environment
  • Experience with high-volume, heterogeneous data using distributed systems and big data technologies (Python, Scala, Scio, Ray, Apache Spark)
  • Proficient in designing and building distributed data pipelines (Python, Scala, Java, Scio, Dataflow)
  • Understanding of data modeling, data access, and data storage techniques for batch and analytical processing (BigQuery)
  • Experience with continuous integration and delivery
  • Experience with Google Cloud Platform

Nice to have

  • Eagerness to take on learning and leadership opportunities
  • Experience working in cross-functional, agile teams
  • Creative problem solver
  • Passionate about building outstanding products
  • Enthusiastic about turning research ideas into products at scale

Other signals

  • Generative AI products for music
  • Large-scale data pipelines, including ML pipelines
  • Data processing frameworks like Scio and Python-based tools on Google Cloud Platform