Lead Product Software Engineer - ML Data Systems

Disney · Media · New York, NY +2

The Lead Product Software Engineer for ML Data Systems at ESPN builds real-time data systems for a new short-form video recommendation engine. The role involves designing and operating scalable distributed systems, data processing platforms, and foundational capabilities such as feature serving and model inference integration, all of which power personalization and recommendation services. The engineer will partner with ML, Data Science, and Product teams, establish engineering standards, and mentor other engineers.

What you'd actually do

  1. Design, build, and operate highly scalable software systems and services that support content discovery, personalization, and recommendation experiences.
  2. Develop and maintain distributed data processing platforms and service architectures that power both online and offline product workflows.
  3. Build foundational platform capabilities, including feature serving, model inference integration, experimentation infrastructure, and recommendation delivery services.
  4. Design reliable APIs and service interfaces that enable personalization capabilities across multiple ESPN products and surfaces.
  5. Lead architecture and technical design efforts for systems that must operate with high availability, low latency, and large-scale traffic demands.

Skills

Required

  • 7+ years of experience building and maintaining production-grade data pipelines and distributed data processing systems.
  • Strong experience with modern data processing frameworks such as Spark, Flink, Beam, Kafka Streams, or equivalent.
  • Experience designing and implementing real-time streaming data pipelines.
  • Proficiency with SQL and schema design for large-scale analytical datasets.
  • Familiarity with cloud data platforms (e.g., AWS) and modern data infrastructure components (e.g., data lakes, data warehouses, feature stores).
  • Experience supporting ML workflows (model training pipelines, feature engineering, data validation).
  • Strong knowledge of data quality frameworks and best practices, with hands-on experience using Databricks, Snowflake, and Apache Airflow for data pipeline orchestration and validation.
  • Solid software engineering skills with experience in Python, Java, Scala, or similar languages.
  • Strong problem-solving skills and ability to work independently in a fast-paced environment.

Nice to have

  • Prior experience building data infrastructure for personalization, recommendation systems, or other ML-powered products.
  • Familiarity with ML lifecycle tools (MLflow, TFX, Kubeflow) and MLOps best practices.
  • Experience implementing data validation, monitoring, and lineage tools (e.g., dbt tests, Snowflake data quality checks) to ensure high data integrity for ML models.
  • Knowledge of real-time ML serving architectures and online feature generation.
  • Experience optimizing large-scale data workflows for latency-sensitive applications.
  • Prior experience operating in 0→1 product development or startup environments.
  • Experience with tools and technologies such as Databricks, Snowflake, Kafka, AWS SQS, Kubernetes, and related cloud-native data platform components.

What the JD emphasized

  • real-time
  • high availability
  • low latency
  • large-scale traffic demands
  • real-time streaming data pipelines
  • real-time ML serving architectures
  • online feature generation
  • latency-sensitive applications

Other signals

  • real-time short-form video recommendation system
  • next-generation personalization experience
  • scalable distributed systems
  • data-intensive applications
  • foundational systems, APIs, data platforms, and infrastructure that support real-time personalization and recommendation services