Software Engineer L4/L5 - Data and Feature Infrastructure, Machine Learning Platform

Netflix · Big Tech · United States · Remote · Data & Insights

Netflix is seeking a Software Engineer to build and scale a next-generation ML data and feature platform. This role will focus on creating infrastructure for defining, computing, storing, and serving ML features and labels, enabling ML practitioners to improve productivity and foster innovation across various domains like personalization, payments, and ads. The platform will support both high-throughput training and low-latency inference use cases, and will include a centralized feature and embedding store that can be shared across ML domains.

What you'd actually do

  1. Design and build a near-real-time feature computation engine to generate ML features for both high-throughput training and low-latency inference applications.
  2. Operate and manage the feature computation pipelines and feature serving infrastructure for various ML models across multiple ML domains.
  3. Build and scale systems that accelerate training through performant data loading, transformation, and writing.
  4. Create frameworks to streamline and expedite the availability of new data for training and serving.
  5. Develop feature stores that enable feature discovery and sharing.

Skills

Required

  • Experience in building ML or data infrastructure
  • Strong empathy and passion for providing a fantastic user experience to ML practitioners
  • Experience in building and operating 24/7 high-traffic and low-latency online applications
  • Experience with large-scale data processing frameworks such as Spark, Flink, and Kafka
  • Experience in working with and optimizing Scala and/or Python codebases
  • Experience with public clouds, especially AWS
  • Self-driven and highly motivated team player

Nice to have

  • Experience in building and operating ML feature stores
  • Experience with Functional Programming
  • Experience working with Notebooks such as Jupyter or Polynote

What the JD emphasized

  • building ML or data infrastructure
  • building and operating 24/7 high-traffic and low-latency online applications
  • building and operating ML feature stores

Other signals

  • building a next-generation ML data and feature platform
  • power ML models across various domains
  • centralized feature and embedding store
  • enable sharing across various ML domains
  • unlocking access to these shared datasets will foster innovation