Staff Software Engineering, YouTube ML Efficiency

Google · Big Tech · San Bruno, CA +1

Staff Software Engineer focused on ML efficiency for YouTube's recommendation systems, working on optimizing models for next-gen TPUs, enabling new architectures and training procedures, and reducing complexity in the ML training and serving ecosystem through automation.

What you'd actually do

  1. Monitor the evolving landscape of recommendation systems, actively prototyping and benchmarking emerging modeling techniques to keep our infrastructure cutting-edge and efficient.
  2. Enable next-generation model architectures and training procedures.
  3. Scale experimentation capacity within our resource envelope.
  4. Reduce complexity and fragmentation in the ML training and serving ecosystem by providing standardized, composable, and reusable solutions.
  5. Reduce experimenter toil by introducing automation frameworks for training, evaluation, and model serving.

Skills

Required

  • software development
  • ML design
  • optimizing ML infrastructure
  • model deployment
  • model evaluation
  • data processing
  • debugging
  • fine-tuning
  • building large-scale recommendation systems
  • Machine Learning (ML)
  • ranking
  • personalization

Nice to have

  • ML models/algorithm design and implementation
  • collaboration
  • problem solving
  • quantitative reasoning
  • communication

What the JD emphasized

  • optimizing ML infrastructure
  • model deployment
  • model evaluation
  • data processing
  • fine-tuning
  • building large-scale recommendation systems
  • Machine Learning (ML)
  • ranking
  • personalization

Other signals

  • improving performance and extracting maximum efficiency for machine learning and AI workloads
  • evolving YouTube's models for next TPU generations
  • prototyping and benchmarking emerging modeling techniques
  • enabling next-generation model architectures and training procedures
  • reducing complexity and fragmentation in the ML training and serving ecosystem
  • automation frameworks for training, evaluation, and model serving