Machine Learning Scientist (L4/L5) - Multi-modal Algorithms for Games

Netflix · Big Tech · Los Gatos, CA · Data & Insights

Machine Learning Scientist role focused on the research and development of LLMs, VLMs, and multi-modal foundation models for games, with a strong emphasis on inference efficiency, model optimization (distillation, pruning), and generative visuals. The role involves fine-tuning, alignment, and integrating models for real-time interaction at acceptable cost.

What you'd actually do

  1. Model Adaptation & Alignment: Design and own the fine-tuning and alignment of LLMs and VLMs in PyTorch, leveraging modern preference learning and reinforcement learning to enhance reasoning, tool-use, and agentic workflows for interactive game systems.
  2. Algorithmic Model Optimization: Lead efforts in model compression—specifically knowledge distillation, structural pruning, and architectural refinement—to create efficient variants of large models that meet strict latency, cost, and quality constraints.
  3. Generative Visuals & Diffusion: Develop and optimize diffusion-based models for image, video, and 3D generation, including distillation and efficiency techniques for viable game-time performance.
  4. Pragmatic Model Integration: Strategically evaluate and integrate SOTA open-source and commercial models while building internal "layers," adapters, and enhancements to fill gaps in creative control.
  5. Multi-modal Interaction: Optimize and integrate audio (ASR/TTS), language, and vision models to enable low-latency, cross-modal reasoning and interaction.
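
To ground the kind of model-compression work items 2 above describes, here is a minimal PyTorch sketch of the classic knowledge-distillation objective (Hinton-style soft-label KL plus hard-label cross-entropy). The function name, temperature, and mixing weight are illustrative assumptions, not anything specified in the JD.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hypothetical KD loss: blend soft teacher targets with hard labels."""
    # Soften both distributions with temperature T; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures (Hinton et al.).
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Standard supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In practice the student would be a smaller, latency-friendly variant of the teacher, and this loss would be one term among pruning- and quantization-aware objectives.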

Skills

Required

  • Python
  • PyTorch
  • LLMs
  • VLMs
  • Transformers
  • Diffusion architectures
  • knowledge distillation
  • quantization-aware training
  • pruning
  • data cleaning
  • data curation
  • synthetic data generation
  • commercial API evaluation
  • OSS model integration

Nice to have

  • heterogeneous hardware optimization (Mobile, Cloud GPU, edge devices)
  • audio-visual multimodal models
  • video generation

What the JD emphasized

  • inference efficiency
  • cost
  • latency
  • quality
  • algorithmic optimization
  • model compression
  • knowledge distillation
  • structural pruning
  • architectural refinement
  • Diffusion-based models
  • game-time performance
  • low-latency

Other signals

  • LLMs
  • VLMs
  • multi-modal
  • inference efficiency
  • model optimization