Applied Scientist, Prime Video - Content Localization, Understanding & Enrichment

Amazon · Big Tech · Seattle, WA · Applied Science

Applied Scientist role at Amazon Prime Video focusing on building multi-modal machine learning technologies for content localization, understanding, and enrichment. The role involves applying state-of-the-art NLP and computer vision research, particularly with vision-language models and multimodal LLMs, to understand video content, including long-context understanding and causal reasoning across temporal sequences. The goal is to power next-generation video understanding and search capabilities for Prime Video.

What you'd actually do

  1. Our team builds multi-modal machine learning technologies to enrich and understand video content.
  2. We aim to understand not only the individual components of the content but also how they relate to one another, providing a holistic, broader contextual understanding.
  3. This powers the next generation of video understanding and search capabilities for Prime Video.

Skills

Required

  • experience building models for business applications
  • experience in CS, CE, ML, or a related field
  • programming in Java, C++, Python or related language
  • developing and implementing deep learning algorithms
  • computer vision algorithms

Nice to have

  • Unix/Linux
  • professional software development
  • publications at top-tier peer-reviewed conferences or journals

What the JD emphasized

  • state of the art natural language processing and computer vision research
  • vision-language models/multimodal LLMs
  • long-form content understanding
  • long-context understanding
  • causal reasoning across extended temporal sequences
  • experience building models for business applications
  • deep learning algorithms, particularly with respect to computer vision algorithms

Other signals

  • multi-modal machine learning technologies
  • long-form content understanding
  • vision-language models
  • multimodal LLMs