Principal Applied Scientist, AWS Applied AI Solutions

Amazon · Big Tech · Seattle, WA · Research Science

This role leads technical innovation in visual reasoning foundation models, specifically building a next-generation visual reasoning engine powered by frontier Large Video Models (LVMs). The goal is a system that rivals human understanding of the physical world: one that can interpret natural language, navigate environments, and execute complex tasks. The work sits at the intersection of LVMs, LLMs, and agentic AI, and demands end-to-end ownership from research to production deployment, with a focus on advancing the state of the art and solving real-world business problems.

What you'd actually do

  1. Direct the technical vision for next-gen visual reasoning, pioneering the use of LVMs to solve high-dimensional spatial-temporal problems
  2. Design and implement novel deep learning architectures combining a multitude of modalities, including image, video, and geospatial data
  3. Solve computational challenges to train foundation models at scale, taking advantage of the latest developments in hardware and deep learning libraries
  4. Architect scalable solutions that deliver real-time insights across diverse physical environments
  5. Build agentic AI systems that autonomously execute end-to-end workflows, transforming visual data into actionable business intelligence
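Item 2 above describes combining image, video, and geospatial modalities in one architecture. As a rough illustration of what that can mean in practice, here is a minimal PyTorch sketch that projects per-modality features into a shared embedding space and fuses them with attention. All names, dimensions, and the fusion strategy are hypothetical choices for illustration, not details from the job description.

```python
import torch
import torch.nn as nn

class MultimodalFusion(nn.Module):
    """Illustrative sketch (not the team's actual architecture):
    project each modality into a shared d_model space, then let a
    self-attention layer mix information across modalities."""

    def __init__(self, dims: dict, d_model: int = 256):
        super().__init__()
        # One linear projection per modality, e.g. image/video/geospatial
        self.proj = nn.ModuleDict({m: nn.Linear(d, d_model) for m, d in dims.items()})
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, d_model)

    def forward(self, feats: dict) -> torch.Tensor:
        # feats: modality name -> (batch, dim) feature tensor
        # Stack projected modalities as a short token sequence: (batch, n_modalities, d_model)
        tokens = torch.stack([self.proj[m](x) for m, x in feats.items()], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)  # cross-modality attention
        return self.head(fused.mean(dim=1))           # mean-pool into one joint embedding

# Hypothetical per-modality feature sizes
model = MultimodalFusion(dims={"image": 512, "video": 768, "geo": 128})
batch = {
    "image": torch.randn(2, 512),
    "video": torch.randn(2, 768),
    "geo": torch.randn(2, 128),
}
out = model(batch)  # (2, 256) joint embedding
```

A real LVM pipeline would of course tokenize video spatiotemporally and scale this up considerably; the sketch only shows the shared-space projection and attention-fusion pattern.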

Skills

Required

  • PhD in computer science, machine learning, engineering, or related fields
  • 8+ years of applied research experience
  • Publications at top-tier conferences such as CVPR, ICCV, ECCV, or NeurIPS
  • Deep expertise in architecting and training frontier Vision-Language Models (VLMs) or Large Video Models (LVMs), with proven ability to design novel algorithms that advance the state of the art in spatiotemporal understanding
  • Experience translating research into production systems at scale
  • Excellent programming skills in Python and deep learning frameworks (PyTorch, TensorFlow)

Nice to have

  • Deep expertise in World Models, Neural Radiance Fields (NeRFs/Gaussian Splatting), and long-horizon spatiotemporal reasoning

What the JD emphasized

  • Publications at top-tier conferences
  • Deep expertise in architecting and training frontier Vision-Language Models (VLMs) or Large Video Models (LVMs)
  • proven ability to design novel algorithms that advance the state of the art in spatiotemporal understanding
  • Experience translating research into production systems at scale

Other signals

  • driving technical innovation
  • architecting novel solutions
  • delivering breakthrough results
  • building a next-generation visual reasoning engine
  • advancing the state of the art
  • solving real-world business problems
  • deploying at unprecedented scale