What you'd actually do

build agentic AI solutions and multi-modal deep learning models that understand how products and packages flow through Amazon’s fulfillment network

build models that solve challenging problems like understanding warehouse operations systems, or visual defect detection on Amazon's entire retail catalog (billions of different items, thousands of new items every day)

work with a diverse set of very large multi-modal real-world datasets, including imagery, natural language and structured data

face a high level of research ambiguity and problems that require creative, ambitious, and inventive solutions

adapt state-of-the-art agentic AI, deep learning, language understanding and computer vision techniques to develop solutions for business problems in the Amazon Fulfillment Network

Skills

Required

Python
C++
PyTorch
Pandas
NumPy
scikit-learn
Hugging Face Transformers
transformers
diffusion models
neural architecture search
self-supervised learning
distributed training
mixed precision
gradient accumulation
DeepSpeed
FSDP
Megatron-LM
quantization
pruning
distillation
large language models (GPT, LLaMA, Claude)
vision-language models (CLIP, LLaVA, Qwen)
agentic AI systems
LangChain
Strands
multi-agent workflows
tool-augmented reasoning systems
RAG systems
chain-of-thought
few-shot prompting
RLHF
DPO
computer vision
object detection
segmentation
3D vision
depth estimation
point cloud processing
natural language processing
text generation
information extraction
multimodal learning
model serving infrastructure
A/B testing frameworks
feature stores
MLOps
annotation pipeline design
active learning pipelines
AutoML
hyperparameter optimization

Nice to have

diffusion models for image/video synthesis
autoregressive models for multimodal generation
compositional generation systems
controllable generation
style transfer
neural rendering techniques
model interpretability and explainability methods
attention visualization
feature attribution
interpretable AI systems
few-shot learning
meta-learning
continual learning
domain adaptation
generalization across distribution shifts
long-tail scenarios
adaptation to new tasks with minimal data

Are you excited about developing agentic AI, LLM and computer vision models that revolutionize Amazon's Fulfillment Network? Are you looking for opportunities to apply state-of-the-art AI to real-world problems at truly vast scale? At Amazon Fulfillment Technologies and Robotics, we are on a mission to build high-performance autonomous systems that perceive and act to further improve our world-class customer experience, at Amazon scale. To this end, we are looking for an Applied Scientist who will build and deploy models that make smarter decisions on a wide array of multi-modal signals. Together, we will be pushing beyond the state of the art in optimizing one of the most complex systems in the world: Amazon's Fulfillment Network.

Key job responsibilities

In this role, you will build agentic AI solutions and multi-modal deep learning models that understand how products and packages flow through Amazon’s fulfillment network. You will build models that solve challenging problems like understanding warehouse operations systems, or visual defect detection on Amazon's entire retail catalog (billions of different items, thousands of new items every day). You will work with a diverse set of very large multi-modal real-world datasets, including imagery, natural language and structured data. You will face a high level of research ambiguity and problems that require creative, ambitious, and inventive solutions.

A day in the life

AFT AI delivers the AI solutions that empower Amazon’s fulfillment network to make smarter decisions. You will work on an interdisciplinary project involving scientists and engineers with deep expertise in developing state-of-the-art AI solutions at scale. You will work with images, videos, natural language, and sequences of events from existing or new hardware. You will adapt state-of-the-art agentic AI, deep learning, language understanding and computer vision techniques to develop solutions for business problems in the Amazon Fulfillment Network.

About the team

Amazon Fulfillment Technologies (AFT) powers Amazon’s global fulfillment network. We invent and deliver software, hardware, and science solutions that orchestrate processes, robots, machines, and people. We harmonize the physical and virtual world so Amazon customers can get what they want, when they want it.

AFT AI is spread across NA (Bellevue, WA) and Europe (Berlin, Germany). We are hiring candidates to work out of the Berlin location.

Publicly available articles showcasing some of our work:

Visual Defect Detection: https://www.amazon.science/blog/novel-kaputt-dataset-sets-new-benchmark-for-large-scale-visual-defect-detection
Eluna: https://www.aboutamazon.com/news/operations/new-robots-amazon-fulfillment-agentic-ai

Basic Qualifications

5+ years of relevant, broad research experience after a PhD degree or equivalent qualification
Track record of first-author publications at top-tier peer-reviewed conferences (NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL, EMNLP) or patents in machine learning domains
Expert-level programming proficiency in Python with production-quality code standards, plus working knowledge of C++ for performance-critical applications; deep technical expertise with PyTorch and proficiency with the modern ML stack (Pandas, NumPy, scikit-learn, Hugging Face Transformers)
Proven ability to independently scope, design, and execute end-to-end ML projects from research through production deployment, including ownership of model monitoring, maintenance, and iterative improvement
Proven expertise in modern deep learning architecture design including transformers, diffusion models, and neural architecture search, with hands-on experience in designing and training self-supervised learning paradigms, training optimization techniques (distributed training across multi-node GPU clusters, mixed precision, gradient accumulation, parallelism strategies using DeepSpeed, FSDP, or Megatron-LM), and model compression methods (quantization, pruning, distillation)
Proven experience pre-training and fine-tuning large language models (GPT, LLaMA, Claude) and vision-language models (CLIP, LLaVA, Qwen)
Proven experience developing agentic AI systems deployed to production, using state-of-the-art frameworks (LangChain, Strands, etc.), with proven ability to design multi-agent workflows, tool-augmented reasoning systems, and RAG systems, and to apply advanced prompt engineering (chain-of-thought, few-shot) and preference-alignment techniques (RLHF, DPO)
Extensive knowledge and proven production experience across multiple ML domains including computer vision (object detection, segmentation, 3D vision, depth estimation, point cloud processing), natural language processing (text generation, information extraction), and multimodal learning
Strong understanding of ML systems design including model serving infrastructure, A/B testing frameworks, feature stores, and MLOps best practices, such as annotation pipeline design, active learning pipelines, and AutoML/hyperparameter optimization techniques

Preferred Qualifications

Hands-on experience with cutting-edge generative AI techniques including diffusion models for image/video synthesis, autoregressive models for multimodal generation, and compositional generation systems; expertise in controllable generation, style transfer, and neural rendering techniques
Deep expertise in model interpretability and explainability methods (attention visualization, feature attribution), with proven experience deploying interpretable AI systems in regulated or high-stakes production environments
Experience with specialized ML domains such as few-shot learning, meta-learning, continual learning, or domain adaptation; proven ability to build models that generalize across distribution shifts, handle long-tail scenarios, or adapt to new tasks with minimal data
Proven experience designing large language models (GPT, LLaMA, Claude) and vision-language models (CLIP, LLaVA, Qwen)
Experience leading cross-functional ML initiatives involving multiple teams or organizations, with demonstrated impact on company-wide metrics or strategic product launches; proven track record of mentoring junior scientists and engineers in advanced ML techniques
Published research contributions beyond first-authorship, including senior or corresponding author publications, invited talks at major conferences, or recognized leadership in ML research communities (program committee service, workshop organization, tutorial presentations)

Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice (https://www.amazon.jobs/en/privacy_page) to know more about how we collect, use and transfer the personal data of our candidates.

m/w/d

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Applied Scientist III, AFT AI, Amazon AFT AI
