Applied Science Manager, Sponsored Products and Brands

Amazon · Big Tech · NY +1 · Applied Science

Manager for a Continuous Model Evaluation and Learning workstream within Amazon Ads' Sponsored Products and Brands team. The role involves leading a team of applied scientists and engineers to build and ship an evaluation and remediation framework for an agentic brand-intelligence system. This includes designing evaluation metrics, developing optimization engines for prompts and synthetic data, and ensuring offline-to-online consistency for quality improvements. The goal is to enable autonomous detect-diagnose-remediate loops to scale quality across brand skills.

What you'd actually do

  1. Lead, mentor, and grow a team of applied scientists and machine learning engineers, fostering a culture of scientific excellence, customer obsession, and ownership.
  2. Own the scientific vision and multi-quarter roadmap for continuous model evaluation and learning across the brand-intelligence system.
  3. Design and deliver evaluation frameworks for agentic brand-intelligence skills, including LLM-as-Judge rubrics, multi-model ensemble judging, gold-set construction, and calibration against human evaluators.
  4. Lead development of the optimization engine that programmatically refines prompts, generates synthetic training pairs, and composes agent decomposition strategies (orchestrator-worker patterns) when single-agent skills hit complexity limits.
  5. Establish rigorous offline-to-online consistency, A/B testing discipline, and drift monitoring so that quality improvements generalize to production traffic.

Skills

Required

  • 4+ years of applied research experience
  • 3+ years of experience managing scientists or machine learning engineers
  • 3+ years of experience building machine learning models for business applications
  • PhD, or Master's degree and 6+ years of applied research experience
  • Knowledge of ML, NLP, Information Retrieval and Analytics
  • Experience programming in Java, C++, Python, or a related language

Nice to have

  • Experience working on recommender systems or personalization within search, e-commerce, shopping, advertising or other related fields
  • PhD in computer science, machine learning, engineering, or a related field, or Master's degree and 4+ years of experience in a quantitative field such as statistics, mathematics, data science, business analytics, economics, finance, engineering, or computer science
  • Publications at top-tier peer-reviewed conferences or journals
  • 5+ years of experience managing scientists or machine learning engineers
  • Hands-on experience designing, deploying, and evaluating LLM-based agentic systems, including tool use, multi-step planning and reasoning, retrieval-augmented generation (RAG), and multi-agent orchestration (orchestrator-worker, swarm, critic-actor patterns)
  • Experience with large-scale LLM fine-tuning (SFT, RLHF, DPO), prompt engineering, and programmatic prompt optimization frameworks

What the JD emphasized

  • own the quality backbone
  • deliver the evaluation and remediation framework
  • business-critical, greenfield initiative
  • ship the framework that every other brand-intelligence workstream depends on
  • Hands-on experience designing, deploying, and evaluating LLM-based agentic systems

Other signals

  • leading a team
  • shipping evaluation frameworks
  • driving continuous improvement
  • business-critical initiative