Applied Scientist II, Alexa International Team

Amazon · Big Tech · Bellevue, WA · Applied Science

Applied Scientist II role focused on developing and evaluating LLMs and multimodal systems for Alexa's international products. Responsibilities include analyzing customer behavior, building evaluation metrics, fine-tuning and post-training LLMs (SFT, DPO, RLHF, RLAIF), setting up experimentation frameworks, and contributing to delivery from research to production. Requires strong knowledge of ML, NLU, LLM architectures, and LLM evaluation, with attention to international customer nuances and diverse data sources.

What you'd actually do

  1. Build novel online & offline evaluation metrics and methodologies for multimodal personal digital assistants.
  2. Fine-tune/post-train LLMs using techniques like SFT, DPO, RLHF, and RLAIF.
  3. Set up experimentation frameworks for agile model analysis and A/B testing.
  4. Collaborate with partner teams on LLM evaluation frameworks and post-training methodologies.
  5. Contribute to end-to-end delivery of solutions from research to production, including reusable science components.

Skills

Required

  • Deep learning
  • Generative models
  • LLMs
  • Multimodal systems
  • Machine learning
  • Natural language understanding
  • LLM architectures
  • LLM evaluation
  • LLM tooling
  • Java
  • C++
  • Python
  • Patents or publications at top-tier peer-reviewed conferences or journals

Nice to have

  • Professional software development

What the JD emphasized

  • novel algorithms and modeling techniques
  • international customers
  • international products and services
  • international customer nuances
  • LLM evaluation & tooling
  • pushing boundaries
  • swiftly delivering impactful solutions
  • novel online & offline evaluation metrics and methodologies
  • fine-tune/post-train LLMs
  • LLM evaluation frameworks
  • post-training methodologies
  • research to production
  • publications

Other signals

  • LLMs
  • multimodal systems
  • deep learning
  • generative models
  • international products
  • customer behavior
  • evaluation metrics
  • fine-tune/post-train LLMs
  • SFT
  • DPO
  • RLHF
  • RLAIF
  • experimentation frameworks
  • A/B testing
  • LLM evaluation frameworks
  • post-training methodologies
  • research to production
  • scientific community
  • publications