Language Data Scientist, Alexa International

Amazon · Big Tech · Bellevue, WA · Data Science

This role focuses on analyzing and evaluating conversational interaction data to support the training and evaluation of LLMs and machine learning models for Alexa's speech interfaces. The Language Data Scientist will own data analyses and research requests, and will contribute to developing annotation workflows and evaluation conventions.

What you'd actually do

  1. Own data analyses for customer-facing features, including launch go/no-go metrics for new features and accuracy metrics for existing features
  2. Handle unique data analysis requests from a range of stakeholders, including quantitative and qualitative analyses to elevate customer experience with speech interfaces
  3. Lead and evaluate changes to dialog evaluation conventions, test new tooling, and pilot processes to support expansion into new data areas
  4. Continuously evaluate workflow tools and processes and offer solutions to ensure they are efficient, high quality, and scalable
  5. Provide expert support for a large and growing team of data analysts

Skills

Required

  • 2+ years of experience as a data scientist
  • 3+ years of experience with data querying languages (e.g. SQL), scripting languages (e.g. Python), or statistical/mathematical software (e.g. R, SAS, Matlab)
  • 3+ years of experience with machine learning and statistical modeling tools and techniques for data analysis, and with the parameters that affect their performance
  • 1+ years of experience guiding and coaching a group of researchers
  • 1+ years of experience working with or evaluating AI systems
  • Master's degree or above in Science, Technology, Engineering, or Mathematics (STEM), or experience working in Science, Technology, Engineering, or Mathematics (STEM)
  • Experience applying theoretical models in an applied environment

Nice to have

  • Ph.D. in Science, Technology, Engineering, or Mathematics (STEM)
  • Knowledge of machine learning concepts and their application to reasoning and problem-solving
  • Experience in Python, Perl, or another scripting language
  • Experience in an ML or data scientist role with a large technology company
  • Experience in defining and creating benchmarks for assessing GenAI model performance
  • Experience working on multi-team, cross-disciplinary projects
  • Experience applying quantitative analysis to solve business problems and making data-driven business decisions
  • Experience communicating complex concepts effectively, both in writing and verbally

What the JD emphasized

  • evaluating AI systems
  • defining and creating benchmarks for assessing GenAI model performance

Other signals

  • evaluating LLMs
  • data analysis for training
  • speech interfaces
  • customer experience metrics