AI Language Engineer I, Artificial General Intelligence - Data Services

Amazon · Big Tech · London, United Kingdom · Data Science

The AI Language Engineer will develop diverse datasets for training and evaluating Amazon's AI models, using methods like synthetic data generation, model-supported generation, and human-in-the-loop collections. This role involves designing data collections, analyzing data, building tools, and collaborating with cross-functional teams.

What you'd actually do

  1. Design complex data collections with human participants in response to science needs: author instructions, define and implement quality targets and mechanisms, provide day-to-day coordination of data collection efforts (including planning, scheduling, and reporting), and be responsible for the final deliverables
  2. Design and conduct complex data creation tasks using synthetic and model-based data generation methods, following state-of-the-art approaches
  3. Analyze and extract insights from large amounts of data
  4. Build tools or tool prototypes for data analysis or data creation, using Python or another scripting language
  5. Use modeling tools to bootstrap or test new AI functionalities

Skills

Required

  • Master's degree or above in Computational Linguistics
  • Master's degree or above in Linguistics or a related field
  • Experience in computational linguistics, language data processing, semantics, and philosophy of language
  • Experience in Python, Perl, or another scripting language
  • Experience in speech and language data analysis

Nice to have

  • Experience owning and executing language data collection projects, including developing guidelines, label sets, and annotation workflows

What the JD emphasized

  • complex data collections
  • synthetic and model-based data generation methods
  • state-of-the-art approaches

Other signals

  • develops diverse datasets to train and evaluate AI models
  • synthetic data generation
  • model-supported data generation
  • human-in-the-loop data collections