Amazon Core Search Organization - Language Engineer, Core Search

Amazon · Big Tech · Seattle, WA · Software Development

The Language Engineer role within Amazon's MIDAS team focuses on creating and managing data annotation guidelines and workflows to ensure the quality of labeled data used for training and evaluating AI models in the Search organization. This involves applying linguistic and scripting expertise to NLP and language understanding challenges, managing large datasets, and collaborating with scientists and engineers.

What you'd actually do

  1. Design and develop data annotation guidelines and workflows.
  2. Manage and process large amounts of structured and unstructured data.
  3. Adopt and design quality control metrics and methodology to evaluate the quality of data annotation.
  4. Maximize productivity, process efficiency, and quality through streamlined workflows, process standardization, documentation, and periodic audits and investigations.
  5. Handle annotation & data investigation requests from multiple stakeholders with high efficiency and quality in a fast-paced environment.

Skills

Required

  • computational linguistics
  • language data processing
  • semantics
  • philosophy of language
  • Regex
  • SQL
  • MS Excel
  • Git
  • Unix terminal
  • common command line tools
  • annotation tools and workflows
  • excellent communication
  • strong organizational skills
  • keen eye for detail
  • comfortable in a fast-paced, collaborative, and dynamic work environment
  • able to support several projects at one time
  • able to accept reprioritization as necessary

Nice to have

  • data science
  • quantitative research
  • language annotation
  • data markup
  • machine learning
  • deep learning techniques
  • NLP
  • search
  • AWS services (S3, SageMaker, ML language services, etc.)
  • user experience concepts and methods
  • online retail (e-commerce)

What the JD emphasized

  • computational linguistics
  • language data processing
  • semantics
  • philosophy of language
  • annotation tools and workflows
  • language annotation
  • data markup
  • machine learning
  • deep learning techniques
  • NLP
  • search

Other signals

  • data annotation guidelines
  • labeler-friendly annotation guidelines
  • data wrangling and analysis
  • labeled-data quality metrics
  • linguistic expertise
  • scripting expertise
  • natural language processing
  • language understanding challenges
  • quality control metrics
  • data annotation
  • machine learning
  • deep learning techniques
  • NLP
  • search