Amazon Core Search Organization - Language Engineer, Core Search

Amazon · Big Tech · Seattle, WA · Software Development

The Language Engineer role on Amazon's MIDAS team focuses on creating and managing data annotation guidelines and workflows that support AI model training and evaluation for search quality. The work spans data wrangling and analysis, defining UI templates, and reporting on data-quality metrics, and it calls for both linguistic and scripting expertise to tackle NLP and language-understanding challenges. The role collaborates with scientists, engineers, and product managers to improve many aspects of Amazon Search, including semantic matching, ranking, computer vision, image processing, and augmented reality.

What you'd actually do

  1. Design and develop data annotation guidelines and workflows.
  2. Manage and process large amounts of structured and unstructured data.
  3. Design and adopt quality-control metrics and methodology to evaluate the quality of data annotation.
  4. Maximize productivity, process efficiency, and quality through streamlined workflows, process standardization, documentation, and periodic audits and investigations.
  5. Handle annotation & data investigation requests from multiple stakeholders with high efficiency and quality in a fast-paced environment.
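A common starting point for the annotation quality-control metrics mentioned above is inter-annotator agreement. The sketch below is purely illustrative (it is not Amazon's methodology, and the function and label names are hypothetical): it computes Cohen's kappa, a chance-corrected agreement score between two labelers.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' labels on the same items.

    Returns 1.0 for perfect agreement, 0.0 for chance-level agreement,
    and negative values for worse-than-chance agreement.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b.get(k, 0) for k in counts_a) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical relevance judgments on four search results:
a = ["relevant", "relevant", "irrelevant", "relevant"]
b = ["relevant", "irrelevant", "irrelevant", "relevant"]
print(cohens_kappa(a, b))  # → 0.5
```

In practice, a low kappa usually signals ambiguous guidelines rather than careless labelers, so agreement scores like this feed directly back into guideline revisions.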

Skills

Required

  • computational linguistics
  • language data processing
  • semantics
  • philosophy of language
  • Python scripting
  • Regex
  • SQL
  • MS Excel
  • Git
  • Unix terminal
  • common command line tools
  • annotation tools and workflows
  • communication skills
  • organizational skills
  • attention to detail

Nice to have

  • French
  • German
  • Dutch
  • Italian
  • Spanish
  • Japanese
  • data science
  • quantitative research
  • language annotation
  • data markup
  • machine learning
  • deep learning
  • NLP
  • search
  • AWS services
  • SageMaker
  • ML language services
  • user experience concepts
  • online retail
  • e-commerce

What the JD emphasized

  • end-to-end human annotation requirements
  • labeler-friendly annotation guidelines
  • data wrangling and analysis
  • labeled-data quality metrics
  • linguistic expertise
  • scripting expertise
  • natural language processing
  • language understanding challenges
  • data annotation guidelines and workflows
  • quality control metrics and methodology
  • quality of data annotation
  • stakeholder requirements
  • customer experience
  • semantic matching
  • ranking
  • computer vision
  • image processing
  • augmented reality