Applied Scientist, Silicon and Systems Group Edge AI

Amazon · Big Tech · Cambridge, MA, United Kingdom · Machine Learning Science

Applied Scientist role focused on developing novel evaluation methods for multimodal language models and agents for consumer devices. This involves creating and validating automated evaluation techniques, analyzing datasets to understand model gaps, and collaborating with training teams. The role emphasizes hardware-software integration for efficient model training and deployment on edge devices.

What you'd actually do

  1. Collaborate with cross-functional engineers and scientists to advance the state of the art in multimodal model evaluations for devices, including audio, images, and videos
  2. Invent novel automated evaluation methods for perception tasks, such as fine-tuned LLM-as-judge, and validate their reliability
  3. Develop and extend our evaluation framework(s) to support expanding capabilities for multimodal language models
  4. Analyze large offline and online datasets to understand model gaps, develop methods to interpret model failures, and collaborate with training teams to enhance model capabilities for product use cases
  5. Work closely with other scientists, compiler engineers, and data collection and product teams to advance evaluation methods

Skills

Required

  • PhD, or a Master's degree and experience in CS, CE, ML or related field
  • Experience in patents or publications at top-tier peer-reviewed conferences or journals
  • Experience programming in Java, C++, Python or related language
  • Experience in any of the following areas: algorithms and data structures, parsing, numerical optimization, data mining, parallel and distributed computing, high-performance computing
  • Experience in building machine learning models for business applications

Nice to have

  • Experience using Unix/Linux
  • Experience in professional software development

What the JD emphasized

  • develop new evaluation methods
  • novel automated evaluation methods
  • evaluation framework(s)

Other signals

  • develop new evaluation methods for multimodal language models and agents
  • invent novel automated evaluation methods and validate their reliability
  • develop and extend our evaluation framework(s)