Sr. Applied Scientist, Alexa International

Amazon · Big Tech · London, United Kingdom · Machine Learning Science

Senior Applied Scientist role focused on developing and advancing multilingual speech models (understanding and generation), text-to-speech synthesis, and speech-to-speech models for Alexa International. The role involves driving scientific strategy, leveraging large-scale computing resources, and optimizing model performance for production deployment in low-resource language settings.

What you'd actually do

  1. Develop novel algorithms and modeling techniques to advance the state of the art in multilingual speech generation, text-to-speech synthesis, and speech-to-speech models
  2. Leverage Amazon's heterogeneous data sources and large-scale computing resources to accelerate advances in speech synthesis, voice quality, and pronunciation accuracy for non-English locales
  3. Collaborate with software engineers to optimize model performance for production deployment
  4. Partner with product managers to align research priorities with customer needs
  5. Drive cross-team scientific strategy for speech quality across international locales

Skills

Required

  • Master's degree or PhD
  • Experience programming in Java, C++, Python, or a related language
  • Experience with deep learning methods and machine learning
  • Experience in building machine learning models for business applications
  • Experience in applied research
  • Solid understanding of machine learning
  • Solid understanding of speech synthesis (TTS/S2S)
  • Solid understanding of multilingual phonetics
  • Solid understanding of modern model architectures
  • Solid understanding of evaluation methodology

Nice to have

  • Experience with modeling tools such as R, scikit-learn, Spark MLlib, MXNet, TensorFlow, NumPy, SciPy, etc.
  • Experience with large-scale distributed systems such as Hadoop and Spark

What the JD emphasized

  • speech models
  • multilingual systems
  • speech-to-speech models
  • multilingual speech generation
  • text-to-speech synthesis
  • low-resource language settings

Other signals

  • multilingual systems
  • speech-to-speech models
  • text-to-speech synthesis
  • low-resource language settings
  • speech quality