Sr. Machine Learning Compiler Engineer, AWS Neuron, Annapurna Labs

Amazon · Big Tech · Seattle, WA · Software Development

This role focuses on developing and scaling a machine learning compiler for AWS Neuron, which optimizes the performance of neural network models on custom AWS hardware accelerators (Inferentia and Trainium). The engineer will architect and implement features for the compiler stack, which integrates with popular ML frameworks, aiming to improve inference and training performance for large ML workloads.

What you'd actually do

  1. Architect and implement business-critical compiler features
  2. Publish cutting-edge research
  3. Mentor a brilliant team of experienced engineers
  4. Leverage your technical communication skills as a hands-on partner to AWS ML services teams
  5. Contribute to pre-silicon design, bring new products/features to market, and work on many other exciting projects

Skills

Required

  • 5+ years of non-internship professional software development experience
  • 5+ years of professional experience programming in at least one language
  • 5+ years of experience leading design or architecture (design patterns, reliability, scaling) of new and existing systems
  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Experience as a mentor, tech lead, or leader of an engineering team

Nice to have

  • Bachelor's degree in computer science or equivalent
  • Experience with machine learning
  • Experience with AI accelerators

What the JD emphasized

  • scaling of a compiler to handle the world's largest ML workloads
  • ground-up development
  • business-critical features
  • cutting-edge research
  • pre-silicon design
  • new products/features to market

Other signals

  • AWS Machine Learning accelerators
  • Inferentia chip
  • Trainium
  • AWS Neuron Software Development Kit (SDK)
  • ML compiler
  • runtime
  • PyTorch
  • TensorFlow
  • MXNet
  • optimize the performance of complex neural net models
  • deep learning compiler stack
  • toolchain
  • performance