Machine Learning Compiler Engineer

Amazon · Big Tech · Cupertino, CA · Software Development

The Machine Learning Compiler Engineer will work on the Amazon Neuron team to develop and scale a deep learning compiler stack for Amazon's custom ML accelerators (Inferentia and Trainium). This role involves optimizing neural network models for inference and training performance, integrating with ML frameworks, and contributing to the software stack that enables large-scale ML workloads. The engineer will be involved in pre-silicon design and bringing new features to market.

What you'd actually do

  1. Support the ground-up development and scaling of a compiler that handles the world's largest ML workloads.
  2. Architect and implement business-critical features.
  3. Publish cutting-edge research.
  4. Contribute to a brilliant team of experienced engineers.
  5. Apply your technical communication skills as a hands-on partner to AWS ML services teams.
  6. Take part in pre-silicon design, bringing new products and features to market, and many other exciting projects.

Skills

Required

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience designing or architecting new and existing systems (design patterns, reliability, scaling)
  • Experience programming in at least one programming language

Nice to have

  • 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Background in machine learning and AI accelerators

What the JD emphasized

  • world's largest ML workloads
  • scaling
  • pre-silicon design
  • new products/features to market

Other signals

  • ML compiler
  • ML accelerators
  • inference performance
  • training performance
  • Amazon Neuron SDK
  • PyTorch
  • TensorFlow
  • MXNet
  • optimize performance
  • deep learning compiler stack
  • neural network descriptions
  • toolchain
  • quantum leap in performance