Software Dev Engineer, Machine Learning Compilers

Amazon · Big Tech · Sunnyvale, CA · Software Development

Software Development Engineer focused on building the compiler infrastructure and software stack for custom neural accelerator silicon designed for edge AI capabilities. The role involves optimizing deep learning workloads, model quantization, and compression for efficient execution on hardware with limited memory, ultimately enabling large AI models to run on edge devices.

What you'd actually do

  1. Design and develop the software stack for a deep learning accelerator
  2. Develop compiler passes for graph ingestion, optimization, and partitioning
  3. Develop backend code generation capabilities across heterogeneous platforms
  4. Profile, analyze, and optimize system-level performance, developing new tooling where necessary
  5. Collaborate with hardware, software, applied science, and product teams to onboard more user experiences onto the deep learning accelerator

Skills

Required

  • 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • 3+ years of non-internship professional software development experience
  • 3+ years of programming using a modern programming language such as Java, C++, or C#, including object-oriented design experience
  • Experience in developing and deploying LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware

Nice to have

  • 3+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Experience in embedded development in C/C++
  • Experience building compilers for application-specific accelerators or custom instruction sets

What the JD emphasized

  • custom neural accelerator silicon
  • edge AI capabilities
  • compiler
  • LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware
  • custom instruction set

Other signals

  • custom neural accelerator silicon
  • edge AI capabilities
  • deep learning networks on edge processors
  • compiler infrastructure
  • lower deep learning workloads to heterogeneous device backends
  • model quantization and compression techniques
  • efficient execution on hardware
  • LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware