Senior Software Development Engineer - AI/ML, AWS Neuron

Amazon · Big Tech · Cupertino, CA · Software Development

This is a Senior Software Development Engineer role on AWS Neuron, focused on accelerating deep learning and GenAI workloads on AWS's custom ML accelerators (Inferentia and Trainium). The role involves optimizing inference performance for LLMs, working across the stack from ML frameworks down to the hardware-software boundary, and collaborating with compiler, runtime, and hardware teams. Key responsibilities include designing, developing, and optimizing ML models and frameworks; building infrastructure for model onboarding; implementing low-level optimizations; and working with customers on model enablement.

What you'd actually do

  1. Design, develop, and optimize machine learning models and frameworks for deployment on custom ML hardware accelerators.
  2. Participate in all stages of the ML system development lifecycle, including distributed-computing architecture design, implementation, performance profiling, hardware-specific optimizations, testing, and production deployment.
  3. Build infrastructure to systematically analyze and onboard multiple models with diverse architectures.
  4. Design and implement high-performance kernels and features for ML operations, leveraging the Neuron architecture and programming models.
  5. Analyze and optimize system-level performance across multiple generations of Neuron hardware.
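The profiling work in items 2 and 5 ultimately comes down to measuring latency and throughput. A minimal, framework-agnostic sketch of that kind of measurement (generic Python, not Neuron-specific; `fn` and `batch` are hypothetical stand-ins for a compiled model's forward pass and an input batch):

```python
import time

def profile_latency_throughput(fn, batch, iters=100, warmup=10):
    """Measure average per-call latency (ms) and throughput (items/s).

    `fn` stands in for a model's forward pass; `batch` for its input.
    Warmup iterations are discarded so one-time costs (lazy init,
    caches) don't skew the steady-state numbers.
    """
    for _ in range(warmup):
        fn(batch)
    start = time.perf_counter()
    for _ in range(iters):
        fn(batch)
    elapsed = time.perf_counter() - start
    latency_ms = elapsed / iters * 1000.0        # average time per call
    throughput = len(batch) * iters / elapsed    # items processed per second
    return latency_ms, throughput
```

For example, `profile_latency_throughput(lambda b: [x * 2 for x in b], list(range(32)))` returns the average latency and items-per-second of that toy workload; on real hardware the same skeleton wraps a compiled model instead.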

Skills

Required

  • Python
  • System-level programming
  • ML knowledge
  • Low-level optimization
  • System architecture
  • ML model acceleration
  • Performance profiling
  • Hardware-specific optimizations
  • Distributed computing
  • Inference performance optimization
  • Latency and throughput optimization

Nice to have

  • PyTorch
  • JAX
  • Compilers
  • Runtimes
  • Kernels
  • GenAI
  • LLM model families
  • Llama family
  • DeepSeek

What the JD emphasized

  • "critical to this role"
  • "must have"
  • "required"

Other signals

  • AWS Neuron SDK
  • ML accelerators
  • inference performance
  • LLM model families
  • distributed inference solutions