Sr. SDM, AI Inference Technology, Neuron SDK

Amazon · Big Tech · Seattle, WA · Software Development

Senior Manager for AI Inference Technology, leading a team that builds fundamental inference technology building blocks and libraries for the AWS Neuron SDK, optimizing models for Trainium and Inferentia devices. The role covers the full development life cycle of inference libraries, enabling customers to optimize LLMs, multimodal, and generative models.

What you'd actually do

  1. Lead a strong team of managers and engineers to build fundamental inference technology building blocks and libraries, enabling AI developers to optimize models for inference on Trainium and Inferentia devices
  2. Own the full development life cycle of inference library and feature development, including reliability and scalability
  3. Develop the Neuronx_Distributed inference libraries and contribute to other popular open-source inference libraries, enabling customers to optimize LLMs, multimodal, and generative models
  4. Work with executive leadership and other senior management and technical leaders to define product direction and deliver it to customers
  5. Build massive-scale distributed training and inference solutions, developing the full stack of software, servers, and chips together with teams across the Annapurna organization to run the largest machine learning workloads

Skills

Required

  • 10+ years of engineering experience
  • 5+ years of engineering team management experience
  • 10+ years of planning, designing, developing and delivering consumer software experience
  • Experience partnering with product or program management teams
  • Experience managing multiple concurrent programs, projects and development teams in an Agile environment

Nice to have

  • Experience designing and developing large-scale, high-traffic applications

What the JD emphasized

  • inference acceleration
  • AWS Neuron
  • Trainium
  • Inferentia
  • inference library
  • distributed inference libraries
  • optimize LLMs, multimodal, and generative models
