Senior Software Development Engineer, EC2 Nitro

Amazon · Big Tech · Seattle, WA · Software Development

Senior Software Development Engineer role focused on building and scaling EC2 compute platforms for machine learning workloads, including training and inference for LLMs and multimodal systems. The role involves designing innovative technologies, leading technical projects, developing regression testing systems, and collaborating with hardware teams to optimize platform designs for ML performance.

What you'd actually do

  1. Design and develop innovative technologies that power the infrastructure supporting machine learning workloads on Ultraservers.
  2. Lead technical projects establishing EC2 as the pioneer in cloud computing for ML workloads across diverse applications including LLMs, multimodal systems, and emerging model architectures.
  3. Develop and maintain comprehensive regression testing systems that validate performance across major component releases including frameworks, firmware, drivers, and networking infrastructure.
  4. Collaborate with hardware engineering teams to influence future platform designs based on performance insights gathered from state-of-the-art research and customer workloads.
  5. Build customer relationships by investigating complex performance challenges, developing solutions, and publishing actionable best practices through multiple channels.

Skills

Required

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one programming language
  • 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience as a mentor, tech lead, or leader of an engineering team
  • C, C++ or Rust development in a Linux environment
  • Linux package management
  • Version control systems
  • Automated build processes
  • Software unit testing

Nice to have

  • In-depth knowledge of ML frameworks
  • Cluster management
  • Experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Experience in embedded development in C/C++

What the JD emphasized

  • ML frameworks
  • cluster management
  • high-performance training and inference

Other signals

  • ML workloads
  • Ultraservers
  • cloud computing for ML
  • LLMs
  • multimodal systems
  • emerging model architectures
  • performance insights
  • high-performance training and inference