Sr. Software Engineer, EFA Network ML Software Team - Annapurna Labs

Amazon · Big Tech · Seattle, WA · Software Development

This role is for a Sr. Software Engineer on the EFA Network ML Software Team at Amazon. The team owns the user-space software for the Elastic Fabric Adapter (EFA) network card, enabling customers to network thousands of GPU and CPU instances for ML and HPC workloads. The engineer will write high-performance C code for open-source projects, invent new networking APIs, and provide expert support to AI customers. The role involves leading design and architecture, focusing on performance, low latency, and high bandwidth in clustered environments.

What you'd actually do

  1. You will help lead a team of obsessed networking developers operating at the highest levels in networking.
  2. You will write the highest-performing code in C for multiple open source projects supporting EFA, such as Libfabric and Open MPI.
  3. You will work with multiple teams in the stack to invent new APIs for the latest concepts in networking in the cloud.
  4. You will dive deep into how your customers run collectives and messaging at high bandwidth and low latency.
  5. You will provide expert-level support to some of the world's biggest names in AI.

Skills

Required

  • 5+ years of non-internship professional software development experience
  • 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Experience as a mentor, tech lead or leading an engineering team
  • 5+ years of professional experience programming in C

Nice to have

  • Bachelor's degree in computer science or equivalent

What the JD emphasized

  • highest-performing code in C
  • 5+ years of professional experience programming in C