Software Engineer ML Acceleration, Annapurna Labs ML Acceleration System Software

Amazon · Big Tech · Austin, TX · Software Development

This role is for a Software Engineer on the Machine Learning Server Software Team at Amazon's Annapurna Labs. The team focuses on hardware/software co-design for ML servers, developing software for server components, integration into EC2, and supporting manufacturing and qualification processes. The role involves working with C/C++, Python, and Lua to create maintainable, documented, tested, and reusable software for the physical systems that execute ML algorithms, rather than the algorithms themselves. It emphasizes system programming, device drivers, and high-speed interfaces like PCIe.

What you'd actually do

  1. Work as a member of a team responsible for the software associated with server components and their integration into EC2.
  2. Work with the MLA Hardware, Test, and Manufacturing teams to create a coordinated software package that enables both qualification and rapid deployment of software.
  3. Develop software (C/C++, Python, Lua) that can be maintained, improved upon, documented, tested, and reused.

Skills

Required

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience programming in at least one programming language

Nice to have

  • 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Interest in cloud-scale computer hardware
  • Knowledge of system programming concepts, including device drivers, device trees, and Linux system programming
  • Interest in high-speed computer interfaces, including PCIe and memory subsystems
  • Experience writing software for server or computer card manufacturing