Senior SoC Systems Software Engineer, Annapurna Labs Machine Learning Accelerators, AWS

Amazon · Big Tech · Cupertino, CA · Software Development

This role focuses on developing the low-level software stack (drivers, runtime libraries, communication software) for custom ML accelerators (SoCs) designed by Amazon's Annapurna Labs. It involves working at the hardware-software boundary, pre-silicon and post-silicon, on SoC models, validation, and performance optimization. While the chips are for ML training and inference, the role itself is systems software engineering, not directly building ML models or agents.

What you'd actually do

  1. Develop and own components of our SoC models, at both single-chip and datacenter scale
  2. Debug complex hardware/software interactions across the full software stack — from register-level bring-up on functional models and emulators, to performance analysis on live silicon
  3. Collaborate with chip architects, RTL designers, modelers, compiler engineers, and ML framework teams to co-design and validate the hardware/software interface
  4. Contribute to the design of hardware features by providing a software perspective early in the chip development cycle
  5. Build tooling, test infrastructure, and automation that accelerates development for yourself and your teammates

Skills

Required

  • 6+ years of experience across the full software development life cycle
  • Experience as a mentor or tech lead, or leading an engineering team
  • 7+ years of professional experience developing firmware, drivers, runtime software, or low-level systems software for custom hardware (SoCs, ASICs, GPUs, CPUs, FPGAs)
  • Experience programming in C++, Python, and/or Rust
  • Knowledge of SoC, CPU, GPU, and/or ASIC architecture and micro-architecture

Nice to have

  • Experience with collective communication libraries or distributed systems primitives (MPI, NCCL, RCCL, or similar)
  • Experience debugging using functional models, QEMU, FPGA, or emulators
  • Experience with Linux kernel development, device drivers, or bare-metal firmware
  • Experience building functional or performance models of SoCs
  • Experience co-designing hardware/software interfaces with architecture or RTL teams
  • Familiarity with PCIe, DMA engines, on-chip interconnects, or network-on-chip architectures
  • Experience with performance profiling and optimization of latency-sensitive software
  • Experience with multi-threaded, multi-process, or asynchronous programming models

What the JD emphasized

  • No machine learning background is needed for this role
  • Any ML knowledge required can be learned on the job
  • What matters is your ability to write great low-level software and reason about hardware