SoC Systems Software Engineer, Annapurna Labs Machine Learning Accelerators, AWS

Amazon · Big Tech · Cupertino, CA · Software Development

This role focuses on developing the low-level software stack (drivers, runtime libraries, communication software) for custom ML accelerators (SoCs). The engineer will work on SoC models, debug hardware/software interactions, and build tooling to support chip development and validation. While the chips are for ML, the role itself is systems software engineering, not ML model development.

What you'd actually do

  1. Develop and own components of our SoC models, at both single-chip and datacenter scale
  2. Debug complex hardware/software interactions across the full software stack — from register-level bring-up on functional models and emulators, to performance analysis on live silicon
  3. Collaborate with chip architects, RTL designers, modelers, compiler engineers, and ML framework teams to co-design and validate the hardware/software interface
  4. Contribute to the design of hardware features by providing a software perspective early in the chip development cycle
  5. Build tooling, test infrastructure, and automation that accelerates development for yourself and your teammates

Skills

Required

  • Knowledge of hardware architectures
  • 2+ years of professional experience developing firmware, drivers, runtime software, or low-level systems software for custom hardware (SoCs, ASICs, GPUs, CPUs, FPGAs)
  • Experience programming in C++
  • Experience programming in Python
  • Experience programming in Rust

Nice to have

  • Experience with collective communication libraries or distributed systems primitives (MPI, NCCL, RCCL, or similar)
  • Experience debugging using functional models, QEMU, FPGA, or emulators
  • Experience with Linux kernel development, device drivers, or bare-metal firmware
  • Experience building functional or performance models of SoCs
  • Familiarity with PCIe, DMA engines, on-chip interconnects, or network-on-chip architectures
  • Experience with performance profiling and optimization of latency-sensitive software
  • Experience with multi-threaded, multi-process, or asynchronous programming models

What the JD emphasized

  • No machine learning background is needed
  • Any ML knowledge required can be learned on the job
  • What matters is your ability to write great low-level software and reason about hardware