Post-silicon Systems Validation Engineer, Annapurna Labs

Amazon · Big Tech · CA, ON +1 · Software Development

This role focuses on validating next-generation machine learning accelerators for AWS cloud infrastructure. The engineer will be responsible for the complete vertical stack, from silicon to system-to-system interfaces, ensuring the quality and performance of AI/ML accelerators. This involves developing validation strategies, executing test plans, debugging hardware, and collaborating with various engineering teams throughout the product development lifecycle.

What you'd actually do

  1. Developing comprehensive validation strategies and detailed test plans covering functional, performance, power, and stress testing from silicon bring-up to product release
  2. Executing complex test plans from RTL simulation and emulation environments through physical silicon validation
  3. Conducting hands-on silicon bring-up and debug in the lab using oscilloscopes, logic analyzers, and protocol analyzers
  4. Validating ML accelerator performance, accuracy, and reliability using real-world neural network workloads
  5. Building test infrastructure, CI/CD, and automated regression frameworks to enable efficient validation at scale

Skills

Required

  • Strong programming skills (Python, Lua, C/C++, Rust, Go, etc.)
  • A solid understanding of computer architecture
  • Experience with AWS services, cloud infrastructure, and firmware development (BIOS, BMC, drivers)
  • Validation experience in any of these areas: PCIe, HBM, GPUs, neural networks, ML HW architecture, and/or CI/CD
  • Familiarity with the validation lifecycle from RTL simulation (SystemVerilog/UVM, VCS, Questa, Xcelium) and emulation (Palladium, Zebu, Veloce) through silicon failure analysis and debug
  • Experience with general troubleshooting/debugging of hardware, or experience in computer architecture
  • 3+ years of non-internship experience in system test development, code reviews, source control management, build processes, automated deployments, and operations
  • Experience with Linux environments and Git

Nice to have

  • Experience with Machine Learning Hardware/Software Architecture
  • Experience with CI/CD
  • Experience with EDA Simulations or Emulation

What the JD emphasized

  • next-generation machine learning accelerators
  • AI training and inference
  • ML accelerator performance, accuracy, and reliability
  • ML workloads
  • AI/ML accelerators

Other signals

  • Validating the complete vertical stack: silicon, PCB, high-speed components (HBM, PCIe, chip-to-chip), inter-system connections, and system-to-system interfaces
  • Owning critical validation aspects across the entire product development lifecycle, from early design validation through emulation, silicon bring-up, post-silicon validation, and ongoing support of production systems