Senior Virtual Platform Software Engineer, Annapurna Labs Machine Learning Accelerators, AWS

Amazon · Big Tech · Cupertino, CA · Software Development

This role focuses on building and owning virtual platforms (full-system C++ and SystemC models) for custom machine learning accelerator chips (Trainium and Inferentia). The virtual platform enables software teams to start development before silicon is available. The role involves developing models, improving platform infrastructure (QEMU integration, simulation performance), and supporting software development teams.

What you'd actually do

  1. Build and own functional models of SoC subsystems that integrate into our full-system virtual platform, used by firmware, driver, runtime, and application software teams
  2. Design models for usability and performance — your customers are software engineers who need to run real workloads on your platform efficiently
  3. Develop and improve the virtual platform infrastructure: QEMU integration, simulation performance, build and release tooling, and customer-facing documentation
  4. Work with software teams (your primary customers) to understand their workflows, debug issues on the platform, and shape the model to maximize their productivity
  5. Drive simulation performance improvements so the platform can handle increasingly complex workloads at scale

Skills

Required

  • 5+ years of non-internship professional software development experience
  • 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
  • Experience as a mentor or tech lead, or leading an engineering team
  • 7+ years of non-internship professional experience writing functional or performance models
  • Experience programming with C++ and/or SystemC
  • Knowledge of SoC, CPU, GPU, and/or ASIC architecture and micro-architecture

Nice to have

  • Bachelor's degree in computer science or equivalent
  • Experience analyzing data and applying best practices to assess performance drivers
  • Experience developing models that integrate with QEMU
  • Experience developing and calibrating performance models for custom silicon chips
  • Experience with PyTest and GoogleTest
  • Familiarity with modern C++ (C++11, C++14, etc.)
  • Experience in multi-threaded programming
  • Experience with machine learning accelerator hardware and/or software

What the JD emphasized

  • own a product that software teams across AWS depend on
  • engineering challenges are genuinely interesting
  • direct impact of your work
  • startup pace, big impact
  • own significant pieces of the stack