Data Center GPU Performance Attainment Lead

AMD · Semiconductors · Austin, TX · Engineering

This role focuses on post-silicon performance characterization and optimization of AMD Data Center GPUs for HPC and ML workloads. The candidate will develop automation frameworks, contribute to the product power and performance definition process, develop and execute test plans, analyze scaling efficiency, identify and mitigate bottlenecks, and collaborate with cross-functional teams to drive performance closure. Experience with Python, C/C++, scripting, computer architecture, and HPC/ML workloads is preferred.

What you'd actually do

  1. Develop and maintain automation frameworks for workload execution and performance data collection, enabling scalable and repeatable characterization across configurations.
  2. Become a key stakeholder in the product power and performance definition process, ensuring alignment between architectural goals and measured silicon performance.
  3. Develop, execute, and evolve performance characterization and optimization test plans across diverse usage scenarios, including High Performance Computing (HPC) and Machine Learning (ML) workloads.
  4. Drive performance attainment for both scale-up (intra-node) and scale-out (multi-node) configurations.
  5. Analyze interactions between power management features and performance behavior, optimizing configurations to achieve the best performance and performance-per-watt tradeoffs.

Skills

Required

  • computer architecture
  • HPC workloads
  • ML workloads
  • Python
  • C/C++
  • Shell scripting
  • performance tooling
  • automation workflows
  • scale-up and scale-out performance analysis
  • rack-level and cluster-level deployments

Nice to have

  • TensorFlow
  • PyTorch
  • mentoring junior engineers
  • coordinating cross-functional teams

What the JD emphasized

  • ML workloads
  • performance characterization and optimization

Other signals

  • automation frameworks
  • system-level bottlenecks
  • power management features