Sr. Director, Data Center GPU Platform & System Validation

AMD · Semiconductors · Austin, TX · Engineering

This role is for a Sr. Director of Data Center GPU Platform & System Validation at AMD. The primary focus is leading teams that validate next-generation ML/AI/HPC products, overseeing system-level testing and BKC delivery, and driving test innovation. The role involves close collaboration with various engineering teams and with customers developing AI Cloud Services and on-prem AI Clusters. While the company and its products are heavily involved in AI, the role itself is in validation and testing of the underlying hardware and systems, not directly building AI models or agents.

What you'd actually do

  1. Lead post-silicon validation planning & execution of System Integration testing, BKC Delivery and other test functions
  2. Build and cultivate strong relationships with key engineering organizations including firmware, software, networking, cluster test, and other validation organizations
  3. Recruit top talent to support the team's growth plans and develop the existing engineering team
  4. Drive test innovation and strategic initiatives based on learning from prior programs and your own best practice knowledge
  5. Partner with peers to drive quality enhancements through methodology improvements and collaboration on manufacturing testing

Skills

Required

  • Bachelor's or Master's degree in electrical or computer engineering
  • Extensive experience working in Silicon/System/Cluster validation
  • Strong knowledge of silicon/system test methodologies
  • Prior roles leading other managers and teams of 100+ people
  • Demonstrated strength in developing and growing technical teams
  • Understanding of HW/FW/SW interaction and system-level design
  • Strong communication and collaboration skills
  • Demonstrated ability in “showing initiative” to get things done
  • Excellent communication skills including presenting to senior executives

Nice to have

  • Experience in the ML/AI space

What the JD emphasized

  • System Level testing
  • ML/AI/HPC products
  • System and Cluster experience
  • highly complex Clusters/Systems