AI/HPC Cluster Thermal Design Engineer

AMD · Semiconductors · Austin, TX · General Management/Administration/Support

This role focuses on thermal design and cooling solutions for AI/HPC clusters and data center deployments. The engineer evaluates, models, and validates cooling architectures, working closely with system architects and data center operations teams. The position requires a grounding in the fundamentals of heat transfer, thermodynamics, and fluid dynamics, along with an interest in AI/HPC infrastructure.

What you'd actually do

  1. Support the thermal design of AI/HPC cluster solutions, including compute racks, cooling loops, and facility interfaces.
  2. Assist in evaluating cooling architectures (air cooling, direct liquid cooling, hybrid approaches) and identifying trade-offs in performance, cost, complexity, and reliability.
  3. Build and refine thermal and airflow models for system/cluster/data center concepts using industry tools (e.g., OpenFOAM, ANSYS, FloTHERM, or similar).
  4. Contribute to flow-network modeling for liquid cooling and coolant distribution analyses to ensure adequate flow, pressure, and temperature margins.
  5. Help define and execute test plans to validate thermal performance at component, system, and rack/cluster levels.

Skills

Required

  • fundamentals of heat transfer, thermodynamics, and fluid dynamics
  • thermal design of AI/HPC cluster solutions
  • evaluating cooling architectures
  • thermal and airflow modeling
  • flow-network modeling for liquid cooling
  • defining and executing test plans for thermal performance
  • technical documentation

Nice to have

  • electronics cooling concepts
  • data center or cluster thermal concepts
  • thermal/CFD simulation tools
  • measurement and validation practices
  • cross-functional engineering environments
  • PUE/WUE drivers
  • economizers/free cooling
  • waste-heat reuse concepts
  • coursework in heat transfer, thermodynamics, two-phase flow and heat transfer, or refrigeration
  • projects, internships, or research related to HPC/AI infrastructure, data centers, or high-power electronics cooling