AI Framework Engineer

AMD · Semiconductors · Shanghai, China · Engineering

AI Framework Engineer at AMD, responsible for optimizing deep learning frameworks for AMD GPUs, enhancing GPU kernels, and improving training/inference performance on multi-GPU and multi-node systems. The role involves optimizing distributed inference and RL solutions on frameworks such as vLLM and SGLang, collaborating with internal GPU library teams and open-source maintainers, and leveraging compiler technologies.

What you'd actually do

  1. Build and optimize end-to-end distributed inference (e.g., prefill/decode (P/D) disaggregation and large-scale expert parallelism (Large-EP)) and RL solutions on mainstream frameworks like vLLM and SGLang.
  2. Work closely with internal teams to analyze and improve training and inference performance on AMD GPUs.
  3. Engage with framework maintainers to ensure code changes are aligned with requirements and integrated upstream.
  4. Optimize deep learning performance on both scale-up (multi-GPU) and scale-out (multi-node) systems.
  5. Leverage advanced compiler technologies to improve deep learning performance.

Skills

Required

  • C++ development
  • Linux environments
  • Python
  • debugging
  • performance tuning
  • test design
  • large-scale workloads on heterogeneous computing clusters
  • compiler theory
  • LLVM
  • ROCm

Nice to have

  • integrating optimized GPU performance into machine learning and LLM frameworks
  • scaling and throughput
  • text-to-video or image-to-video

What the JD emphasized

  • optimizing deep learning frameworks
  • enhancing GPU kernels
  • training/inference performance
  • multi-GPU and multi-node systems
  • distributed inference
  • RL solutions
  • vLLM
  • SGLang
  • compiler technologies
