Fellow, AI Workload Optimization

AMD · Semiconductors · Bellevue, WA +1 · Engineering

This is a Fellow-level role in AI Software (Workload Optimization) at AMD. The role defines and drives the end-to-end software optimization strategy for AI workloads on AMD hardware, with the goal of industry-leading performance. Responsibilities include leading the profiling, analysis, and tuning of large-scale models, partnering with customers, and collaborating on hardware-software co-design. The role requires deep expertise in AI frameworks, ROCm, and performance optimization techniques.

What you'd actually do

  1. Set the technical vision and roadmap for workload optimization across the AI software stack, ensuring AMD remains the platform of choice for top-tier AI customers.
  2. Lead the profiling, analysis, and tuning of large-scale models (LLMs, Diffusion, Multimodal, and MoE) to ensure "out-of-the-box" performance excellence on AMD hardware.
  3. Partner with top customers and hyperscalers to understand their unique workload requirements and deliver tailored architectural wins and software optimizations.
  4. Collaborate across hardware architecture, compiler, and framework teams to influence future silicon features based on evolving AI workload trends.
  5. Drive the development of advanced tools and frameworks for performance estimation, modeling, and automated reporting.

Skills

Required

  • 15+ years of software development experience, including at least 5 years in a high-level technical leadership role (Fellow or equivalent).
  • Deep expertise in AI Frameworks (PyTorch, JAX, vLLM, SGLang) and the ROCm software stack.
  • Proven history of optimizing distributed inference and training at scale across multi-node/multi-GPU environments.
  • Mastery of performance profiling tools (e.g., PyTorch Profiler, ROCm Profiler, Nsight) and hardware-level performance modeling.
  • Strong understanding of modern model architectures (Transformer, Attention, KV Cache) and optimization techniques like quantization, speculative decoding, and FlashAttention.
  • Demonstrated ability to drive cross-functional initiatives in fast-paced, ambiguous environments.
  • PhD or Master’s degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
  • Demonstrated research or applied experience in AI/ML, including areas such as deep learning, model training/inference optimization, large language models, or computer vision.

Nice to have

  • AI hardware architecture
  • software optimization
  • customer engagement
  • technical leadership
  • compiler optimization
  • framework optimization
  • performance estimation tools
  • performance modeling tools
  • automated reporting tools

What the JD emphasized

  • industry-leading performance
  • top-tier customers
  • critical performance needs
  • out-of-the-box performance excellence
  • top customers and hyperscalers
  • evolving AI workload trends
  • performance estimation, modeling, and automated reporting

Other signals

  • performance optimization
  • customer engagement
  • hardware-software co-design