Senior Product Manager – ROCm & AI/ML Inference Software

AMD · Semiconductors · Santa Clara, CA · Engineering

Senior Product Manager for AMD's ROCm open-source GPU software stack, focusing on large-scale AI/ML model inference. The role involves defining product strategy, engaging with the open-source community, partnering with engineering, and collaborating with customers and partners to optimize inference performance on AMD hardware. Requires deep familiarity with GPU computing, open-source software, and AI inference infrastructure, including frameworks like PyTorch/JAX and serving runtimes like vLLM.

What you'd actually do

  1. Own the product roadmap for ROCm’s inference software capabilities, including integrations with key frameworks like PyTorch and JAX, serving runtimes like vLLM and SGLang, and the libraries and profiling tools that make inference workloads perform well on AMD hardware.
  2. Define and communicate a coherent strategy for how AMD software enables production inference workloads, covering the full range from single-GPU to rack-scale deployments, with a focus on developer experience, performance portability, and competitive standing.
  3. Serve as AMD’s active presence in the open-source AI/ML community: monitor GitHub repositories, Discord servers, developer blogs, and academic papers to track emerging trends, pain points, and opportunities.
  4. Partner with engineering leads across inference libraries, kernel, runtime, and serving integration teams to translate developer and customer needs into prioritized engineering work.
  5. Balance the signal from large customers, who may need specific optimizations or bespoke integrations, against the needs of the broader open-source community, which values standards, portability, and low friction.

Skills

Required

  • Product strategy and roadmap ownership
  • AI/ML inference infrastructure
  • GPU computing
  • Open-source software engagement
  • Technical fluency
  • Community engagement
  • Engineering partnership
  • Customer and partner engagement
  • Deep familiarity with PyTorch, JAX, or Triton
  • Practical understanding of LLM inference
  • Hands-on familiarity with GPU programming (HIP, CUDA, or Triton kernels)
  • Strong written communication skills

Nice to have

  • Experience at a GPU vendor, cloud provider AI infrastructure team, or frontier AI lab
  • Experience contributing to or managing products in an open-source context

What the JD emphasized

  • production inference workloads
  • inference at scale
  • open-source community
  • large language models
  • LLM inference
  • serving performance at scale
  • open-source context

Other signals

  • driving strategy and execution for ROCm, AMD’s open-source GPU software stack
  • focus on large-scale model inference on AMD Instinct™ and Radeon™ hardware
  • intersection of the open-source community, high-performance computing, and production AI deployment
  • customers include some of the organizations deploying and serving the largest language models in the world