Lead DCGPU Performance Engineer

AMD · Semiconductors · Bangalore, India · Engineering

The Lead DCGPU Performance Engineer at AMD measures and characterizes the performance of Data Center GPU (DCGPU) systems for AI workloads, with a focus on training and inference. The role involves establishing robust measurement methodologies, ensuring reproducibility, executing AI benchmarks, validating data accuracy, and improving tooling infrastructure. It also calls for collaborating with other teams and delivering reliable performance insights to support decision-making.

What you'd actually do

  1. Measure performance of DCGPU systems across AI workloads (training, inference, microbenchmarks)
  2. Define and enforce best practices for reproducible performance measurement
  3. Execute AI workloads (LLMs, training, inference) with well-defined configurations
  4. Cross-check results for anomalies, inconsistencies, and measurement errors
  5. Develop and enhance tools for performance measurement, logging, and reporting

Skills

Required

  • performance measurement
  • benchmarking
  • system characterization
  • AI workloads (training and inference)
  • performance benchmarking methodologies
  • reproducibility practices
  • profiling and measurement tools (rocprof, Nsight, perf, etc.)
  • large-scale AI workloads (LLMs, distributed training, inference serving)
  • system-level performance variability
  • benchmarking challenges
  • Python
  • C/C++
  • scripting for automation
  • analytical mindset
  • attention to detail
  • data validation

Nice to have

  • building or maintaining benchmarking frameworks or infrastructure

What the JD emphasized

  • performance measurement
  • AI workloads
  • training
  • inference
  • reproducible performance measurement
  • benchmarking

Other signals

  • performance measurement
  • AI workloads
  • training and inference
  • reproducibility
  • benchmarking