Senior Compute Kernel Architect, GPU Power

NVIDIA · Semiconductors · Santa Clara, CA

This role focuses on designing and optimizing CUDA kernels that maximize GPU power consumption in order to stress the Power Delivery Network (PDN). It involves collaborating with hardware architects, developing power stress microbenchmarks, and analyzing trade-offs between kernel throughput, power efficiency, and voltage stability. The position requires strong CUDA, C++, and Python programming skills, along with a solid understanding of GPU architecture and PDNs.

What you'd actually do

  1. Design and develop CUDA kernels purpose-built to maximize GPU power consumption, targeting worst-case current draw across compute, memory, and I/O subsystems
  2. Collaborate with hardware power architects to validate PDN assumptions and di/dt specs so that stress kernels target the PDN's weak points
  3. Build and maintain a library of power stress microbenchmarks that sweep power profiles across GPU functional units — tensor cores, memory controllers, I/O interfaces — to stress PDN resonance and droop conditions across GPU families
  4. Analyze trade-offs between kernel throughput, power efficiency, and voltage stability, contributing insights that feed directly into future GPU architecture decisions
  5. Partner across teams — GPU architects, power circuit designers, silicon validation engineers — to ensure power stress methodologies are aligned from pre-silicon simulation through post-silicon bringup

Skills

Required

  • MS or PhD in Computer Science, Electrical Engineering, or Computer Engineering (or equivalent experience)
  • 5+ years of experience in GPU kernel development, CUDA programming, or high-performance computing
  • Strong CUDA and C++ programming skills
  • Experience writing and optimizing kernels at the assembly or PTX level
  • Experience with GPU performance profiling tools (Nsight Compute, Nsight Systems, nvprof, or equivalent)
  • Solid understanding of GPU architecture
  • Working knowledge of Power Delivery Networks (PDNs)
  • Conceptual understanding of di/dt
  • Strong programming skills in Python for scripting, data analysis, and automation
  • Excellent communication skills

Nice to have

  • Hands-on experience writing GPU power stress microbenchmarks
  • Direct experience with post-silicon power characterization
  • Experience with DVFS, AVFS, and noise mitigation features
  • Knowledge of PDN impedance targets

What the JD emphasized

  • 5+ years of experience in GPU kernel development, CUDA programming, or high-performance computing
  • Strong CUDA and C++ programming skills, with hands-on experience writing and optimizing kernels at the assembly or PTX level
  • Solid understanding of GPU architecture
  • Working knowledge of Power Delivery Networks (PDNs)
  • Conceptual understanding of di/dt