Deep Learning Kernel Software Performance Architect - New College Grad 2026

NVIDIA · Semiconductors · Santa Clara, CA

NVIDIA is seeking a Deep Learning Kernel Software Performance Architect to develop and analyze processor and system architectures that accelerate machine learning and data analytics applications. The role involves debugging deep learning software, developing analysis tools, and collaborating with various NVIDIA teams to optimize performance.

What you'd actually do

  1. Validate and analyze performance of GPU-accelerated system and software architectures that advance the frontier of deep learning performance.
  2. Debug deep learning and data analytics software to identify root causes of performance bottlenecks.
  3. Develop scripts and tools to analyze, visualize, and debug software using analytical models, simulators, and test suites.
  4. Work with the CUDA and AI Compiler teams to pinpoint and resolve performance issues.
  5. Engage AI/ML training and inference performance teams to identify and optimize critical deep learning layers.

Skills

Required

  • software design
  • debugging
  • performance analysis
  • test development
  • parallel programming
  • computer architecture
  • performance debugging
  • Python
  • C
  • C++

Nice to have

  • machine learning fundamentals
  • deep learning fundamentals
  • high-performance, power-efficient designs
  • energy-efficient high-performance computing
  • performance profiling
  • GPU computing
  • parallel programming models
  • analytical performance modeling

What the JD emphasized

  • performance analysis
  • debugging
  • performance bottlenecks
  • deep learning performance

Other signals

  • GPU architecture
  • deep learning performance
  • software performance analysis
  • debugging performance bottlenecks