High-performance LLM Training Engineer - New College Grad 2026

NVIDIA · Semiconductors · Santa Clara, CA

NVIDIA is seeking an experienced engineer to optimize LLM training workloads on high-performance computing systems, with a focus on optimizing the software stack for training on thousands of GPUs and on influencing future hardware roadmaps. The role involves performance analysis, profiling, and implementation across the deep learning platform stack, from drivers to frameworks, as well as contributions to NVIDIA's MLPerf Training benchmark submissions.

What you'd actually do

  1. Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms.
  2. Understand the big picture of training performance on GPUs, prioritizing and then solving problems across all state-of-the-art neural networks.
  3. Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks.
  4. Build and support NVIDIA submissions to the MLPerf Training benchmark suite.
  5. Implement key DL training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies.

Skills

Required

  • MS in Computer Science, Electrical Engineering or Computer Engineering (or equivalent experience)
  • Strong background in deep learning and neural networks, particularly training
  • A deep background in computer architecture and familiarity with the fundamentals of GPU architecture
  • Proven experience analyzing and tuning application performance, and with processor- and system-level performance modeling
  • Programming skills in C++, Python, and CUDA

What the JD emphasized

  • high-performance LLM training workloads
  • high-performance training on thousands of GPUs
  • deep learning platform stack
  • deep learning framework

Other signals

  • optimizing LLM training workloads
  • high-performance training on thousands of GPUs
  • shaping hardware roadmaps for the next generation of GPUs