Senior Compiler Engineer, AI Inference Performance

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

NVIDIA is seeking a Senior Compiler Engineer to optimize AI inference performance on its Deep Learning & AI Compiler (DLC) team. The role involves analyzing deep learning networks, developing compiler optimization algorithms, and collaborating with framework and architecture teams to accelerate next-generation deep learning software for a range of AI applications.

What you'd actually do

  1. Analyze deep learning networks and develop compiler optimization algorithms.
  2. Collaborate with the deep learning software framework teams and the GPU architecture teams to accelerate the next generation of deep learning software.
  3. Define public APIs, optimize and analyze performance, and design and implement compiler techniques for AI workloads and future NVIDIA GPUs.

Skills

Required

  • performance analysis
  • compiler optimizations
  • compiler technologies (MLIR, LLVM, XLA, Triton)
  • C/C++
  • Python
  • software design
  • debugging
  • test design

Nice to have

  • CPU and/or GPU architecture
  • CUDA
  • OpenCL programming
  • deep learning models
  • algorithms
  • frameworks (PyTorch, JAX)
  • GPU kernel authoring
  • Nsight Compute
  • mentoring early-career engineers
  • new hardware bring-up

What the JD emphasized

  • 3+ years of relevant work or research experience in performance analysis and compiler optimizations
  • Experience with compiler technologies (e.g., MLIR, LLVM, XLA, Triton, etc.)
  • Excellent C/C++ and Python programming and software design skills, including debugging, performance analysis, and test design

Other signals

  • compiler optimization algorithms
  • inference performance
  • AI workloads