Senior AI Compiler Engineer, Algorithms and Code-generation

NVIDIA · Semiconductors · Santa Clara, CA +4 · Remote

NVIDIA is seeking a Senior AI Compiler Engineer to develop compiler optimization algorithms for AI workloads, focusing on delivering leading inference performance on GPUs. The role involves analyzing deep learning networks, optimizing compiler techniques, and working with CUDA and various compiler technologies.

What you'd actually do

  1. Analyze deep learning networks and develop compiler optimization algorithms.
  2. Apply strong CUDA programming skills to analyze and debug performance bottlenecks on GPUs.
  3. Define public APIs, analyze and optimize performance, and design and implement compiler techniques for AI workloads and future NVIDIA GPUs.

Skills

Required

  • Bachelor's, master's, or Ph.D. in Computer Science, Computer Engineering, or a related field, or equivalent experience.
  • 3+ years of relevant work or research experience in performance analysis and compiler optimizations.
  • Experience with compiler technologies (e.g., MLIR, LLVM, XLA, Triton).
  • Excellent C/C++ and Python programming and software design skills, including debugging, performance analysis, and test design.
  • Ability to work independently, define project goals and scope, and lead your own development efforts.
  • Strong interpersonal skills.

Nice to have

  • Proficiency in CPU and/or GPU architecture, especially modern NVIDIA GPUs such as Hopper and Blackwell.
  • Understanding of deep learning models, algorithms, and frameworks such as PyTorch and JAX.
  • GPU kernel authoring and performance analysis using tools such as Nsight Compute.
  • Track record of success in mentoring early-career engineers and interns.
  • Track record of new hardware bring-up.

What the JD emphasized

  • 3+ years of relevant work or research experience in performance analysis and compiler optimizations.
  • Experience with compiler technologies (e.g., MLIR, LLVM, XLA, Triton).
  • Excellent C/C++ and Python programming and software design skills, including debugging, performance analysis, and test design.

Other signals

  • compiler optimization algorithms for AI workloads
  • inference performance
  • GPU architecture