Compiler Engineer - AI Inference

NVIDIA · Semiconductors · Santa Clara, CA

NVIDIA is seeking an AI Compiler Engineer to optimize kernel generation and computational graph optimizations for AI inference and training workloads on next-generation GPUs. The role involves hands-on development, collaboration on hardware/software co-design, and scaling AI deployments in datacenters.

What you'd actually do

  1. Develop, hands-on, kernel generation and computational graph optimizations for next-generation NVIDIA GPUs.
  2. Solve complex compilation problems for AI workloads (both inference and training) and transition those solutions into enterprise and consumer products.
  3. Partner with leading experts across our software, hardware, and research divisions to architect and co-design future silicon.
  4. Advance and optimize datacenter-scale AI workload deployments.

Skills

Required

  • BS or MS in Computer Science, Computer Engineering, or a related field (or equivalent experience)
  • 3+ years of relevant industry experience specializing in compiler optimizations, synthesis, and placement
  • Demonstrated, hands-on experience working with MLIR
  • Exceptional C/C++ and Python programming and software design skills
  • Rigorous debugging, performance analysis, and test design skills

Nice to have

  • PhD
  • Hands-on experience implementing complex AI workloads on CPU, GPU, and/or custom AI accelerator architectures
  • Deep understanding of Large Language Model (LLM) inference and its implications for computer architecture
  • Demonstrated experience designing and architecting compiler frameworks from the ground up

What the JD emphasized

  • compiler optimizations
  • MLIR
  • LLM inference

Other signals

  • AI inference performance
  • next-generation NVIDIA GPUs
  • AI workloads
  • datacenter-scale AI workload deployments