Senior AI Compiler Engineer, MLIR

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

NVIDIA is hiring a Senior AI Compiler Engineer to build an MLIR-based AI compiler for its inference engine, focusing on performance, low memory usage, and usability across data center and edge deployments. The role involves developing graph representations and optimizations, defining APIs, and implementing compiler optimizations and kernel generation for neural networks.

What you'd actually do

  1. Develop MLIR-based graph representations and optimizations for future GPU architectures.
  2. Partner with framework and hardware teams to enable new model patterns and upcoming GPU architectural features.
  3. Define APIs and MLIR dialects, analyze and optimize performance, implement compiler optimizations and kernel generation for neural networks, and contribute to general software engineering work.

Skills

Required

  • Bachelor's, Master's, or Ph.D. in Computer Science, Computer Engineering, a related field, or equivalent experience.
  • 3+ years of relevant work or research experience in performance analysis and compiler optimizations.
  • Experience with compiler technologies such as MLIR, XLA, and LLVM.
  • Excellent C/C++ and Python programming and software design skills, including debugging, performance analysis, and testing.
  • Ability to work independently, define project goals and scope, and lead your own development efforts.
  • Strong interpersonal skills and the ability to thrive in a fast-moving, dynamic, product-oriented team.

Nice to have

  • Understanding of deep learning models, algorithms, and frameworks such as PyTorch and JAX.
  • Experience with GPU kernel generation targeting high performance and fast build times.
  • Proficiency in GPU architecture, with CUDA or OpenCL programming experience.
  • A track record of mentoring early-career engineers and interns.

What the JD emphasized

  • performance
  • fast builds
  • low memory use
  • compiler optimizations
  • kernel generation

Other signals

  • MLIR-based AI compiler
  • inference engine
  • performance optimizations
  • kernel generation