Deep Learning Performance Software Engineer

NVIDIA · Semiconductors · Shanghai, China

Develops GPU-accelerated deep learning software, including compilers, DSLs, and optimized kernels, for current and next-generation chips, focusing on performance analysis of AI workloads and integration with AI frameworks.

What you'd actually do

  1. Develop compilers and DSLs for deep learning workloads
  2. Design and implement highly optimized deep learning kernels
  3. Continuously improve the compiler architecture for current and next-generation chips
  4. Perform performance analysis on emerging AI workloads and integrate with AI frameworks

Skills

Required

  • C/C++ programming
  • software design skills
  • XLA
  • TVM
  • MLIR
  • LLVM
  • deep learning models
  • deep learning algorithms

Nice to have

  • Master's or Ph.D. degree
  • 3+ years of relevant work experience

What the JD emphasized

  • ability to work in a customer-oriented team is required
  • excellent communication skills are necessary

Other signals

  • GPU-accelerated deep learning software
  • deep learning kernels
  • AI frameworks