Deep Learning Performance Architect

NVIDIA · Semiconductors · Shanghai, China (+1 location)

NVIDIA is seeking Software Engineers to join its Deep Learning Inference team, which develops and optimizes GPU-accelerated deep learning kernels for inference. The role involves performance analysis and tuning, plus collaboration with cross-functional teams on innovative solutions.

What you'd actually do

  1. Develop highly optimized deep learning kernels for inference
  2. Perform performance analysis, optimization, and tuning
  3. Collaborate with cross-functional teams across automotive, image understanding, and speech understanding to develop innovative solutions
  4. Occasionally travel to conferences and customers for technical consultation and training

Skills

Required

  • C/C++ programming
  • Software design
  • Performance modelling
  • Profiling
  • Debugging
  • Code optimization
  • Architectural knowledge of CPUs and GPUs

Nice to have

  • Python
  • Agile software development
  • GPU programming experience (CUDA or OpenCL)
  • Deep learning kernels
  • Image understanding
  • Speech understanding
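To make the "deep learning kernels" bullet concrete: inference kernels are often memory-bound, so a standard optimization is fusing adjacent elementwise ops (e.g. bias-add and ReLU) into one pass over the data. Below is a minimal, illustrative pure-Python sketch of that idea; the function name and shapes are assumptions for illustration only, and a real kernel for this role would be written in CUDA C++.

```python
def fused_bias_relu(x, bias):
    """Fused bias-add + ReLU: one pass over the data instead of two.

    Fusing elementwise ops cuts memory traffic in half for this pair
    (one read and one write per element rather than two of each),
    which matters because such kernels are bandwidth-bound.
    Illustrative sketch only; not NVIDIA's implementation.
    """
    return [max(v + b, 0.0) for v, b in zip(x, bias)]

print(fused_bias_relu([-1.0, 0.5, 2.0], [0.0, 0.0, -3.0]))
# → [0.0, 0.5, 0.0]
```

In a CUDA version, each element would map to one GPU thread, and the same fusion argument applies: one kernel launch and one round-trip through device memory instead of two.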

What the JD emphasized

  • Ability to work in a fast-paced, customer-oriented team is required
  • Excellent communication skills are necessary
  • Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
  • 4 years of relevant work experience

Other signals

  • inference
  • optimization
  • GPU
  • deep learning kernels