Senior Applied Deep Learning Research Scientist, Efficiency

NVIDIA · Semiconductors · Santa Clara, CA +1

Research Scientist at NVIDIA focused on making deep learning models more efficient through techniques like quantization, sparsity, and optimized architectures. The role involves researching low-bit representations, pruning, and developing new algorithms for both training and inference, with a focus on understanding the root causes of efficiency gains and losses. The work directly influences next-generation hardware and state-of-the-art models, with opportunities for open-sourcing or publishing findings.

What you'd actually do

  1. Research low-bit number representations and pruning, and their effect on neural network inference and training accuracy. This includes the requirements of existing state-of-the-art neural networks, as well as co-design of future neural network architectures and optimizers.
  2. Innovate with new algorithms to make deep learning more efficient while retaining accuracy, and open-source or publish these algorithms for the world to use.
  3. Run large-scale deep learning experiments to prove out ideas and analyze the effects of efficiency improvements.
  4. Collaborate across the company with teams making the hardware, software and deep learning architectures.

Skills

Required

  • PhD in AI, computer science, computer engineering, math or a related field, or equivalent experience
  • 5+ years of relevant industrial research experience
  • Familiarity with state-of-the-art neural network architectures, optimizers and LLM training
  • Experience with modern DL training frameworks and/or inference engines
  • Fluency in Python, and solid coding/software-engineering practices

Nice to have

  • Experience in quantization, pruning, numerics and efficient architectures
  • A background in computer architecture
  • Experience with GPU computing, kernels, CUDA programming and/or performance analysis

What the JD emphasized

  • proven track record of publications
  • run large-scale experiments
  • strong interest in neural network efficiency

Other signals

  • optimizing neural networks for training and deployment
  • co-design of future neural network architectures
  • develop new technologies for efficiency
  • make deep learning faster and consume less energy