Deep Learning Performance Architect

NVIDIA · Semiconductors · Bangalore, India +1

NVIDIA is seeking a Deep Learning Performance Architect to develop innovative hardware architectures for AI workloads, focusing on parallel computing performance and energy efficiency. The role involves benchmarking, analyzing AI workloads, and developing tools for profiling and debugging parallel applications.

What you'd actually do

  1. Develop innovative HW architectures to extend the state of the art in parallel computing performance and energy efficiency.
  2. Benchmark and analyze AI workloads in single and multi-node configurations.
  3. Develop tools to profile, analyze and debug parallel applications in Python/C++.
  4. Work closely with peer architecture teams and product management to guide product development.
  5. Keep abreast of emerging trends and research in deep learning.

Skills

Required

  • C
  • C++
  • Python

Nice to have

  • GPU computing
  • parallel programming
  • modern transformer-based model architectures
  • architecture simulator development
  • performance modeling
  • profiling
  • analysis

What the JD emphasized

  • B.Tech. or M.Tech. in a relevant discipline (CS, EE, Math).
  • 1+ years of experience in C, C++ and Python.
