Senior Developer Technology Engineer - AI

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

A Senior Developer Technology Engineer role focused on researching and optimizing AI/ML workloads for GPU acceleration. The work involves deep analysis, performance tuning, and collaboration with the developer community and internal teams to influence next-generation hardware and software design.

What you'd actually do

  1. Research and develop techniques to GPU-accelerate workloads in deep learning, machine learning, or other AI domains.
  2. Work directly with other technical experts in their fields (industry and academia) to perform in-depth analysis and optimization of complex AI and HPC algorithms to ensure the best possible AI solutions on modern CPU and GPU architectures.
  3. Publish and present discovered optimization techniques in developer blogs or at relevant conferences to engage and educate the developer community.
  4. Influence the design of next-generation hardware architectures, software, and programming models in collaboration with research, hardware, system software, libraries, and tools teams at NVIDIA.

Skills

Required

  • Master's degree in Computer Science, Computer Engineering, or a related computationally focused science (or additional equivalent experience)
  • 8+ years of relevant work experience or research
  • Programming fluency in C/C++
  • Deep understanding of algorithms and software development
  • Parallel programming (CUDA, OpenACC, OpenMP, MPI, pthreads, etc.)
  • Hands-on experience doing low-level performance optimizations
  • In-depth expertise with CPU and GPU architecture fundamentals
  • Good communication and organization skills
  • Logical approach to problem solving
  • Prioritization skills

Nice to have

  • Expertise in parallelization and performance optimization of Deep Learning models arising from Natural Language Processing, Computer Vision, Recommender Systems, etc.
  • Excellent understanding of linear algebra

What the JD emphasized

  • 8+ years of relevant work experience or research
  • Hands-on experience doing low-level performance optimizations
  • In-depth expertise with CPU and GPU architecture fundamentals

Other signals

  • GPU acceleration
  • performance optimization
  • parallel algorithms
  • deep learning
  • machine learning