Senior Deep Learning Research Engineer – Multimedia GPU Team

NVIDIA · Semiconductors · Bangalore, India +1

Senior Deep Learning Research Engineer role focused on pioneering generative AI for audio and speech, spanning research, model design, large-scale training, and productized inference on NVIDIA GPUs. The role requires a PhD, extensive experience with generative models, strong Python skills, and a proven track record of shipped products or publications.

What you'd actually do

  1. Lead the research, design, and development of state-of-the-art deep learning models solving complex problems in audio and speech domains.
  2. Define the technical vision and R&D strategy for NVIDIA’s future audio, speech, and multimedia deep learning algorithms, ensuring alignment with hardware advancements.
  3. Train and optimize large-scale generative models (Diffusion, Transformers, Autoregressive models) using distributed training across massive GPU clusters.
  4. Develop advanced deep learning algorithms for speech transformation, spatial audio, audio enhancement, and real-time video/audio enhancement.
  5. Optimize inference models and productize them on NVIDIA GPUs and NVIDIA RTX Spark platforms as SDKs or microservices.

Skills

Required

  • Deep learning
  • Generative AI
  • Audio processing
  • Speech processing
  • Diffusion models
  • Transformers
  • PyTorch
  • Python
  • Software architecture
  • Data structures

Nice to have

  • Large-scale distributed training
  • C++
  • CUDA
  • cuDNN
  • Triton
  • TensorRT
  • World Models
  • Physics-informed neural networks
  • Video synthesis

What the JD emphasized

  • PhD in Computer Science, Artificial Intelligence, Applied Mathematics, or a related quantitative field.
  • 10+ years of industry or post-doc experience directly developing advanced deep learning models for audio, image, and video processing.
  • Deep mastery of generative AI
  • Elite Python coding skills
  • Expert-level hands-on experience with PyTorch
  • Proven Impact: A strong portfolio of shipped high-impact commercial AI products or a stellar publication record at top-tier AI conferences (CVPR, ICCV, SIGGRAPH, NeurIPS, ICASSP, Interspeech).

Other signals

  • Generative AI
  • Audio Foundation Models
  • Large-scale training
  • Multimedia AI