Senior Deep Learning Engineer

NVIDIA · Semiconductors · Redmond, WA +1

Senior Deep Learning Engineer at NVIDIA, focused on optimizing inference for next-generation AI workloads, including multi-agent systems and generative multimodal models. The role involves characterizing emerging workloads and developing novel optimization methods across the inference stack, from the algorithmic level to the system level, on NVIDIA hardware. It requires close collaboration with research, framework development, and silicon architecture teams.

What you'd actually do

  1. Keeping up to date with the latest advances in generative AI research.
  2. Analyzing and prototyping emerging workloads in multi-agent AI systems, generative multimodal models, and inference-time compute scaling.
  3. Developing optimizations for these workloads across the inference stack to improve inference quality and speed on NVIDIA systems.
  4. Collaborating closely with production teams to incorporate the latest advancements into cutting-edge software frameworks.

Skills

Required

  • Deep learning
  • Generative models
  • Inferencing
  • PyTorch
  • Software development

Nice to have

  • Published research
  • Agentic AI systems
  • Multimodal generation models
  • Computer architecture
  • Algorithms
  • Performance optimization

What the JD emphasized

  • At least 5 years of relevant software development experience with modern deep learning frameworks such as PyTorch.
  • Published research or noteworthy contributions to the field of deep learning, particularly in areas such as inference-time compute, multimodal generation, and AI systems.
  • Experience with prototyping or deployment of agentic AI systems and/or multimodal generation models.

Other signals

  • optimizing inference for frontier workloads
  • multi-agent AI systems
  • generative multimodal models
  • inference-time compute scaling
  • optimizing inferencing engines, systems, and hardware architectures