AI Computing Development Engineer, TensorRT and TensorRT-LLM AIGV

NVIDIA · Semiconductors · Shanghai, China +2

NVIDIA is seeking software engineers to develop and optimize inferencing software (TensorRT/TensorRT-LLM) for AI computing. The role involves performance analysis, tuning, integrating AI advancements, and collaborating across teams to shape machine learning inferencing on NVIDIA platforms. It requires strong programming skills, experience with deep learning frameworks, and a proactive approach.

What you'd actually do

  1. Design and develop robust inferencing software (TensorRT/TensorRT-LLM) optimized for functionality and performance across platforms
  2. Perform performance analysis, optimization, and tuning of deep learning inference workloads
  3. Track academic and industry advancements in AI and update TensorRT/TensorRT-LLM features to integrate them
  4. Provide feedback into architecture and hardware design and development
  5. Collaborate across hardware, software, and research teams to shape the direction of machine learning inferencing across NVIDIA platforms

Skills

Required

  • Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or related computing-focused field (or equivalent experience)
  • Strong Python or C/C++ programming and software design experience, including debugging, performance profiling, and test design
  • 2+ years of working experience
  • Strong curiosity about artificial intelligence and familiarity with the latest developments in deep learning
  • Experience working with deep learning frameworks such as PyTorch, TensorRT/TensorRT-LLM, SGLang or vLLM
  • Proactive, self-driven, and able to work independently
  • Excellent written and verbal communication skills in English
  • Demonstrated ability, commensurate with experience, to take technical ownership, solve complex problems, and contribute effectively in cross-functional environments

Nice to have

  • generative models
  • multimodal systems
  • large neural networks

What the JD emphasized

  • delivery-focused environment is required
  • excellent interpersonal skills are a must
  • Strong curiosity about artificial intelligence and familiarity with the latest developments in deep learning
  • Proactive, self-driven, and able to work independently
  • Excellent written and verbal communication skills in English
  • Demonstrated ability, commensurate with experience, to take technical ownership, solve complex problems, and contribute effectively in cross-functional environments

Other signals

  • inference software
  • performance analysis
  • optimization
  • tuning
  • deep learning frameworks
  • GPU-accelerated AI