Compute Architecture Software Engineer

NVIDIA · Semiconductors · Shanghai, China

NVIDIA is seeking an LLM Inference Software Engineer to accelerate LLM inference using GPU technology on the TRTLLM project. The role involves developing and optimizing inference software, implementing GPU-based algorithms, and improving performance across diverse computing environments.

What you'd actually do

  1. Develop and optimize software solutions to accelerate LLM inference using GPU technology.
  2. Collaborate closely with a world-class team of engineers to implement and refine GPU-based algorithms.
  3. Analyze and determine the most effective methods to improve performance, ensuring seamless execution across diverse computing environments.
  4. Engage in both individual and team projects, contributing to NVIDIA's mission of leading the AI revolution.
  5. Work in an empowering and inclusive environment to successfully implement groundbreaking AI solutions.

Skills

Required

  • 5+ years of software engineering experience
  • GPU programming
  • LLM inference
  • Python
  • C++
  • CUDA
  • deep learning frameworks and techniques
  • strong problem-solving skills
  • collaboration in a team setting

What the JD emphasized

  • LLM inference
  • GPU programming
  • accelerate LLM inference

Other signals

  • LLM inference acceleration
  • GPU technology
  • TRTLLM project