Compute Architecture Software Engineer

NVIDIA · Semiconductors · Shanghai, China

NVIDIA is seeking an LLM Inference Software Engineer to accelerate LLM inference using GPU technology on the TRTLLM project. The role involves developing and optimizing inference software, implementing GPU-based algorithms, and improving performance across diverse computing environments.

What you'd actually do

  1. Develop and optimize software solutions to accelerate LLM inference using GPU technology.
  2. Collaborate closely with a world-class team of engineers to implement and refine GPU-based algorithms.
  3. Analyze and determine the most effective methods to improve performance, ensuring seamless execution across diverse computing environments.
  4. Engage in both individual and team projects, contributing to NVIDIA's mission of leading the AI revolution.
  5. Work in an empowering and inclusive environment to successfully implement groundbreaking AI solutions.

Skills

Required

  • 5+ years of software engineering experience
  • GPU programming
  • LLM inference
  • Python
  • C++
  • CUDA
  • deep learning frameworks and techniques
  • strong problem-solving skills
  • collaboration in a team setting

What the JD emphasized

  • LLM inference
  • GPU programming
  • accelerate LLM inference

Other signals

  • LLM inference acceleration
  • GPU technology
  • TRTLLM project