What you'd actually do

Develop world-class GPU-accelerated AI inference serving software.

Contribute to feature development and drive broad customer adoption.

Drive the convergence of the Triton Inference Server and NVIDIA Dynamo stacks to establish a unified, high-performance inference platform.

Be an active member of the open source deep learning software engineering community.

Balance a variety of objectives such as building robust software designed to be deployed in production server or cloud environments, optimizing and balancing prediction throughput and latency, and developing and adopting the next generation of inference technologies.

Skills

Required

MS or PhD in Computer Science or relevant field (or equivalent experience).
5+ years of professional experience working on deep learning software.
Excellent Rust & C++ skills, familiarity with Python, and strong programming & software design skills including debugging, performance analysis, and test design.
Experience with high-scale distributed systems and ML systems.
Strong communication skills and ability to work in a fast-paced, agile team environment.

Nice to have

Prior experience with AI frameworks and engines, such as TensorRT, PyTorch, ONNX, OpenVINO, vLLM, or TRT-LLM.
Knowledge of GPU memory management, cache management, or high-performance networking.
Experience with distributed systems programming.
Experience contributing to a large open source project: use of GitHub, bug tracking, branching and merging code, OSS licensing issues, handling patches, etc.

We are looking for a Senior System Software Engineer to work on the Dynamo-Triton Inference Server. NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team. Academic and commercial groups around the world are using GPUs to power a revolution in AI, enabling breakthroughs in problems from image classification to speech recognition to natural language processing. We are a fast-paced team building a highly performant AI inference platform to make the design and deployment of new AI models easier and accessible to all users.

What you'll be doing:

Develop world-class GPU-accelerated AI inference serving software.
Contribute to feature development and drive broad customer adoption.
Drive the convergence of the Triton Inference Server and NVIDIA Dynamo stacks to establish a unified, high-performance inference platform. This platform will ensure feature parity and effectively serve both Large Language Model (LLM) and non-LLM workloads.
Be an active member of the open source deep learning software engineering community.
Balance a variety of objectives such as building robust software designed to be deployed in production server or cloud environments, optimizing and balancing prediction throughput and latency, and developing and adopting the next generation of inference technologies.

What we need to see:

MS or PhD in Computer Science or relevant field (or equivalent experience).
5+ years of professional experience working on deep learning software.
Excellent Rust & C++ skills, familiarity with Python, and strong programming & software design skills including debugging, performance analysis, and test design.
Experience with high-scale distributed systems and ML systems.
Strong communication skills and ability to work in a fast-paced, agile team environment.

Ways to stand out from the crowd:

Prior experience with AI frameworks and engines, such as TensorRT, PyTorch, ONNX, OpenVINO, vLLM, or TRT-LLM.
Knowledge of GPU memory management, cache management, or high-performance networking.
Experience with distributed systems programming.
Experience contributing to a large open source project: use of GitHub, bug tracking, branching and merging code, OSS licensing issues, handling patches, etc.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 1, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.