Senior High Performance AI Engineer at NVIDIA

What you'd actually do

Design, build and optimize agentic AI systems for the CUDA ecosystem.

Co-design agentic system solutions with software, hardware and algorithm teams; influence and adopt new capabilities as they become available.

Develop reproducible, high-fidelity evaluation frameworks covering performance, quality and developer productivity.

Collaborate across the AI stack—from hardware through compilers/toolchains, kernels/libraries, frameworks, distributed training, and inference/serving—and with model/agent teams.

Skills

Required

AI systems development
building foundational models, agents or orchestration frameworks
deep learning frameworks
modern inference stacks
C/C++
Python
software engineering fundamentals
GPU programming
performance optimization
CUDA

Nice to have

MS or PhD
optimizing and deploying high-performance models
resource-constrained platforms
benchmark wins or published results
Publications or open-source leadership in deep learning, multi-agent systems, reinforcement learning, or AI systems
contributions to widely used repos or standards

What the JD emphasized

building groundbreaking multi-agent systems for the CUDA ecosystem

innovative agentic runtimes and compiler-integrated orchestration

accelerate agent planning, tool-use, code generation

Strong C/C++ and Python programming skills

Experience with GPU programming and performance optimization (CUDA or equivalent)

Track record building/evaluating deep learning models, coding agents and developer tooling

Demonstrated ability to optimize and deploy high-performance models

Deep expertise in GPU performance optimizations

Publications or open-source leadership in deep learning, multi-agent systems, reinforcement learning, or AI systems

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are looking for an outstanding Senior High Performance AI Engineer to build groundbreaking multi-agent systems for the CUDA ecosystem. We build innovative agentic runtimes and compiler-integrated orchestration that work together with NVIDIA's software stack to provide comprehensive acceleration for modern agent workloads powered by foundational models. As a member of the team, you will develop new agent abstractions, GPU-centric runtimes, and compiler- or runtime-driven system solutions to accelerate agent planning, tool-use, code generation, and other high-impact AI workloads. You will collaborate closely with internal NVIDIA software and hardware teams to push the latest developments into NVIDIA products.

What you'll be doing:

Design, build and optimize agentic AI systems for the CUDA ecosystem.
Co-design agentic system solutions with software, hardware and algorithm teams; influence and adopt new capabilities as they become available.
Develop reproducible, high-fidelity evaluation frameworks covering performance, quality and developer productivity.
Collaborate across the AI stack—from hardware through compilers/toolchains, kernels/libraries, frameworks, distributed training, and inference/serving—and with model/agent teams.

What we need to see:

Bachelor’s degree in Computer Science, Electrical Engineering, or related field (or equivalent experience); MS or PhD preferred.
6+ years of industry or academic experience with AI systems development; exposure to building foundational models, agents or orchestration frameworks; hands-on experience with deep learning frameworks and modern inference stacks.
Strong C/C++ and Python programming skills; solid software engineering fundamentals.
Experience with GPU programming and performance optimization (CUDA or equivalent).

Ways To Stand Out From The Crowd:

Track record building/evaluating deep learning models, coding agents and developer tooling.
Demonstrated ability to optimize and deploy high-performance models, including on resource-constrained platforms.
Deep expertise in GPU performance optimizations, evidenced by benchmark wins or published results.
Publications or open-source leadership in deep learning, multi-agent systems, reinforcement learning, or AI systems; contributions to widely used repos or standards.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to unprecedented growth, our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 26, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.