Senior Developer Technology Engineer - … at NVIDIA

What you'd actually do

Work closely with internal engineering and product teams and external app developers on solving local end-to-end AI GPU deployment challenges on the NVIDIA RTX AI platform.

Apply powerful profiling and debugging tools to analyze the most demanding GPU-accelerated end-to-end AI applications and detect insufficient GPU utilization that results in suboptimal runtime performance.

Conduct hands-on training sessions, develop sample code, and host presentations to give guidance on efficient end-to-end AI deployment targeting optimal runtime performance on NVIDIA ARM-based SoCs.

Improve the Windows LLM & GenAI user experience on NVIDIA RTX by working on feature and performance enhancements of open-source software, including but not limited to projects like GGML, Llama.cpp, Ollama, and ONNX Runtime.

Collaborate with GPU driver and architecture teams as well as NVIDIA research to influence next-generation GPU features by providing real-world workflows and giving feedback on partner and customer needs.

Skills

Required

5+ years of professional experience in local GPU deployment, profiling and optimization
C/C++
Python
software design
programming techniques
Windows operating system development experience
open-source LLM and GenAI software experience
CUDA
NVIDIA's Nsight GPU profiling and debugging suite
problem-solving skills
independent and collaborative work
interpersonal and communication skills

Nice to have

GPU-accelerated AI inference driven by NVIDIA APIs, specifically cuDNN, CUTLASS, TensorRT
Vulkan and/or DX12
latest generation GPU architectures
AI deployment on NPUs and ARM architectures

At NVIDIA, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

As a Developer Technology Engineer, you will be at the forefront of innovation, working with leading industry partners and exciting OSS projects to help them adopt groundbreaking advancements in AI and accelerated computing on NVIDIA RTX. This role offers an outstanding opportunity to collaborate with world-class talent and make a significant contribution to the next era of enterprise and consumer AI.

What you'll be doing:

Work closely with internal engineering and product teams and external app developers on solving local end-to-end AI GPU deployment challenges on the NVIDIA RTX AI platform.
Apply powerful profiling and debugging tools to analyze the most demanding GPU-accelerated end-to-end AI applications and detect insufficient GPU utilization that results in suboptimal runtime performance.
Conduct hands-on training sessions, develop sample code, and host presentations to give guidance on efficient end-to-end AI deployment targeting optimal runtime performance on NVIDIA ARM-based SoCs.
Improve the Windows LLM & GenAI user experience on NVIDIA RTX by working on feature and performance enhancements of open-source software, including but not limited to projects like GGML, Llama.cpp, Ollama, and ONNX Runtime.
Collaborate with GPU driver and architecture teams as well as NVIDIA research to influence next-generation GPU features by providing real-world workflows and giving feedback on partner and customer needs.
Provide technical leadership and mentorship to junior engineers, encouraging an inclusive and high-performing team environment.

What we need to see:

A proven track record of 5+ years of professional experience in local GPU deployment, profiling and optimization.
Bachelor's or Master's degree or equivalent experience in Computer Science, Engineering, or a related field.
Strong proficiency in C/C++, Python, software design, and programming techniques.
Familiarity with and development experience on the Windows operating system.
Experience working with open-source LLM and GenAI software.
Experience with CUDA and NVIDIA's Nsight GPU profiling and debugging suite.
Some travel is required for conferences and for on-site visits with external partners.
Strong problem-solving skills and the ability to work both independently and collaboratively in a fast-paced environment.
Excellent interpersonal and communication skills and a passion for keeping up with the latest advancements in AI technology.

Ways to stand out from the crowd:

Experience with GPU-accelerated AI inference driven by NVIDIA APIs, specifically cuDNN, CUTLASS, TensorRT.
Proven expert knowledge of Vulkan and/or DX12.
Detailed knowledge of the latest generation GPU architectures.
Experience with AI deployment on NPUs and ARM architectures.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, autonomous and love a challenge, we want to hear from you.