Senior Research Engineer - Autonomous Vehicles

NVIDIA · Semiconductors · Santa Clara, CA

Senior Research Engineer at NVIDIA focusing on AI for Autonomous Vehicles. The role involves developing large-scale training frameworks for multimodal foundation models, optimizing GPU utilization, implementing data loaders, building simulation infrastructure, integrating new architectures, developing sim-to-real pipelines, combining LLMs with policy learning, and applying RL for fine-tuning LLMs. Requires expertise in deep learning, reinforcement learning, generative modeling, distributed training systems, and GPU acceleration.

What you'd actually do

Develop large-scale supervised learning and reinforcement learning training frameworks, capable of running on thousands of GPUs, to support multimodal foundation models for AVs;
Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets;
Implement scalable data loaders and preprocessors tailored for multimodal datasets, including videos, text, and sensor data;
Build and optimize simulation infrastructure (based on GPU-accelerated simulators) to support the training of driving policies for AVs at scale;
Collaborate with researchers to integrate cutting-edge model architectures into scalable training pipelines.

Skills

Required

Deep learning
Reinforcement learning
Generative modeling
Software engineering
Large-scale model training
Distributed training systems
PyTorch, JAX, or TensorFlow
PPO, SAC, or Q-learning
Reward shaping, domain randomization, curriculum learning
GPU acceleration
CUDA programming
Kubernetes
Python
C++
HPC environments
Job scheduling/orchestration tools

Nice to have

Multimodal foundation models for AVs
Sim-to-real transfer pipelines
Combining LLMs with policy learning
Fine-tuning multimodal LLMs
SLURM

What the JD emphasized

strong expertise in software engineering and in artificial intelligence topics
strong programming skills
solid track record of training deep learning models at scale
good mathematical foundation to analyze new AI algorithms
AI models for autonomous driving such as agent behavior models, end-to-end AV architectures, AI safety, closed-loop training approaches, and AV foundation models (VLMs, reasoning models, etc.)
publishing at top venues
working with the broader scientific community
Communicating with different teams and domain scientists in different areas is essential
aid fundamental research with the freedom and bandwidth to conduct ground-breaking publishable research
impact products and collaborate with teams that focus on AI products
Develop large-scale supervised learning and reinforcement learning training frameworks
Optimize GPU and cluster utilization for efficient model training and fine-tuning
Implement scalable data loaders and preprocessors tailored for multimodal datasets
Build and optimize simulation infrastructure
Collaborate with researchers to integrate cutting-edge model architectures into scalable training pipelines
Develop sim-to-real transfer pipelines
Apply reinforcement learning to fine-tune multimodal LLMs
Develop robust monitoring and debugging tools
10+ years of full-time industry experience in large-scale MLOps and AI infrastructure
Proven experience designing and optimizing distributed training systems
Deep familiarity with reinforcement learning algorithms
Deep understanding of GPU acceleration, CUDA programming, and cluster management tools
Strong programming skills in Python and a high-performance language such as C++
Strong experience with large-scale GPU clusters, HPC environments, and job scheduling/orchestration tools

● Active

Posted 6mo ago · last seen 1w ago · 163 days open

AI score: 9/10
Stage: Post-train Agent
Compensation: $184k–$357k
Location: Santa Clara, CA
Role: Senior · Researcher
Function: Research
Domain: robotics
Team: Autonomous Vehicles Research
Maturity: Scaling

Skills

Agents & Autonomy

Agentic Systems

Applied ML Domains

Autonomous Driving
Computer Graphics
Healthcare & Biomedical ML

Computer Vision & Multimodal

Computer Vision

Data Engineering

Data Pipelines

Frameworks & Tools

CUDA

General Experience & Skills

Software Engineering

Infrastructure & Systems

Compiler Design
GPU Computing
GPU Kernel Development
Training Infrastructure

LLM & Foundation Models

AI Safety
Chain-of-Thought Reasoning
Foundation Models
Generative AI
Vision-Language Models

Languages

Python

ML Ops & Evaluation

Fine-Tuning
Production ML Systems

ML Techniques

Machine Learning
Model Post-Training
Multimodal Learning
Optimization Methods
Perception
Reinforcement Learning (RL)
Simulation

NLP & Language

Natural Language Processing

Research & Credentials

Published Research

Full job description

We are recruiting top research engineers in the Autonomous Vehicles Research team at NVIDIA with strong expertise in software engineering and in artificial intelligence topics, such as deep learning, reinforcement learning, and generative modeling. You must have strong programming skills, a solid track record of training deep learning models at scale, and a good mathematical foundation to analyze new AI algorithms. We focus on AI models for autonomous driving such as agent behavior models, end-to-end AV architectures, AI safety, closed-loop training approaches, and AV foundation models (VLMs, reasoning models, etc.). We will be publishing at top venues and working with the broader scientific community. Communicating with different teams and domain scientists in different areas is essential.

The position will aid fundamental research with the freedom and bandwidth to conduct ground-breaking publishable research. At the same time, you will also have the opportunity to impact products and collaborate with teams that focus on AI products based on CUDA, physically-based simulation, graphics, natural language processing, autonomous driving, HW optimization, robotics, healthcare, and many more. NVIDIA has an open and nurturing atmosphere for research that encourages collaboration.

What you will be doing:

Develop large-scale supervised learning and reinforcement learning training frameworks, capable of running on thousands of GPUs, to support multimodal foundation models for AVs;
Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets;
Implement scalable data loaders and preprocessors tailored for multimodal datasets, including videos, text, and sensor data;
Build and optimize simulation infrastructure (based on GPU-accelerated simulators) to support the training of driving policies for AVs at scale;
Collaborate with researchers to integrate cutting-edge model architectures into scalable training pipelines;
Develop sim-to-real transfer pipelines and work closely with the AV product team to deploy to real-world cars;
Propose scalable solutions that combine LLMs with policy learning;
Apply reinforcement learning to fine-tune multimodal LLMs;
Develop robust monitoring and debugging tools to ensure the reliability and performance of training workflows on large GPU clusters.

What we need to see:

Bachelor's degree in Computer Science, Robotics, Engineering, or a related field, or equivalent experience.
10+ years of full-time industry experience in large-scale MLOps and AI infrastructure.
Proven experience designing and optimizing distributed training systems with frameworks like PyTorch, JAX, or TensorFlow.
Deep familiarity with reinforcement learning algorithms like PPO, SAC, or Q-learning, including experience tuning hyperparameters and reward functions.
Familiarity with common policy learning techniques like reward shaping, domain randomization, and curriculum learning.
Deep understanding of GPU acceleration, CUDA programming, and cluster management tools like Kubernetes.
Strong programming skills in Python and a high-performance language such as C++ for efficient system development.
Strong experience with large-scale GPU clusters, HPC environments, and job scheduling/orchestration tools (e.g., SLURM, Kubernetes).

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you a creative and autonomous research scientist with a genuine passion for advancing the state of AI? If so, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until January 13, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.