Principal Perception Engineer, Obstacle Foundation Models - Autonomous Vehicles

at NVIDIA · Industrial · Santa Clara, CA

A Principal Perception Engineer role on NVIDIA's Autonomous Vehicles team, focused on designing and productizing next-generation 3D obstacle perception stacks using deep learning, transformers, and multi-modal techniques. The role combines technical leadership, hands-on algorithm and model development, production-grade engineering, data strategy, and close collaboration with safety and systems teams for large-scale deployment.

Skills

Required

  • Python
  • C++
  • PyTorch
  • deep learning
  • perception systems
  • data-driven development
  • technical leadership
  • architecture
  • algorithm development
  • production-grade software development

Nice to have

  • autonomous driving
  • robotics
  • CNNs
  • transformers
  • multi-camera input
  • multi-sensor fusion
  • BEV perception pipelines
  • transformer-based 3D perception
  • large-scale pretraining
  • distillation
  • parameter-efficient fine-tuning (e.g., LoRA)
  • self-supervised learning
  • representation learning
  • model-assisted workflows (e.g., active learning, auto-labeling)
  • vision-language models (VLMs)
  • model-in-the-loop tooling
  • embedded and real-time platforms
  • optimization for latency, memory, and compute constraints
  • 3D computer vision fundamentals
  • camera modeling and calibration
  • multi-view geometry
  • 3D representations
  • CUDA development
  • custom CUDA kernels and GPU-accelerated components
  • publication record


Intelligent machines powered by artificial intelligence—computers that can learn, reason, and interact with people—are transforming every industry. GPU-accelerated deep learning provides the foundation for machines to perceive, reason, and solve complex problems. NVIDIA GPUs run deep learning algorithms that simulate aspects of human intelligence, acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world.

We are seeking an exceptional Principal Perception Engineer to lead the design and productization of NVIDIA’s next-generation autonomous driving perception stack. This is a senior individual contributor role with broad technical leadership. You will set the technical direction for 3D obstacle perception, drive cross-functional execution, and mentor other engineers, while remaining deeply hands-on with architecture, algorithms, and implementation, including modern transformer-based, multi-modal, and vision-language techniques where they add real value.

What you’ll be doing:

  • Own the technical vision, architecture, and roadmap for 3D obstacle perception to support end-to-end autonomous driving functionalities, leveraging state-of-the-art CNN and transformer-based architectures where appropriate.
  • Design and develop advanced 3D perception models using multi-camera inputs and/or multi-sensor fusion (camera, radar, lidar) for obstacle detection and tracking, including opportunities to explore BEV and transformer-based 3D perception.
  • Lead the development of efficient, production-grade deep learning models: define objectives, select architectures, guide experimentation, and establish best practices for training and evaluation, using techniques such as large-scale pretraining, distillation, and parameter-efficient fine-tuning (e.g., LoRA).
  • Define and drive KPI frameworks to quantify perception performance; analyze large-scale real and synthetic datasets to identify failure modes and systematically improve accuracy, robustness, and efficiency, incorporating modern approaches like self-supervised and representation learning when beneficial.
  • Lead data strategy for perception: specify data and labeling requirements, prioritize data collection and annotation, and collaborate closely with data and ground-truth teams to maximize impact, including model-assisted workflows (e.g., active learning, auto-labeling, VLMs) and advanced model-in-the-loop tooling.
  • Partner with safety, systems, and software teams to ensure perception solutions meet stringent product requirements for safety, latency, resource usage, and software robustness, and are ready for deployment at scale.
  • Provide technical leadership and mentorship to other engineers, influencing design and implementation across the broader perception and autonomy teams.

What we need to see:

  • 15+ years of hands-on experience developing deep learning–based perception or closely related systems for complex real-world problems, with strong proficiency in frameworks such as PyTorch and a track record of taking models from prototype to production.
  • Demonstrated technical leadership as a senior or principal-level individual contributor: owning features or subsystems end-to-end, setting technical direction, making architectural decisions, and coordinating across teams.
  • Proven experience in data-driven development, including close collaboration with data, labeling, and ground-truth teams on data strategy, labeling quality, and iterative model improvement.
  • Strong programming skills in Python and/or C++, with a history of building reliable, high-performance, production-quality software.
  • Excellent communication and collaboration skills, with the ability to influence, align, and drive consensus across multidisciplinary teams.
  • BS/MS/PhD in Computer Science, Electrical Engineering, or related fields (or equivalent experience).

Ways to stand out from the crowd:

  • Proven track record leading the design and deployment of perception solutions for autonomous driving or robotics using camera-based deep learning at scale.
  • Hands-on experience architecting and deploying DNN-based perception pipelines on embedded or real-time platforms, including optimization for latency, memory, and compute constraints.
  • Experience with modern architectures such as CNNs and transformers, and familiarity with techniques like large-scale pretraining, parameter-efficient fine-tuning (e.g., LoRA), or vision-language models (VLMs).
  • Strong publication record or recognized contributions in deep learning, computer vision, or autonomous systems at leading conferences/journals (e.g., CVPR, ICCV, NeurIPS, IROS).
  • Deep understanding of 3D computer vision fundamentals, including camera modeling and calibration (intrinsic and extrinsic), multi-view geometry, and 3D representations, ideally with experience applying these concepts in transformer-based 3D or BEV perception pipelines.
  • Experience with CUDA development and optimizing training or inference pipelines through custom CUDA kernels or other GPU-accelerated components.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD to 431,250 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until March 16, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.