Validation Data Engineer, Verification and Validation - Autonomous Vehicles

NVIDIA · Semiconductors · Shanghai, China

NVIDIA is seeking a Validation Data Engineer for its Autonomous Vehicles team. The role involves building tooling, performing large-scale analysis, and leading data-driven evaluation of vehicle-level behavior and operational design domain (ODD) coverage using real-world and virtual AV driving logs. Responsibilities include implementing evaluation frameworks, data pipelines, and data curation strategies, defining core metrics, and contributing to scalable workflows using cloud platforms and AI. The ideal candidate has 5+ years of experience in data engineering or analytics, strong Python skills, and experience with autonomous vehicle behavior analysis.

What you'd actually do

  1. Perform large‑scale driving behavior and ODD analysis using extensive real‑world and virtual AV driving logs to evaluate safety, comfort, and overall vehicle‑level performance.
  2. Implement and improve evaluation frameworks, data pipelines, and data curation strategies to support robust analysis across thousands of test miles every single day.
  3. Define and compute core metrics that quantify AV performance against target ODDs, powering our product development flywheel, technical reviews, and AV software releases.
  4. Contribute to scalable workflows that use cloud platforms, modern data engineering tools, and AI workflows to surface insights, spot regressions, and enable data‑driven decision making.
  5. Work closely with our Software Product, Testing, and Development teams to turn open-ended safety and performance questions into clear quantitative analyses.

Skills

Required

  • MS or PhD in Computer Science, Mathematics, Statistics, Electrical/Computer Engineering, or a related quantitative field, or equivalent experience.
  • 5+ years of proven experience in data engineering or analytics roles working with large‑scale data.
  • Experience analyzing the behavior of autonomous vehicles, ADAS, or other safety‑critical cyber‑physical systems.
  • Strong Python skills, including writing production‑quality code and libraries for data processing, analysis, and automation.
  • Hands‑on experience building and operating data pipelines in a production environment with cloud computing platforms.
  • Excellent communication and teamwork skills, with a track record of working across teams and presenting your findings to technical collaborators.
  • Ability to create clear dashboards, visualizations, and concise summaries for different audiences.

Nice to have

  • Data science experience.
  • Background in statistics, including experimental design, hypothesis testing, confidence intervals, and explaining results to non‑experts.
  • Experience designing and scaling data and ML/AI pipelines to process and analyze very large telemetry or log datasets.
  • Experience with GPU‑accelerated and/or distributed computing for large‑scale data processing and model evaluation.
  • Familiarity with simulation‑based validation, vehicle‑level testing, and interpreting fleet test or on‑road validation data.
  • Experience contributing to technical direction for a data or analytics team, such as helping define metrics, validation methods, or coding guidelines.

What the JD emphasized

  • large-scale analysis
  • data-driven evaluation
  • evaluation frameworks
  • data pipelines
  • AI workflows
  • data engineering
  • large-scale data
  • data and ML/AI pipelines
  • large-scale data processing

Other signals

  • metrics