Senior Manager, Machine Learning Ops Engineering - Automotive

NVIDIA · Semiconductors · Santa Clara, CA

Senior Manager, Machine Learning Ops Engineering - Automotive at NVIDIA, leading the development and operation of large-scale data and ML pipelines for autonomous driving, focusing on data ingestion, processing, and validation for training and evaluation datasets.

What you'd actually do

  1. Lead and grow a high‑performing MLOps engineering group tasked with managing end‑to‑end data pipelines supporting NVIDIA’s autonomous driving technology from levels L2 through L4.
  2. Own the architecture, execution, and operational excellence of large‑scale, cloud‑native pipelines for multimodal sensor data ingestion, processing, labeling, and validation.
  3. Drive the development of robust, scalable, and observable MLOps systems that support model training, ground truth generation, and continuous evaluation at AV scale.
  4. Partner closely with perception, ML, data labeling, infrastructure, and product teams to translate customer and program requirements into reliable production systems.
  5. Define technical vision, roadmap, success metrics, and operational benchmarks, and ensure consistent execution against program milestones.

Skills

Required

  • Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a closely related field (or equivalent experience).
  • 10+ years of overall engineering experience, including crafting and coordinating production‑grade distributed systems.
  • 5+ years of engineering management experience, with a proven history of guiding teams delivering sophisticated, large‑scale systems.
  • Strong background in MLOps, data pipelines, and cloud‑based distributed systems.
  • Proficiency in Python and C++, with the ability to guide system‑level and performance‑critical build decisions.
  • Experience crafting and operating end‑to‑end data or ML pipelines with high reliability, scale, and observability.
  • Prior experience in one or more of the following domains: Autonomous Vehicles, Robotics, Computer Vision, Deep Learning, or GPU‑accelerated computing.
  • Excellent communication and leadership skills, capable of aligning collaborators and driving execution in a multi-functional organization.
  • Demonstrated passion for ownership, accountability, and engineering that prioritizes customers.

Nice to have

  • Experience developing and leading AV‑scale data platforms handling petabyte‑scale sensor data.
  • Strong background in leading teams responsible for production MLOps or data infrastructure.
  • Experience with automotive or robotic systems, including real‑world sensor data pipelines.
  • Background in distributed cloud systems, workflow orchestration, and large‑scale CI/CD.
  • Familiarity with 3D geometry, perception pipelines, or data generation based on simulated environments.

What the JD emphasized

  • end‑to‑end data and ML pipelines
  • large‑scale
  • cloud‑scale pipelines
  • high‑quality training, evaluation, and validation datasets
  • customer‑focused development
  • scale systems and teams
  • end‑to‑end data pipelines
  • large‑scale, cloud‑native pipelines
  • multimodal sensor data ingestion, processing, labeling, and validation
  • robust, scalable, and observable MLOps systems
  • model training, ground truth generation, and continuous evaluation
  • AV scale
  • customer and program requirements
  • technical vision, roadmap, success metrics, and operational benchmarks
  • customer‑first thinking and ownership
  • measurable value to internal and external AV customers
  • hands‑on technical depth with people leadership
  • technical guidance, mentorship, and career development
  • multiple layers of the stack
  • Python, C++, distributed systems, cloud infrastructure, CI/CD, and data platforms
  • years of overall engineering experience
  • crafting and coordinating production‑grade distributed systems
  • engineering management experience
  • guiding teams delivering sophisticated, large‑scale systems
  • MLOps, data pipelines, and cloud‑based distributed systems
  • Python and C++
  • system‑level and performance‑critical build decisions
  • crafting and operating end‑to‑end data or ML pipelines
  • high reliability, scale, and observability
  • Autonomous Vehicles, Robotics, Computer Vision, Deep Learning, or GPU‑accelerated computing
  • communication and leadership skills
  • aligning collaborators and driving execution
  • multi‑functional organization
  • ownership, accountability, and engineering that prioritizes customers
  • AV‑scale data platforms
  • petabyte‑scale sensor data
  • production MLOps or data infrastructure
  • automotive or robotic systems
  • real‑world sensor data pipelines
  • distributed cloud systems, workflow orchestration, and large‑scale CI/CD
  • 3D geometry, perception pipelines, or data generation based on simulated environments
  • highly competitive compensation
  • extensive benefits plan
  • artificial intelligence
  • autonomous vehicles
  • science‑fiction technologies
  • autonomous driving
  • scaling complex systems
  • teams that deliver real customer impact
  • base salary
  • location, experience, and the pay of employees in similar positions
  • equity and benefits

Other signals

  • lead the build, development, and operation of large‑scale, end‑to‑end data and ML pipelines