Senior Manager, System Software Engineering - Metropolis Accelerated and Inferencing Software

NVIDIA · Semiconductors · Pune, India

Senior Manager for System Software Engineering at NVIDIA, focusing on Metropolis Accelerated and Inferencing Software. The role involves leading engineering teams, driving strategic implementations of inference solutions (TensorRT, vLLM) for edge and enterprise devices, performance benchmarking, and technical leadership in deep learning. It requires extensive experience in machine learning/deep learning, embedded software, GPU/CPU optimization, and multimodal AI systems.

What you'd actually do

  1. Lead, encourage, and develop world-class engineering teams distributed across various India locations.
  2. Drive strategic implementations of TensorRT, vLLM, and other accelerated inference frameworks for edge and enterprise devices: lead accelerated computing efforts and solutions for key Metropolis verticals. Set up Proofs of Readiness (PORs) and guide their implementation.
  3. Performance benchmarking: Orchestrate efforts to achieve leading performance results on industry benchmarks such as MLPerf across edge and enterprise devices.
  4. Technical leadership and influence: Function as a technical leader for deep learning across multiple teams, providing oversight and build support. Apply customer insights to influence the composition and structure of upcoming SoC / GPU deep learning hardware.
  5. Scaling the team: Hire strategically to meet new demands while mentoring existing teams and adapting them to new deep learning challenges.

Skills

Required

  • Master's degree in Computer Science/Electrical Engineering or equivalent experience
  • 8 years of meaningful involvement in machine learning/deep learning, through research or practical experience
  • 6+ years of leadership background
  • 12+ years of industry experience
  • Over 10 years of validated expertise in the embedded software sector
  • Deep knowledge of GPU, CPU, and dedicated deep learning architecture fundamentals
  • Experience with low-level performance optimizations using heterogeneous computing
  • Hands-on experience with Multimedia Frameworks, Computer Vision, VLMs, LLMs, or multimodal AI systems
  • Strong expertise in large-scale data processing, systems build, or machine learning pipelines
  • Strong communication, careful planning, and technical leadership capabilities

Nice to have

  • PhD or equivalent experience in a relevant field
  • Leadership role in production deployment of Smart Spaces, Physical AI
  • Deep understanding of the constraints of, and advancements in, sensing, computing, and model architectures
  • Ability to lead and drive global teams across multiple continents and time zones
  • Deep experience with CV, LLMs, VLMs, GenAI Models, and standards

What the JD emphasized

  • hands-on with deep learning
  • deep experience tuning for NVIDIA GPUs
  • proven record delivering robust, low-latency inference at scale
  • Over 10 years of validated expertise in the embedded software sector
  • Deep knowledge of GPU, CPU, and dedicated deep learning architecture fundamentals, and of low-level performance optimizations

Other signals

  • leading performance results on industry benchmarks
  • low-latency inference at scale
  • driving strategic implementations of inference solutions