Senior Product Manager, Kubernetes AI Platform and Operational Tools

NVIDIA · Semiconductors · Santa Clara, CA +1

Product Manager for NVIDIA's Kubernetes AI Platform and Operational Tools, focusing on scaling GPU clusters for AI workloads, developing platform technologies, and bringing innovations to market as open-source projects.

What you'd actually do

  1. Identify gaps in how customers and partners operate GPU clusters at scale and develop new projects to address their needs.
  2. Bring internal platform innovations to market as open source software projects, including community strategy, contribution models, and ecosystem engagement.
  3. Own the product roadmap for AI runtime generation, testing, packaging, and publication across cloud partners and deployment targets.
  4. Drive platform-level cluster provisioning and lifecycle management across NVIDIA Cloud Partners and enterprise environments.
  5. Own our self-service cluster operations surface: the APIs, control planes, and automation that let customers provision, upgrade, and run clusters independently.

Skills

Required

  • Product management experience in Kubernetes platform engineering, cloud infrastructure, or GPU-accelerated compute environments
  • Shipping Kubernetes platform products for hardware-aware compute environments
  • Deep understanding of Kubernetes architecture
  • Experience developing or leading open-source projects
  • Defining multi-quarter strategy and leading execution with multiple engineering teams
  • Working with cloud service providers or platform partners

Nice to have

  • Crafting or leading open-source projects that gained meaningful community adoption
  • AI/ML runtime lifecycle management, container image pipelines, or OCI distribution
  • GPU scheduling, topology-aware placement, or multi-tenant GPU cluster management
  • HPC workload orchestration in production environments
  • Shipping platform products that partners or third-party operators depend on
  • Contributions to Kubernetes SIGs, CNCF projects, or GPU-related open-source work

What the JD emphasized

  • 12+ years of product management experience in Kubernetes platform engineering, cloud infrastructure, or GPU-accelerated compute environments.
  • Experience shipping Kubernetes platform products for hardware-aware compute environments.
  • Deep understanding of Kubernetes architecture: API server, scheduler, controller patterns, CRDs, device plugins, and operator frameworks.
  • Experience developing or leading open-source projects in the cloud-native or infrastructure space.
  • Track record defining multi-quarter strategy and leading execution with multiple engineering teams.
  • Experience working with cloud service providers or platform partners in a delivery or enablement capacity.