Senior System Software Engineer, Kubernetes and KubeVirt

NVIDIA · Semiconductors · Santa Clara, CA +1 · Remote

This role is for a senior software engineer who will design and operate the cloud platform behind NVIDIA's GPU-powered services, with a focus on Kubernetes and KubeVirt infrastructure. It involves developing and extending Kubernetes/KubeVirt components to support NVIDIA's cloud offerings and hardware platforms, with opportunities to contribute to upstream communities.

What you'd actually do

  1. Design, implement, and operate cloud platform services that provide GPU‑accelerated IaaS on top of Kubernetes and KubeVirt.
  2. Develop and extend Kubernetes and KubeVirt components (e.g., operators/controllers, CRDs, device plugins) to support GeForce NOW and new NVIDIA hardware platforms.
  3. Drive the underlying technology stack: influence architecture, coding standards, observability, and deployment methodology for high‑scale, high‑availability services.
  4. Collaborate closely with product, hardware, and other engineering teams to deliver new capabilities end‑to‑end, including leading design discussions and aligning engineering leads on architecture and technology choices.
  5. Lead performance tuning, scalability improvements, and pervasive automation across the stack (provisioning, testing, deployment, operations).

Skills

Required

  • Go (GoLang)
  • Kubernetes
  • KubeVirt
  • Distributed systems
  • Cloud infrastructure
  • Containers
  • CI/CD pipelines
  • Production operations
  • Virtualization (KVM/QEMU/libvirt)
  • Container orchestration
  • Load balancing
  • Security
  • Large-scale multi-tenant cloud platforms
  • APIs (REST/gRPC)

Nice to have

  • Upstream contributions to Kubernetes, KubeVirt, or related CNCF/open source projects
  • Kubernetes device plugins or similar integrations for CPU/GPU/accelerator/network hardware
  • AI-assisted development tools

What the JD emphasized

  • 6+ years of hands‑on experience building software and/or scalable cloud services.
  • Significant experience building distributed systems or cloud‑scale services, including well‑designed APIs (e.g., REST/gRPC).
  • Experience with cloud infrastructure: containers, Kubernetes, CI/CD pipelines, and production operations.
  • Proven skills developing in Go (GoLang), including working with Kubernetes/KubeVirt APIs and custom resources.
  • Deep understanding of at least some of these areas: virtualization (KVM/QEMU/libvirt, KubeVirt), container orchestration, distributed systems, load balancing, security, or large‑scale multi‑tenant cloud platforms.