Senior Linux Kernel Systems Software Engineer – CSP Engagements

NVIDIA · Semiconductors · Santa Clara, CA +3

Senior Software Engineer role focused on system software for data center products (GB200), involving Linux kernel development, device drivers, and system optimizations. The role requires deep expertise in data center server architectures, HPC and AI/ML workloads, and hardware-software co-design, and carries customer-facing responsibilities for enabling cloud service providers (CSPs). It also involves advanced system debugging and performance optimization.

What you'd actually do

  1. Design and develop software solutions for data center servers including Linux kernel modifications, device drivers, and system optimizations for GB200 and next-gen platforms.
  2. Lead hardware bring-up activities, BSP development, and hardware-software co-design for Cloud Service Provider deployments.
  3. Partner directly with CSPs to deliver technical solutions, co-develop and co-debug features and optimizations, and provide support during new product introductions.
  4. Collaborate with cross-functional teams in designing end-to-end solutions spanning firmware, OS, middleware, and applications, with a focus on AI/ML and HPC workloads.
  5. Perform advanced system debugging, root cause analysis, and performance optimization for large-scale data center environments.

Skills

Required

  • C/C++
  • Python
  • Linux kernel development
  • device drivers
  • system debugging
  • performance analysis
  • computer architecture
  • ARM (aarch64)
  • x86 architectures
  • GDB
  • kdump
  • eBPF tracing
  • PCIe virtualization
  • IOMMU
  • NUMA architectures
  • virtualization
  • Kubernetes
  • cloud-native architectures

Nice to have

  • GPU computing (CUDA)
  • deep learning workloads
  • Out-of-band and in-band management architectures
  • Memory fabric
  • CXL architectures

What the JD emphasized

  • Deep expertise in data center server architectures, HPC systems, and hardware-software co-design.
  • Expert knowledge of Linux kernel internals, device drivers, and communication protocols (PCIe, USB, Ethernet).
  • Deep understanding of computer architecture, microprocessor concepts, and expert knowledge of ARM (aarch64) and x86 architectures.
  • Ability to debug kernel crash dumps and all types of lock-up issues. Hands-on experience with GDB, kdump, and eBPF tracing to debug multiprocessor systems. Good understanding of ARM and Intel assembly.
  • Strong understanding of PCIe virtualization and IOMMU.
  • Deep understanding of NUMA architectures including memory topology, processor-memory locality, and performance optimization for multi-CPU systems in data center environments.
  • Strong programming skills in C/C++ and Python, plus experience with virtualization, Kubernetes, and cloud-native architectures.
  • Skilled in complex system-level debugging, performance analysis, and test design.