Senior System Software Engineer, Performance - CUDA Driver

NVIDIA · Semiconductors · Santa Clara, CA

Senior System Software Engineer focused on the CUDA driver and runtime for GPU acceleration. The role involves analyzing application performance, identifying software and hardware bottlenecks, and developing features and optimizations for NVIDIA hardware across computational workloads such as deep learning and scientific computation. Responsibilities include evangelizing, architecting, and implementing new features; analyzing full-stack performance; defining API improvements; and creating system software optimizations.

What you'd actually do

  1. Evangelize, architect, and implement new features
  2. Oversee and drive development efforts across multiple teams
  3. Analyze full-stack performance, from the application level through libraries, system software, kernel software, and hardware
  4. Define forward-looking improvements to the CUDA APIs and programming model
  5. Create novel system software optimizations

Skills

Required

  • BS or MS degree in Computer Science or Electrical Engineering (or equivalent experience)
  • 7+ years of related development experience
  • Strong C programming skills
  • Experience working with large codebases
  • Track record of debugging performance problems in complex environments with software and hardware components
  • Experience with operating system interfaces for threads, process control, and virtual memory
  • Experience writing and debugging multithreaded programs
  • Deep understanding of technology
  • Strong collaborative and interpersonal skills
  • Proven ability to effectively guide and influence within a dynamic matrix environment
  • Good written communication

Nice to have

  • Understanding of system-level architecture, such as interconnects, memory hierarchy, interrupts, and memory-mapped I/O
  • Experience with performance tuning of device drivers or low-level system software
  • Experience with performance optimizations across a variety of CPU architectures, such as x86, POWER, and ARM
  • Knowledge of memory coherence and consistency models
  • Experience with Windows, Linux, or macOS driver development

What the JD emphasized

  • 7+ years of related development experience
  • Strong C programming skills
  • Track record of debugging performance problems in complex environments with software and hardware components
  • Experience with operating system interfaces for threads, process control, and virtual memory
  • Experience writing and debugging multithreaded programs