Senior System Software Engineer - Performance

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

Senior System Software Engineer - Performance role at NVIDIA, focusing on optimizing software for next-generation SoCs (CPUs and CPU+GPU Superchips) in datacenter products. Responsibilities include designing, developing, testing, and optimizing software, reviewing performance bottlenecks, and influencing performance optimizations across NVIDIA's software products and SDKs. Requires a BS/MS degree in a related field, 6+ years of experience in computer architecture or SW development, and strong skills in performance analysis and multicore systems.

What you'd actually do

  1. Design, develop, test, and optimize software for our next-generation SoCs in both pre-silicon and post-silicon phases of execution.
  2. Review architectural performance bottlenecks for various system-wide workloads. Identify HW/SW policies to drive performance and performance/watt leadership.
  3. Using strong communication skills, build and drive architecture documents, analysis documents, and communications about our technology for internal and/or external audiences.
  4. Perform competitive analysis comparing microarchitecture and workload performance metrics on NVIDIA's ARM SoCs against emerging processors from other silicon vendors.
  5. Influence and drive full-stack adoption of performance optimizations and best practices across NVIDIA SW products and OSS SDKs.

Skills

Required

  • BS or MS degree in Computer Engineering, Computer Science, or related degree (or equivalent experience).
  • 6+ years of relevant computer architecture or SW development experience.
  • Proven leadership skills and strong ownership on past projects.
  • Hands-on technical experience and demonstrated excellence in an environment with complex software and hardware designs.
  • Strong understanding of multicore hardware, operating systems design, concurrency, virtual memory, caching, interrupts, device drivers and real-time programming.
  • Strong skills in performance analysis, data analysis and performance optimization.

Nice to have

  • Deep expertise in ARM architecture and SW ecosystem.
  • Proficiency in analyzing, debugging and tuning performance of complex system software stacks.
  • Experience with CPU server system workloads and performance analysis.
  • Familiarity with CUDA programming and/or GPUs.
  • Experience with HPC or large-scale computing environments.

What the JD emphasized

  • Proven leadership skills and strong ownership on past projects.
  • Hands-on technical experience and demonstrated excellence in an environment with complex software and hardware designs.
  • Strong understanding of multicore hardware, operating systems design, concurrency, virtual memory, caching, interrupts, device drivers and real-time programming.
  • Strong skills in performance analysis, data analysis and performance optimization.