Senior System Software Engineer - Neural Graphics Performance

NVIDIA · Semiconductors · Santa Clara, CA

Senior System Software Engineer role focused on optimizing neural graphics performance, specifically Gaussian Splatting and neural reconstruction algorithms, for applications in robotics, healthcare, and autonomous vehicle (AV) development. The role involves implementing and optimizing reconstruction and rendering algorithms in CUDA and Slang, optimizing data processing pipelines, and influencing software architecture for performance.

What you'd actually do

  1. Implement, validate, release and maintain highly-optimized reconstruction and rendering algorithms using CUDA and Slang.
  2. Optimize data processing pipelines for low latency and maximum throughput.
  3. Influence software architecture, validation strategy and technical roadmaps to ensure outstanding performance.

Skills

Required

  • Master of Science in Computer Science or Electrical Engineering (or equivalent experience)
  • 3 years of practical experience
  • Strong fundamentals in real-time computer graphics
  • Expertise in GPU-accelerated software with CUDA, Slang, or other shading languages (GLSL, HLSL, Metal)
  • Expertise defining and driving performance metrics through profiling and benchmarking
  • Proficiency with Python and C++
  • Track record releasing production-grade software
  • Excellent software engineering fundamentals (source control, CI/CD, testing/validation, packaging, containerization, release)
  • Excellent written, visual, and verbal communication

Nice to have

  • Contributions to 3D game engines, graphics or computer vision SDKs
  • Algorithmic expertise in neural reconstruction (NeRFs, Gaussian Splats)
  • Experience developing high-performance distributed systems
  • Grounding in mathematical fundamentals such as linear algebra, numerical methods, statistics, and exploratory data analysis
  • History of multidisciplinary creativity and innovation around performance in multiple problem domains

What the JD emphasized

  • highly-optimized
  • low latency
  • maximum throughput
  • outstanding performance
  • real-time computer graphics
  • low-latency, high-throughput applications
  • performance metrics
  • performance challenges, tradeoffs, and architectural alternatives
  • performance in multiple problem domains

Other signals

  • Gaussian Splatting
  • Neural reconstruction
  • Omniverse NuRec SDK
  • Physical AI