Distinguished Engineer - Rack Scale Architecture

NVIDIA · Semiconductors · Santa Clara, CA +1 · Remote

NVIDIA is seeking a Distinguished Engineer to drive the end-to-end software architecture for its rack-scale products. This role involves maintaining a deep understanding of the product portfolio and roadmap, ensuring high-quality, reliable software, and working directly with major customers and business partners. The engineer will develop roadmaps for new technologies and protocols, mentor teams, and make key technical decisions. The role requires extensive experience in system architecture, networking technology, complex system software for accelerators, and management architectures.

What you'd actually do

  1. Drive the end-to-end software architecture for NVIDIA's rack-scale products.
  2. Maintain deep understanding of the product portfolio and roadmap; translate forward-looking plans into clear, formal software requirements that anchor execution across the organization.
  3. Ensure high-quality, reliable software, serving as a trusted architectural partner to teams that require guidance or oversight.
  4. Work directly with major customers to understand their requirements and align their roadmaps with NVIDIA’s roadmap.
  5. Work with business partners and vendors to shape their products to meet NVIDIA’s needs.

Skills

Required

  • BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.
  • 15+ years of experience in system architecture and design.
  • Deep experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.
  • Strong understanding of networking technologies and protocols (e.g., Ethernet, InfiniBand).
  • Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs.
  • Expertise in out-of-band and in-band management architectures.
  • Knowledge of system management protocols such as Redfish and IPMI.
  • Experience working with platform security experts to define tradeoffs between security and ease of use.
  • Demonstrable experience implementing a shift-left strategy to de-risk program execution.
  • Excellent written and verbal communication skills.

Nice to have

  • Knowledge of large-scale cloud and cluster-level deployment and management systems.
  • Experience designing robust, resilient, and performant scale-up fabrics.
  • Demonstrated track record of leading data center products across the entire lifecycle, spanning inception, pre-silicon development, post-silicon bring-up, manufacturing, and deployment.
  • Familiarity with CXL, UCIe, and other C2C technology architectures.
  • Knowledge of storage and networking technologies.

What the JD emphasized

  • 15+ years of experience in system architecture and design.
  • Deep experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.
  • Strong understanding of networking technologies and protocols (e.g., Ethernet, InfiniBand).
  • Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs.
  • Expertise in out-of-band and in-band management architectures.
  • Knowledge of system management protocols such as Redfish and IPMI.
  • Experience working with platform security experts to define tradeoffs between security and ease of use.
  • Demonstrable experience implementing a shift-left strategy to de-risk program execution.