Senior GPU System Architect

NVIDIA · Semiconductors · Bangalore, India

NVIDIA is seeking a Senior GPU System Architect to design multi-GPU scale-up and scale-out datacenter systems for AI and HPC. The role involves architecting system topologies, defining interconnects (NVLink, Ethernet), collaborating on RDMA, using system models for analysis, and co-designing hardware-software stacks for efficient AI workload deployment.

What you'd actually do

  1. Architect multi-GPU system topologies for scale-up and scale-out configurations, balancing AI throughput, scalability, and resilience.
  2. Define, modify, and evaluate future architectures for high-speed interconnects such as NVLink and Ethernet, co-designed with the GPU memory system.
  3. Collaborate with other teams to architect RDMA-capable hardware and define transport-layer optimizations for large-scale, GPU-based AI workload deployments.
  4. Use and modify system models, perform simulations and bottleneck analyses to guide design trade-offs.
  5. Work with GPU ASIC, compiler, library and software stack teams to enable efficient hardware-software co-design across compute, memory, and communication layers.

Skills

Required

  • BS/MS/PhD in Electrical Engineering, Computer Engineering, or an equivalent field.
  • 8 years or more of relevant experience in system design and/or ASIC/SoC architecture for GPU, CPU or networking products.
  • Deep understanding of communication interconnect protocols such as NVLink, Ethernet, InfiniBand, CXL and PCIe.
  • Experience with RDMA/RoCE or InfiniBand transport offload architectures.
  • Proven ability to architect multi-GPU/multi-CPU topologies, with awareness of bandwidth scaling, NUMA, memory models, coherency and resilience.
  • Experience with hardware-software interaction, drivers and runtimes, and performance tuning for modern distributed computing systems.
  • Strong analytical and system modeling skills (Python, SystemC, or similar).
  • Excellent cross-functional collaboration skills with silicon, packaging, board, and software teams.

Nice to have

  • Background in system design for AI and HPC.
  • Experience with NIC or DPU architecture and other transport offload engines.
  • Expertise in chiplet interconnect architectures or multi-node fabrics and protocols for distributed computing.
  • Hands-on experience with interposer or 2.5D/3D package co-design.

What the JD emphasized

  • 8 years or more of relevant experience in system design and/or ASIC/SoC architecture for GPU, CPU or networking products.
  • Deep understanding of communication interconnect protocols such as NVLink, Ethernet, InfiniBand, CXL and PCIe.
  • Experience with RDMA/RoCE or InfiniBand transport offload architectures.
  • Proven ability to architect multi-GPU/multi-CPU topologies, with awareness of bandwidth scaling, NUMA, memory models, coherency and resilience.
  • Experience with hardware-software interaction, drivers and runtimes, and performance tuning for modern distributed computing systems.
  • Strong analytical and system modeling skills (Python, SystemC, or similar).

Other signals

  • GPU system architecture for AI
  • multi-GPU scale-up and scale-out systems
  • high-speed interconnects
  • hardware-software co-design