Senior Data Center Connectivity Engineer

NVIDIA · Semiconductors · CA +5 · Remote

This role focuses on the physical infrastructure and connectivity for large-scale AI data centers, translating network designs into physical builds and optimizing cabling, pathways, and rack layouts for AI deployments. It involves co-designing with power, cooling, and software teams, and selecting hardware for AI training and inference clusters.

What you'd actually do

  1. Own the development of connectivity reference designs based on requirements from cluster architecture, network engineering, infrastructure software and product hardware teams.
  2. Build and develop comprehensive documentation, including detailed rack elevations, network architecture diagrams, and point-to-point cabling lists. Support projects throughout the design and deployment phases.
  3. Serve as the primary point of engineering support, collaborating closely with deployment and field teams to ensure successful cluster build-out and operation.
  4. Strategically co-design the cluster with power and cooling infrastructure teams, ensuring a thorough understanding of all facility requirements (architectural, power, cooling).
  5. Work with hardware, network, and security teams to translate software stack requirements into physical requirements: hardware selection, fault domains, and network architecture.

Skills

Required

  • Connectivity
  • Network architecture
  • Engineering
  • Hyperscale Cloud Provider experience
  • Large-scale enterprise data center experience
  • High-Performance Computing (HPC) environment experience
  • Designing, deploying, and operating network fabrics for thousands of GPU/CPU nodes
  • High-speed interconnect technologies (InfiniBand, RoCE, RDMA)
  • Connectivity solutions for high-density GPU clusters
  • Data center infrastructure (rack power/cooling, cable management, physical density constraints)
  • Leading multidisciplinary teams
  • Sophisticated technical initiatives

Nice to have

  • NVIDIA's compute and network product families and deployment standards
  • Network engineering
  • MEP systems
  • Infrastructure as a Service software layer
  • Field deployments
  • Global reference design documentation

What the JD emphasized

  • 12+ years in a connectivity, network architecture, or engineering role within a Hyperscale Cloud Provider, large-scale enterprise data center, or High-Performance Computing (HPC) environment.
  • Consistent record of designing, deploying, and operating network fabrics for thousands of GPU/CPU nodes.
  • Deep expertise in high-speed interconnect technologies, including InfiniBand, RoCE, and RDMA.
  • Proven experience designing connectivity solutions for high-density GPU clusters (100kW+ per rack) and understanding the unique front-end and back-end requirements for AI training vs. inference.
  • Deep understanding of data center infrastructure, including rack power/cooling, cable management, and physical density constraints.