Data Center Network Engineer

Baseten · Data AI · San Francisco, CA · EPD

Designs and owns high-performance network infrastructure for GPU clusters, focusing on cluster fabric design, cabling architecture, and performance validation to support AI model training and inference.

What you'd actually do

  1. Own end-to-end network architecture for data center clusters.
  2. Define topology, redundancy, and performance characteristics.
  3. Select switches, optics, and cabling systems.
  4. Lead network bring-up, validation, and performance testing.
  5. Partner with hardware and platform teams on system-level performance.

Skills

Required

  • Experience designing and operating data center or HPC networks.
  • Strong familiarity with InfiniBand, RDMA, or high-performance Ethernet.
  • Strong hands-on skills in network configuration, debugging, and performance tuning.
  • Experience owning complex systems end-to-end at a senior level.
  • Experience leading technical projects or cross-functional efforts.

Nice to have

  • Prior leadership or mentoring experience.

What the JD emphasized

  • high-performance network infrastructure
  • GPU clusters
  • cluster fabric design
  • high throughput, low latency, and reliable distributed systems
  • InfiniBand or Ethernet-based architectures
  • network performance for distributed workloads
  • end-to-end network architecture
  • network bring-up, validation, and performance testing
  • system-level performance
  • data center or HPC networks
  • InfiniBand, RDMA, or high-performance Ethernet
  • network configuration, debugging, and performance tuning
  • complex systems end-to-end
  • technical projects or cross-functional efforts