Senior Network Engineer

Together AI · Data AI · San Francisco, CA · Engineering

Senior Network Engineer responsible for designing, implementing, and maintaining network infrastructure for an AI company's user-facing services and production systems. The role focuses on routing, switching, network security, and protocols, with an emphasis on automation and HPC-based data center networking. It calls for experience with large-scale hybrid data center networks, TCP/IP, BGP, OSPF, VXLAN, EVPN, QoS, and network automation tools (Python, Ansible); proficiency with network troubleshooting tools and Linux environments; and experience with cloud networks (AWS, GCP, Azure) and multi-vendor network devices (Cisco, Arista, Juniper, Mellanox). Knowledge of RoCE, InfiniBand, Docker, Kubernetes, Slurm, and AI training workloads is preferred.

What you'd actually do

  1. Design, deploy, manage, and maintain global multi-vendor, multi-protocol high-performance compute networks.
  2. Analyze data to diagnose network issues, identify their root causes, and minimize downtime.
  3. Evaluate and recommend network technologies, hardware, and software solutions.

Skills

Required

  • TCP/IP networking architecture and technologies such as BGP, OSPF, VXLAN, EVPN, and QoS
  • network automation pipelines using Python and Ansible
  • Wireshark, tcpdump, nmap, MTR, and curl
  • multi-tenant networks
  • Cisco, Arista, Juniper, and Mellanox
  • cloud networks such as AWS, GCP, and Azure
  • Linux environments

Nice to have

  • RoCE and InfiniBand protocols
  • Docker, Kubernetes, or Slurm
  • AI training workloads

What the JD emphasized

  • 8+ years of professional experience building, managing, and supporting large-scale hybrid data center networks (excluding enterprise networks).