Software Engineer, ML Networking

Anthropic · AI Frontier · AI Research & Engineering

Software engineer specializing in network infrastructure and optimization for AI accelerators. The role involves building and maintaining the software that interfaces between accelerators and high-speed networks, and it requires deep knowledge of network protocols, kernel and user-space networking, hardware interfacing, and debugging distributed software at the network level.

What you'd actually do

  1. Build a system for accelerator-initiated tensor movement over the network
  2. Benchmark software for a new networking environment
  3. Implement a new collective algorithm to improve latency
  4. Optimize congestion control algorithms for large-scale synchronous workloads
  5. Debug kernel-level network latency spikes

Skills

Required

  • Network protocols and networking concepts
  • Kernel networking (TCP/IP stack internals, XDP, eBPF, io_uring, epoll)
  • User-space networking (DPDK, RDMA, kernel bypass)
  • Systems programming (memory management, lock-free data structures, NUMA-aware programming)
  • Debugging distributed systems

Nice to have

  • ML accelerators and accelerator drivers
  • Designing new network protocols
  • PCIe and drivers for PCIe devices
  • Algorithms used in networking (compression, graph algorithms)
  • Programming on SmartNICs
  • Rust
  • HPC, telecommunications, host networking software, OS/kernel engineering, or embedded systems

What the JD emphasized

  • Expert-level proficiency with network protocols and networking concepts
  • Deep kernel networking: TCP/IP stack internals, XDP, eBPF, io_uring, and epoll
  • User-space networking: DPDK, RDMA, kernel bypass techniques
  • Strong programming skills in a systems programming language, including memory management, lock-free data structures, and NUMA-aware programming