AI/LLM Network Software Development Engineer

ByteDance · Big Tech · Seattle, WA · R&D

Develops and optimizes high-speed network infrastructure and communication frameworks specifically for AI/LLM applications, focusing on performance, scalability, and reliability in large-scale data centers.

What you'd actually do

  1. Design, implement, and deploy high-speed network technologies to support AI/LLM applications.
  2. Design and develop platforms and systems for monitoring, analyzing, and diagnosing large-scale AI/LLM networks.
  3. Research and develop high-performance AI communication frameworks and network protocol stacks, and co-design optimizations across host, network, and application to improve the scalability, reliability, and performance of AI/LLM networks.
  4. Build next-generation AI network infrastructure that supports large-scale heterogeneous network hardware with innovative, deployable solutions.

Skills

Required

  • computer networking
  • network programming
  • C++
  • Python
  • Go
  • RDMA
  • congestion control
  • AI network optimization

Nice to have

  • high performance communication frameworks
  • NCCL
  • MPI
  • RPC libraries
  • AI network diagnosis
  • performance optimization

Other signals

  • AI/LLM applications
  • AI/LLM network
  • AI communication framework
  • AI network infrastructure