Research Scientist - DPU & AI Infra

ByteDance · Big Tech · Seattle, WA · Infrastructure

Research Scientist role focused on designing and developing DPU network software for AI/ML workloads, spanning distributed training and inference acceleration as well as software-hardware co-design.

What you'd actually do

  1. Design and develop DPU network software with a focus on high performance, low latency, and reliability.
  2. Collaborate with hardware teams to build software-hardware co-design solutions for networking and storage acceleration.
  3. Explore AI/ML infrastructure acceleration, leveraging DPUs, GPUs, and custom hardware to optimize distributed training and inference.
  4. Drive end-to-end performance optimization, from OS kernels and drivers to user-space runtime systems.
  5. Contribute to architecture design, technical proposals, and long-term research directions.

Skills

Required

  • C/C++ development and debugging
  • Linux systems development
  • computer architecture, networking, and operating systems
  • software-hardware co-design
  • distributed systems
  • high-performance networking
  • AI/ML systems

Nice to have

  • Ph.D. in related fields with research training and publications
  • network virtualization (OVS, SR-IOV, eBPF)
  • DPDK and high-performance user-space networking
  • hardware acceleration experience (FPGA/ASIC/GPU/CUDA)
  • NCCL collectives, AI communication patterns, and parallelization techniques
  • inference KV-cache systems
  • data preprocessing systems

What the JD emphasized

  • Ph.D. with strong research/publications
  • AI/ML systems
  • AI communication patterns

Other signals

  • AI/ML infrastructure acceleration
  • distributed training and inference
  • DPU, GPU, custom hardware