Network Engineer - Operation

ByteDance · Big Tech · San Jose, CA · R&D

Network Engineer role focused on operations and incident response for ByteDance's global network infrastructure, including backbone, data center, and cloud networks. Responsibilities include supporting daily operations, leading troubleshooting during failures, collaborating with other engineering teams, and conducting root cause analysis.

What you'd actually do

  1. Act as a key engineer responsible for ByteDance's global network incident response and operational stability across backbone, data center, public cloud, edge/CDN, and network gateways.
  2. Support day-to-day operations of the non-CN global network infrastructure and serve as a responder for network incidents and emergencies, leading troubleshooting, mitigation, and service restoration during network failures.
  3. Partner with the 24/7 NOC team as a second-line (L2) operations engineer, collaborating with engineers across multiple time zones to keep global network operations running around the clock.
  4. Collaborate with network architecture, deployment, and software engineering teams to improve operational tooling, monitoring platforms, and automation systems.
  5. Coordinate with ISPs, vendors, and internal engineering teams to diagnose and resolve network issues, conduct post-incident root cause analysis (RCA), and drive remediation to prevent recurrence.

Skills

Required

  • network operations
  • network support
  • network engineering
  • TCP/IP
  • DHCP
  • BGP
  • OSPF/IS-IS
  • MPLS
  • troubleshooting
  • resolve network incidents
  • communication skills
  • collaborate across global teams

Nice to have

  • large-scale data center networks
  • cloud infrastructure
  • OTT platforms
  • ISP environments
  • incident management
  • operational process optimization
  • network troubleshooting
  • network vendors
  • ISPs
  • service providers
  • network automation tools
  • Python
  • Shell

What the JD emphasized

  • network incident response
  • operational stability
  • troubleshooting
  • network incidents
  • network failures
  • network issues