Site Reliability Engineer, Edge Services

ByteDance · Big Tech · Boston, MA · R&D

Site Reliability Engineer for ByteDance's hybrid Content Distribution Network (CDN) platform, focusing on ensuring stability, reliability, and performance of edge services. Responsibilities include architecting solutions, building automation and monitoring tools, developing operational procedures, and managing large-scale systems including datacenters and global CDNs.

What you'd actually do

  1. Architect and implement solutions that enable both internal and external customers to harness the power of ByteDance’s globally scaled content delivery network.
  2. Build metrics, tools, automations, visualizations and monitors to facilitate the operation and optimization of the edge services.
  3. Develop procedures and workflows that improve efficiency, foster trust, and ensure compliance in operational processes.
  4. Run vulnerability and capacity assessments and develop disaster recovery strategies to ensure high availability of our global CDN services.
  5. Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.

Skills

Required

  • Java
  • C++
  • Go
  • Shell scripting
  • Python scripting
  • CDN performance engineering
  • Solution architecture
  • Site reliability engineering

Nice to have

  • Networking technologies
  • TCP/IP
  • BGP
  • DNS
  • Multi-CDN environments
  • OpenStack
  • Kubernetes
  • Nginx
  • IPVS
  • ELK stack
  • Hadoop
  • CDN technologies