Data Center Facilities Manager

ByteDance · Big Tech · Ashburn, VA · Infrastructure

Manage and optimize critical data center infrastructure (power and cooling) for a hyperscale company, ensuring 100% uptime, energy efficiency, and operational excellence. Responsibilities include people leadership, operational governance, vendor management, financial oversight, risk management, and support for deployment and commissioning.

What you'd actually do

  1. Lead, mentor, and develop a high-performing team of data center facility operation engineers and technicians; build a culture of accountability, safety, and continuous improvement.
  2. Be accountable for site uptime and strict adherence to Service Level Agreements (SLAs). Serve as the escalation point for major site incidents.
  3. Manage critical colocation and vendor partnerships, driving Key Performance Indicators (KPIs) and operational governance.
  4. Own the site operational budget (OpEx) and forecast lifecycle capital improvement projects (CapEx).
  5. Serve as the assigned site authority for Critical Environment Work Authorizations (CEWA) and high-risk Methods of Procedure (MOPs).

Skills

Required

  • Bachelor’s Degree in Electrical Engineering, Mechanical Engineering, or a related technical discipline
  • 5+ years of experience in critical infrastructure operations (Data Centers, Semiconductor Fabs, or Power Plants)
  • Deep understanding of data center tiering standards (TIA-942 Rated 3/4, Uptime Tier III/IV), Power Redundancy (2N, N+1 distributed), and Cooling Redundancy (Concurrent Maintainability)
  • Proven track record of managing high-risk change management (MOPs/SOPs/EOPs) and conducting Root Cause Analysis (RCA) for complex electrical/mechanical failures
  • Strong practical knowledge of Megawatt-class Diesel Generators, Static/Rotary UPS, Medium/Low Voltage Switchgear, Chillers, Cooling Towers, CRAH/AHU units, and BMS/EPMS

Nice to have

  • CDCP (Certified Data Center Professional), CDFM (Certified Data Center Facilities Manager), PMP, or equivalent professional engineering licenses
  • Demonstrated experience managing large-scale vendors/colocation providers and controlling OpEx/CapEx budgets
  • Outstanding communication, cross-functional collaboration, and conflict-resolution skills
  • Ability to communicate complex technical incidents clearly to executive leadership
  • 2+ years of experience in a people management, supervisory, or site-lead role within a hyperscale or large-scale colocation environment
  • Ability to thrive in a fast-paced, ambiguous environment and participate in an on-call rotation for emergency escalation

What the JD emphasized

  • 100% uptime
  • strict adherence to Service Level Agreements (SLAs)
  • high-risk Methods of Procedure (MOPs)
  • zero-injury safety culture
  • compliance with global and local environmental, health, and safety (EHS) regulations