Facilities Operations Manager

OpenAI · AI Frontier · United States · Remote · Scaling

The Facilities Operations Manager is responsible for the commissioning, operational readiness, and long-term operation of next-generation AI data center campuses. The role involves managing day-to-day operations of critical facility infrastructure, developing operating procedures, leading incident response, and ensuring compliance with safety and operational standards. Although the role supports AI infrastructure, the core craft is facilities operations, not AI/ML development.

What you'd actually do

  1. Lead day-to-day operations of mission-critical facility infrastructure across AI compute campuses.
  2. Own operational readiness activities supporting new campus deployments and infrastructure expansion.
  3. Partner with commissioning teams to transition facilities from construction and startup into steady-state operations.
  4. Develop, implement, and continuously improve operating procedures, maintenance programs, and response plans.
  5. Lead infrastructure incident response efforts and coordinate recovery activities during critical events.

Skills

Required

  • Experience operating mission-critical facilities, data centers, industrial infrastructure, or large-scale technical operations environments
  • Strong knowledge of electrical distribution systems, generators, UPS systems, cooling systems, and building controls
  • Experience supporting commissioning, operational readiness, or infrastructure turnover programs
  • Experience leading facility operations teams, contractors, and third-party vendors
  • Experience responding to incidents and making decisions in high-pressure operational environments
  • Experience developing maintenance strategies, operating procedures, and reliability programs
  • Ability to operate in fast-paced environments with significant ambiguity and rapid growth
  • Effective communication across technical and non-technical stakeholders

Nice to have

  • Experience supporting hyperscale, cloud, AI, HPC, or mission-critical data center environments
  • Experience with liquid cooling systems and high-density compute deployments
  • Familiarity with reliability engineering methodologies, root cause analysis, and preventative maintenance programs
  • Experience supporting large-scale infrastructure deployment programs
  • Experience working across construction, commissioning, engineering, and operations organizations
  • Experience scaling operational processes across multiple campuses or geographic regions

What the JD emphasized

  • Have 8+ years of experience operating mission-critical facilities, data centers, industrial infrastructure, or large-scale technical operations environments.
  • Possess strong knowledge of electrical distribution systems, generators, UPS systems, cooling systems, and building controls.
  • Have experience supporting commissioning, operational readiness, or infrastructure turnover programs.
  • Have led facility operations teams, contractors, and third-party vendors.
  • Are comfortable responding to incidents and making decisions in high-pressure operational environments.
  • Have experience developing maintenance strategies, operating procedures, and reliability programs.
  • Enjoy operating in fast-paced environments with significant ambiguity and rapid growth.