What you'd actually do

Lead day-to-day operations of mission-critical facility infrastructure across AI compute campuses.

Own operational readiness activities supporting new campus deployments and infrastructure expansion.

Partner with commissioning teams to transition facilities from construction and startup into steady-state operations.

Develop, implement, and continuously improve operating procedures, maintenance programs, and response plans.

Lead infrastructure incident response efforts and coordinate recovery activities during critical events.

Skills

Required

Experience operating mission-critical facilities, data centers, industrial infrastructure, or large-scale technical operations environments
Strong knowledge of electrical distribution systems, generators, UPS systems, cooling systems, and building controls
Experience supporting commissioning, operational readiness, or infrastructure turnover programs
Experience leading facility operations teams, contractors, and third-party vendors
Experience responding to incidents and making decisions in high-pressure operational environments
Experience developing maintenance strategies, operating procedures, and reliability programs
Ability to operate in fast-paced environments with significant ambiguity and rapid growth
Effective communication across technical and non-technical stakeholders

Nice to have

Experience supporting hyperscale, cloud, AI, HPC, or mission-critical data center environments
Experience with liquid cooling systems and high-density compute deployments
Familiarity with reliability engineering methodologies, root cause analysis, and preventative maintenance programs
Experience supporting large-scale infrastructure deployment programs
Experience working across construction, commissioning, engineering, and operations organizations
Experience scaling operational processes across multiple campuses or geographic regions

What the JD emphasized

8+ years of experience operating mission-critical facilities, data centers, industrial infrastructure, or large-scale technical operations environments.

Strong knowledge of electrical distribution systems, generators, UPS systems, cooling systems, and building controls.

Experience supporting commissioning, operational readiness, or infrastructure turnover programs.

Experience leading facility operations teams, contractors, and third-party vendors.

Comfort responding to incidents and making decisions in high-pressure operational environments.

Experience developing maintenance strategies, operating procedures, and reliability programs.

Ability to operate in fast-paced environments with significant ambiguity and rapid growth.

About the Team

OpenAI is helping build the infrastructure that powers the next generation of artificial intelligence. Through Stargate, we are developing and operating large-scale AI compute campuses that require world-class execution across data center design, construction, commissioning, and operations.

The Infrastructure Operations team is responsible for bringing AI infrastructure online and ensuring it operates reliably at scale. We partner closely with hardware, network, deployment, construction, and operations teams to deliver mission-critical environments capable of supporting frontier AI workloads. As our footprint expands, operational excellence becomes increasingly important to ensuring safe, reliable, and efficient campus operations.

About the Role

We are seeking a Facilities Operations Manager to support the commissioning, operational readiness, and long-term operation of next-generation AI data center campuses.

This role sits at the intersection of construction, commissioning, hardware deployment, and facilities operations. You will be responsible for ensuring mission-critical infrastructure is prepared to support hardware deployment, transitioned successfully into production operations, and maintained to the highest standards of reliability and availability.

You will lead day-to-day operational execution across electrical, mechanical, controls, and supporting infrastructure systems while partnering closely with commissioning teams, site operators, vendors, and engineering organizations. This role requires a strong blend of technical depth, operational leadership, and cross-functional execution.

Key Responsibilities

Lead day-to-day operations of mission-critical facility infrastructure across AI compute campuses.
Own operational readiness activities supporting new campus deployments and infrastructure expansion.
Partner with commissioning teams to transition facilities from construction and startup into steady-state operations.
Develop, implement, and continuously improve operating procedures, maintenance programs, and response plans.
Lead infrastructure incident response efforts and coordinate recovery activities during critical events.
Drive root cause analysis investigations and corrective action programs to improve reliability and operational performance.
Manage vendors, contractors, and service providers supporting facility operations.
Partner with hardware deployment, networking, and engineering teams to coordinate infrastructure changes and maintenance activities.
Monitor facility performance, operational risk, and capacity utilization across critical systems.
Support staffing, training, and development of facilities operations personnel.
Ensure compliance with safety, environmental, and operational standards.
Establish operational processes that scale alongside OpenAI's rapidly growing infrastructure footprint.

Qualifications

8+ years of experience operating mission-critical facilities, data centers, industrial infrastructure, or large-scale technical operations environments.
Strong knowledge of electrical distribution systems, generators, UPS systems, cooling systems, and building controls.
Experience supporting commissioning, operational readiness, or infrastructure turnover programs.
Experience leading facility operations teams, contractors, and third-party vendors.
Comfort responding to incidents and making decisions in high-pressure operational environments.
Experience developing maintenance strategies, operating procedures, and reliability programs.
Ability to operate in fast-paced environments with significant ambiguity and rapid growth.
Effective communication across technical and non-technical stakeholders.

Preferred Skills

Experience supporting hyperscale, cloud, AI, HPC, or mission-critical data center environments.
Experience with liquid cooling systems and high-density compute deployments.
Familiarity with reliability engineering methodologies, root cause analysis, and preventative maintenance programs.
Experience supporting large-scale infrastructure deployment programs.
Experience working across construction, commissioning, engineering, and operations organizations.
Experience scaling operational processes across multiple campuses or geographic regions.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.