About the Team

OpenAI’s Stargate and 3P Engineering teams are responsible for building and scaling the external infrastructure ecosystem that powers advanced AI systems. We work across hyperscalers, colocation providers, cloud partners, and strategic third-party operators to turn contracted capacity into production-ready compute.

Our scope spans the full lifecycle of external deployments: commercial alignment, technical readiness, network integration, hardware enablement, operational readiness, and long-range scaling strategy.

As OpenAI’s infrastructure footprint expands globally, we need leaders who can convert complex partner environments into reliable, high-velocity capacity for training and inference workloads.

About the Role

We are seeking a Technical Program Manager, Token-as-a-Service (TaaS), to lead delivery of external compute capacity that directly serves OpenAI model workloads.

In this role, you will own complex cross-functional programs that transform third-party infrastructure into usable tokens at scale. You will partner across engineering, capacity planning, networking, hardware, finance, product, and external providers to ensure that deployed capacity translates into real production throughput.

This role sits at the intersection of infrastructure execution, systems readiness, and business impact. Success requires strong technical fluency, elite program management, and the ability to drive accountability across internal teams and external partners.

This is a high-visibility role with direct impact on OpenAI’s ability to scale model training and inference globally.

This role is based in San Francisco, CA, with a hybrid work model of 3 days in office per week. Relocation assistance is available.

Key Responsibilities

Lead end-to-end delivery programs that convert external infrastructure capacity into production-ready token supply.
Own readiness across compute, storage, networking, security, and operational dependencies for third-party environments.
Build integrated plans across internal engineering teams and external partners with clear milestones, owners, risks, and critical paths.
Drive launch execution for new partner regions, clusters, and capacity expansions.
Create operating mechanisms that measure deployed capacity versus usable token output.
Identify bottlenecks preventing token generation (network constraints, hardware readiness, software enablement, partner delays, etc.) and drive resolution.
Coordinate with capacity planning and finance teams to prioritize the highest-ROI capacity opportunities.
Establish executive-level reporting on delivery status, risks, and token ramp forecasts.
Improve repeatability of partner onboarding, technical integration, and scaling motions.
Manage escalations across internal and external stakeholders during high-severity delivery issues.
Translate ambiguous infrastructure constraints into clear execution plans.
Help define the long-term operating model for Token-as-a-Service across Stargate and 3P ecosystems.

Qualifications

8+ years of Technical Program Management, Engineering Program Management, or Infrastructure Delivery experience.
Experience leading large-scale technical programs involving cloud, data center, networking, hardware, or distributed systems.
Strong understanding of compute infrastructure, clusters, networking, storage, and production systems.
Proven ability to drive cross-functional execution across engineering, operations, finance, and external vendors.
Experience managing executive stakeholders and communicating complex tradeoffs clearly.
Strong analytical skills, with the ability to reason about utilization, throughput, capacity, and operational metrics.
Comfortable operating in ambiguous, fast-scaling environments.
Strong written and verbal communication skills.
High-ownership mentality with a bias toward action.
Experience working with external providers, strategic partners, or hyperscalers is highly preferred.

Preferred Skills

Experience with GPU clusters, AI infrastructure, or large-scale model serving environments.
Familiarity with token economics, inference capacity planning, or workload scheduling.
Experience scaling global infrastructure through third-party providers.
Background in systems engineering, networking, or hardware deployment programs.
Experience building new operational models in high-growth environments.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.