Token-as-a-service Technical Program Manager

OpenAI · AI Frontier · San Francisco, CA · Scaling

This role is for a Technical Program Manager focused on delivering external compute capacity for AI model training and inference. The TPM will lead programs to convert third-party infrastructure into usable tokens, managing readiness across compute, storage, networking, and security. They will build integrated plans, drive launch execution, identify bottlenecks, and establish reporting mechanisms. The role requires strong technical fluency, program management skills, and the ability to drive cross-functional execution with internal teams and external partners in a fast-scaling environment.

What you'd actually do

  1. Lead end-to-end delivery programs that convert external infrastructure capacity into production-ready token supply.
  2. Own readiness across compute, storage, networking, security, and operational dependencies for third-party environments.
  3. Build integrated plans across internal engineering teams and external partners with clear milestones, owners, risks, and critical paths.
  4. Drive launch execution for new partner regions, clusters, and capacity expansions.
  5. Create operating mechanisms that measure deployed capacity versus usable token output.

Skills

Required

  • 8+ years of Technical Program Management, Engineering Program Management, or Infrastructure Delivery experience.
  • Experience leading large-scale technical programs involving cloud, data center, networking, hardware, or distributed systems.
  • Strong understanding of compute infrastructure, clusters, networking, storage, and production systems.
  • Proven ability to drive cross-functional execution across engineering, operations, finance, and external vendors.
  • Experience managing executive stakeholders and communicating complex tradeoffs clearly.
  • Strong analytical skills, with the ability to reason about utilization, throughput, capacity, and operational metrics.
  • Comfortable operating in ambiguous, fast-scaling environments.
  • Strong written and verbal communication skills.
  • High-ownership mentality with a bias toward action.

Nice to have

  • Experience with GPU clusters, AI infrastructure, or large-scale model serving environments.
  • Familiarity with token economics, inference capacity planning, or workload scheduling.
  • Experience scaling global infrastructure through third-party providers.
  • Background in systems engineering, networking, or hardware deployment programs.
  • Experience building new operational models in high-growth environments.
  • Experience working with external providers, strategic partners, or hyperscalers (highly preferred).

What the JD emphasized

  • external infrastructure
  • production throughput
  • external partners
  • scaling strategy
  • token supply
  • partner environments
  • internal teams and external partners
  • scale model training and inference globally
  • external infrastructure capacity
  • third-party environments
  • internal engineering teams and external partners
  • partner regions
  • partner onboarding
  • external stakeholders
  • external providers
  • large-scale technical programs
  • distributed systems
  • external vendors
  • fast-scaling environments
  • global infrastructure
  • third-party providers