Capacity Systems Software Engineer

OpenAI · AI Frontier · San Francisco, CA · Scaling

This role focuses on building software platforms and data systems for OpenAI's Industrial Compute organization, which manages the company's large-scale compute infrastructure. The engineer will design and develop systems for capacity planning, forecasting, optimization, and operational decision-making, connecting functions such as infrastructure delivery, fleet health, and financial planning. The goal is to turn fragmented workflows into scalable systems that support better decisions about compute deployment and resource allocation, which in turn affect product availability and business performance. This is an internal platform and operational systems role; it does not involve directly building AI models.

What you'd actually do

  1. Design and build software systems that serve as the system of record for Industrial Compute planning and operations.
  2. Develop backend services, APIs, workflows, and data platforms that support capacity forecasting, allocation, deployment readiness, and operational planning.
  3. Build applications that connect infrastructure delivery, fleet health, capacity utilization, product demand, and financial planning into a shared operational view.
  4. Build planning and scenario-modeling systems that help leaders understand tradeoffs across capacity, utilization, cost, reliability, launch timing, and business impact.
  5. Create workflow automation and decision-support tooling that improves planning accuracy and reduces operational overhead.

Skills

Required

  • 5+ years of experience in software engineering, platform engineering, infrastructure engineering, or related technical disciplines.
  • Strong programming experience in Python, Go, Java, TypeScript, or similar languages.
  • Experience building distributed systems, backend services, internal platforms, workflow systems, or operational tooling.
  • Experience designing APIs, data pipelines, and integrations across multiple systems.
  • Strong system design and software architecture skills.
  • Experience working with large operational datasets and business-critical workflows.
  • Ability to operate effectively in highly cross-functional environments and translate ambiguous operational challenges into scalable technical solutions.
  • Strong ownership mindset and ability to independently drive complex projects.

Nice to have

  • Experience building planning systems, forecasting platforms, optimization engines, or decision-support tools.
  • Experience with SQL, data warehouses, orchestration frameworks, analytics platforms, and distributed data systems.
  • Experience supporting infrastructure, cloud platforms, data centers, hardware deployment programs, or large-scale operational environments.
  • Familiarity with capacity planning, supply chain systems, financial modeling, or infrastructure operations.
  • Experience replacing spreadsheet-driven workflows with scalable software platforms.
  • Experience building systems that support scenario planning, forecasting, optimization, or resource allocation.

What the JD emphasized

  • Demand for compute is growing faster than traditional planning systems can support.
  • Replace spreadsheet-driven workflows with scalable software systems and platforms.