Data Scientist, Infrastructure

OpenAI · AI Frontier · San Francisco, CA · Data Science

Data Scientist on the Infra team, responsible for scaling the infrastructure that powers OpenAI’s products and research. The role involves defining foundational datasets, developing metrics, building forecasting and optimization models, and establishing dashboards to improve infrastructure usage and efficiency.

What you'd actually do

  1. Build and maintain foundational datasets and metrics that reflect infrastructure usage, efficiency, and scaling.
  2. Develop forecasting and optimization models to support infra planning and resource allocation.
  3. Partner with engineering, research, and product teams to shape infrastructure strategy through data.
  4. Drive clarity with source-of-truth dashboards and analyses that guide infra decisions across OpenAI.

Skills

Required

  • SQL
  • Python
  • Data analysis
  • Metrics definition
  • Forecasting
  • Optimization modeling
  • Infrastructure domains
  • Systems
  • Platform domains

Nice to have

  • NLP
  • Large language models
  • Generative AI
  • Backend systems
  • Simulations
  • Prototyping

What the JD emphasized

  • 5+ years of experience in a quantitative role navigating ambiguous environments, ideally in infrastructure, systems, or platform domains at a high-growth company or research org
  • Experience defining and operationalizing metrics that reflect system performance, resource usage, or efficiency from the ground up
  • A strong foundation in SQL and Python, and a track record of building models and analyses that drive technical and strategic decisions

Other signals

  • scaling infrastructure
  • forecasting and optimization models
  • infrastructure measurement, planning, scaling, allocation, and efficiency