Engineering Manager, Core Services

OpenAI OpenAI · AI Frontier · San Francisco, CA · Applied AI

Engineering Manager for Core Services at OpenAI, leading teams responsible for highly reliable, high-scale distributed systems that are critical to OpenAI products. The role involves managing and growing infrastructure engineers, building and operating production platforms, setting technical direction for platform foundations, and partnering with various stakeholders.

What you'd actually do

  1. Managing and growing a high-performing, team of infrastructure engineers.
  2. Leading teams building and operating large, critical production platforms, including cluster reliability, scaling, and rollout safety.
  3. Building and operating mission-critical distributed systems with strong operational rigor (SLOs, incident response, capacity planning, reliability).
  4. Setting technical direction for platform foundations such as workflow/orchestration capabilities, large-scale file/blob/storage services, and core service foundations.
  5. Partnering with a broad set of stakeholders, including product engineering, adjacent infrastructure teams, and (where relevant) finance/cost partners.

Skills

Required

  • leading teams
  • mission-critical infrastructure
  • production systems
  • distributed systems
  • operational rigor
  • SLOs
  • incident response
  • capacity planning
  • reliability
  • platform systems
  • orchestration/workflow execution
  • service platforms
  • large-scale storage/blob/file infrastructure
  • multi-cloud environments
  • scalable systems
  • ambiguity
  • competing priorities

Nice to have

  • coaching
  • mentoring
  • developing engineers
  • emerging leaders