Senior Support Engineer - San Francisco

OpenAI · AI Frontier · San Francisco, CA · User Operations

This role focuses on providing technical support and guidance to enterprise customers using OpenAI's API platform. The Senior Support Engineer will be responsible for resolving complex issues, designing operational processes for monitoring and incident response, and leveraging AI technologies to scale support operations. The role acts as a last line of defense before the core engineering team and requires strong troubleshooting, monitoring, and incident management skills.

What you'd actually do

  1. Be among the foremost technical and troubleshooting experts for our API platform at OpenAI. You are the last line of defense before the core Engineering team.
  2. Proactively identify and implement opportunities to scale support operations by leveraging automation and advancements in AI technologies. Contribute to shaping the future of technical support in an AI-driven era.
  3. Configure and use advanced monitoring and alerting workflows to proactively detect customer-impacting issues in real time.
  4. In partnership with engineering, contribute to reliability reviews and preparedness for new features, launches, or strategic customer requirement updates. Ensure that operational readiness (monitoring, alerting, and fallback plans) is in place for any such changes.
  5. Design and refine incident response processes and documentation across strategic customers, engineering, and support teams.

Skills

Required

  • 8+ years of experience in technical operations roles such as SRE/NOC
  • Designing monitoring systems
  • Resolving production issues
  • Troubleshooting complex technical problems at the systems level
  • Modern monitoring, alerting, and observability practices
  • Hands‑on experience setting up or managing metrics, logging, and tracing for distributed systems
  • Leading incident response for high‑severity outages or service disruptions
  • Real‑time incident coordination
  • Root cause analysis
  • Scripting or software engineering (e.g., Python or similar)
  • Solid understanding of cloud infrastructure and distributed systems fundamentals
  • Working with cloud services, load balancers, databases, and containerized applications
  • Working cross‑functionally
  • Communication skills

Nice to have

  • Bachelor’s degree in Computer Science or a related field
  • Strong software engineering foundation

What the JD emphasized

  • mission-critical solutions
  • complex issues
  • technically difficult issues
  • high difficulty
  • mission-critical environments
  • high-severity outages