Technical Program Manager, Safeguards (Infrastructure & Evals)

Anthropic · AI Frontier · San Francisco, CA · Technical Program Management

Technical Program Manager for Safeguards Infrastructure and Evals at Anthropic. This role focuses on owning the operational health, reliability, and forward momentum of AI safety infrastructure, including classifiers, detection pipelines, evaluation platforms, and monitoring systems. Responsibilities include driving incident response, post-mortem execution, establishing and maintaining SLOs with partner teams, maintaining runbook quality, managing platform migrations, and coordinating improvements to the evals platform. Requires technical depth in production ML systems and strong program management skills in operational and infrastructure-heavy environments.

What you'd actually do

  1. Own the Safeguards Engineering ops review - Drive the recurring cadence that keeps the team informed and coordinated: surfacing recent incidents and failures, bringing visibility to reliability trends, and making sure the right people are in the room when decisions need to be made. This is the heartbeat of how Safeguards Eng stays ahead of operational risk.
  2. Drive incident tracking and post-mortem execution - When incidents happen — and in this space, they happen regularly — you'll make sure they get followed through properly. That means tracking incidents across the organization (including those owned by partner teams like Inference), ensuring post-mortems get written, and most critically, making sure the action items that come out of them actually get done. Closing the loop on post-mortem actions is one of the highest-leverage things this role does.
  3. Establish and maintain SLOs with partner teams - Work with Safeguards Engineering teams and key partners — particularly Inference and Cloud Inference — to define service-level objectives for safety-critical pipelines. Then build the tracking and reporting that makes it possible to tell whether those SLOs are being met, and surface it when they're not.
  4. Maintain runbook quality and incident-ownership clarity - Safety-critical systems need clear playbooks for when things go wrong. Partner with engineering leads to keep runbooks accurate, actionable, and up to date — and ensure that ownership of incidents (including for areas like account-banning false positives and CSAM detection) is unambiguous so that nothing falls through the cracks during an active incident.
  5. Drive platform migrations and infrastructure projects - Own the program management for the larger infrastructure work on the roadmap: migrating infrastructure from one platform to another, moving from one incident platform to another, switching from one cloud monitoring system to another, and other migrations as they come. These are cross-team efforts with real dependencies; your job is to keep them sequenced, on track, and connected to the teams that need them.
  6. Coordinate evals platform improvements - Partner with the evals engineering team to drive improvements to the evaluation platform — including self-serve capabilities and the broader eval factory infrastructure. Help scope the work, track dependencies on other Safeguards systems, and make sure the evals platform is keeping pace with the team's needs.

Skills

Required

  • Technical program management experience, particularly in operational or infrastructure-heavy environments
  • Understanding of production ML systems sufficient for incident triage and technical conversations
  • Process building and follow-up for closing operational loops
  • Cross-team coordination and influence
  • Ability to context-switch between operational and project work
  • Experience with or strong interest in AI safety

Nice to have

  • SRE practices
  • Incident management frameworks
  • On-call operations at scale
  • Evaluation infrastructure

What the JD emphasized

  • reliability
  • incident response
  • post-mortem
  • SLOs
  • evals platform
  • infrastructure
