Software Engineer, Safeguards Foundations (Internal Tooling)

Anthropic · AI Frontier · London, United Kingdom · Safeguards (Trust & Safety)

Software Engineer focused on building and maintaining internal tooling for AI safety and human review processes at Anthropic. This role involves creating case management, labeling, investigation, and enforcement interfaces for analysts, directly impacting the speed and accuracy of identifying harmful behaviour and feeding signal back into model training. The position emphasizes product-minded development for internal users, backend services, and integration with detection and enforcement systems, with a strong focus on guardrails and metrics.

What you'd actually do

  1. Design, build, and maintain the internal review and enforcement tooling used by Safeguards analysts — including case queues, content review surfaces, decision/audit logging, and account-actioning workflows
  2. Understand user workflows and establish tooling for processes that may be distributed across a number of tools and UIs
  3. Develop the ‘base layer’ of reusable APIs, data storage, and backend services that let new review workflows be stood up quickly and safely
  4. Partner with operations and policy teams to understand reviewer pain points, then translate them into clear product improvements that reduce handling time and decision error
  5. Integrate tooling with upstream detection systems and downstream enforcement infrastructure so that flagged behaviour flows cleanly from signal → human review → action

Skills

Required

  • 4+ years of experience as a software engineer
  • Meaningful time spent building internal tools, operations platforms, or back-office products
  • Comfortable using agentic coding tools (e.g. Claude Code) as a core part of your workflow, directing them to ship well-tested, production-quality software at a high cadence without lowering the bar
  • Product-minded approach to building for internal users
  • Results-oriented, with a bias towards flexibility and impact
  • Communicate clearly with non-engineering stakeholders and explain technical trade-offs to operations and policy partners
  • Care about the societal impacts of your work and want to apply your engineering skills directly to AI safety

Nice to have

  • Built tooling in a trust & safety, content moderation, fraud, integrity, or risk-operations setting
  • Designed case-management or workflow systems (queues, SLAs, escalation paths, audit logs)
  • Worked with sensitive data and understand the associated privacy, access-control, and reviewer-wellbeing considerations
  • Experience with GCP/AWS, Postgres/BigQuery, and CI/CD in a production environment
  • Used LLMs as a building block inside operational tools (e.g. assisted triage, summarisation, or classification in the review loop)

What the JD emphasized

  • internal tooling
  • human review
  • AI safety
  • internal users
  • reviewer pain points
  • detection systems
  • enforcement infrastructure
  • sensitive internal tools
  • reviewer wellbeing
