Engineering Manager, Agent Runtime Platform

Anthropic · AI Frontier · New York, NY +1 · Software Engineering - Infrastructure

Engineering Manager for the Agent Runtime Platform, responsible for provisioning and running compute and runtimes for internal agents. The role involves owning technical strategy, managing and prioritizing the team's work, building and shipping the runtime platform and tooling, defining secure agent execution, building reusable primitives, and ensuring infrastructure scalability and reliability. It requires strong experience building and operating large-scale platforms or distributed systems, along with technical management experience, with a focus on agent platforms and security-sensitive infrastructure.

What you'd actually do

  1. Own the technical strategy and roadmap, translating goals into concrete execution
  2. Manage day-to-day execution of the team's work
  3. Prioritize the team's work and manage projects in a highly dynamic, fast-paced environment
  4. Stay hands-on: build and ship the runtime platform and tooling
  5. Define what secure agent execution means at scale, partnering with security teams on sandboxing, isolation, and credential management

Skills

Required

  • 10+ years building and operating large-scale platforms or distributed systems
  • 1+ years of management experience in a technical environment
  • Experience building platforms or agents at scale
  • Experience with container or VM orchestration at scale
  • Excellent communication skills

Nice to have

  • Engineering management experience on top of a strong IC track record
  • Experience with harness engineering
  • Experience with building security-sensitive infrastructure: sandboxing, workload isolation, credential management, or access control
  • Experience with capacity planning and utilization across multiple cloud providers
  • Experience working with researchers, engineers and other functional roles

What the JD emphasized

  • building and operating large-scale platforms or distributed systems
  • building platforms or agents at scale
  • building security-sensitive infrastructure
  • capacity planning and utilization across multiple cloud providers

Other signals

  • building and shipping the runtime platform and tooling
  • defining secure agent execution at scale
  • building primitives that are composable and reusable
  • infrastructure scalability and reliability
  • capacity planning across cloud providers
  • right-sizing
  • operational excellence