Engineering Manager, API Core Capabilities

Anthropic · AI Frontier · San Francisco, CA · Engineering & Design - Product

Engineering Manager for Anthropic's API Core team, responsible for the critical request lifecycle preceding every inference call. This role focuses on optimizing service efficiency, scaling throughput, and managing rate-limiting and fairness systems to maximize the Claude API's capacity and reliability. It involves setting technical direction, driving delivery, and collaborating across infrastructure, inference, and product teams.

What you'd actually do

  1. Own all aspects of the API Core team—hiring, performance management, career development, and overall org health.
  2. Lead the technical strategy and delivery roadmap for the API hot path: service efficiency, throughput scaling, and the rate-limiting and acceleration-limit systems that govern access to inference capacity.
  3. Drive multi-quarter initiatives to improve token-path efficiency and reduce per-request overhead, including protocol-level optimizations and architectural changes to the request pipeline, to support the next generation of APIs serving our models.
  4. Partner with Inference and Compute on capacity planning, regional load balancing, and the engineering response to capacity-constrained periods—including translating compute forecasts into concrete API-tier roadmap commitments.
  5. Own the rate-limiting and acceleration-limit subsystems end-to-end: quota models, enforcement, fairness across tiers, and the operational tooling that GTM and Support rely on.

Skills

Required

  • 10+ years of experience managing engineering teams
  • Experience building high-throughput, latency-sensitive backend services or developer platforms
  • Track record of leading teams that have delivered measurable service-efficiency or throughput improvements on systems operating at significant scale
  • Depth in systems-level performance engineering (Rust, Go, profiling, allocator and runtime tuning)
  • Depth in large-scale distributed systems (load balancing, rate limiting, backpressure, request routing)
  • Depth in API platform design
  • Comfort making architectural calls under capacity pressure
  • Strong communication and partnership skills with non-engineering stakeholders
  • Ability to build high-performing teams
  • Ability to operate effectively at the intersection of technical complexity and business urgency

What the JD emphasized

  • high-throughput, latency-sensitive backend services
  • measurable service-efficiency or throughput improvements
  • systems operating at significant scale (millions of QPS, multi-region, capacity-constrained)
  • systems-level performance engineering (Rust, Go, profiling, allocator and runtime tuning)
  • large-scale distributed systems (load balancing, rate limiting, backpressure, request routing)
  • API platform design