Staff + Sr. Software Engineer, Cloud Inference

Anthropic · AI Frontier · San Francisco, CA · Software Engineering - Infrastructure

This role focuses on building and optimizing backend services and infrastructure for serving large language models (LLMs) like Claude across multiple cloud service providers (CSPs). The engineer will be responsible for API integration, intelligent request routing, inference execution, capacity management, and day-to-day operations, ensuring reliability, cost-effectiveness, and performance at massive scale. The role involves cross-functional collaboration with internal teams and CSP partners, CI/CD automation, and analyzing observability data.

What you'd actually do

  1. Design, build, and own backend services and infrastructure that serve Claude across multiple CSPs, accounting for differences in compute hardware, networking, APIs, and operational models
  2. Work cross-functionally with internal inference, product API, systems, and security teams, among others, and with CSP partners to stand up the full serving stack on new cloud platforms, resolve operational issues, and influence provider roadmaps
  3. Build and evolve CI/CD automation systems, including validation and deployment pipelines, that reliably ship new model versions to millions of users across cloud platforms without regressions
  4. Design interfaces and tooling abstractions across CSPs that enable cost-effective inference management, scale across providers, and reduce per-platform complexity
  5. Contribute to capacity planning, autoscaling, and workload routing strategies that match supply with demand and direct requests to the most cost-effective accelerator and region

Skills

Required

  • high-performance, large-scale distributed systems
  • building or operating services on at least one major cloud platform (AWS, GCP, or Azure)
  • Kubernetes
  • Infrastructure as Code
  • container orchestration
  • cross-functional collaboration
  • working with external partners
  • fast learner
  • autonomous
  • ownership of problems end-to-end

Nice to have

  • scaling infrastructure or products across multiple platforms
  • navigating differences in networking, security, privacy, billing, and managed service offerings
  • capacity management
  • cost optimization
  • resource planning at scale across heterogeneous environments
  • multi-region deployments
  • geographic routing
  • global traffic management
  • Python
  • Rust

What the JD emphasized

  • Significant software engineering experience, with a strong background in high-performance, large-scale distributed systems serving millions of users
  • Experience building or operating services on at least one major cloud platform (AWS, GCP, or Azure), with exposure to Kubernetes, Infrastructure as Code, or container orchestration
  • Direct experience working with CSPs to scale infrastructure or products across multiple platforms, navigating differences in networking, security, privacy, billing, and managed service offerings
  • Hands-on experience with capacity management, cost optimization, or resource planning at scale across heterogeneous environments

Other signals

  • Serve Claude across multiple CSPs
  • Scale and optimize Claude
  • Increase the scale at which our services operate
  • Accelerate our ability to reliably launch new frontier models