Senior Software Engineer, Inference

Anthropic · AI Frontier · Dublin, Ireland · Software Engineering - Infrastructure

Senior Software Engineer on the Inference team, which builds and maintains the systems that serve Claude models to millions of users. The role focuses on maximizing compute efficiency and providing high-performance inference infrastructure for research.

What you'd actually do

  1. Designing intelligent routing algorithms that optimize request distribution across thousands of accelerators
  2. Autoscaling our compute fleet to dynamically match supply with demand across production, research, and experimental workloads
  3. Building production-grade deployment pipelines for releasing new models to millions of users
  4. Integrating new AI accelerator platforms to maintain our hardware-agnostic competitive advantage
  5. Contributing to new inference features (e.g., structured sampling, prompt caching)

Skills

Required

  • Significant software engineering experience
  • Distributed systems
  • Python or Rust

Nice to have

  • High-performance, large-scale distributed systems
  • Implementing and deploying machine learning systems at scale
  • Load balancing, request routing, or traffic management systems
  • LLM inference optimization, batching, and caching strategies
  • Kubernetes and cloud infrastructure (AWS, GCP)

What the JD emphasized

  • critical systems that serve Claude
  • industry's largest compute-agnostic inference deployments
  • maximizing compute efficiency
  • explosive customer growth
  • enabling breakthrough research
  • high-performance inference infrastructure
  • complex, distributed systems challenges
  • multiple accelerator families and emerging AI hardware
  • multiple cloud platforms
  • large-scale distributed systems
  • machine learning systems at scale
  • LLM inference optimization
  • production-grade deployment pipelines
  • new AI accelerator platforms
  • new inference features
  • new model architectures
  • observability data to tune performance
  • real-world production workloads
  • multi-region deployments
  • global customers

Other signals

  • serving Claude to millions of users worldwide