Staff Software Engineer, Inference

Anthropic · AI Frontier · Dublin, Ireland · Software Engineering - Infrastructure

Staff Software Engineer on the Inference team responsible for building and maintaining systems that serve Claude to millions of users. Focuses on maximizing compute efficiency and providing high-performance inference infrastructure for research, tackling complex distributed systems challenges across diverse AI accelerators.

What you'd actually do

  1. Work end to end, identifying and addressing key infrastructure blockers to serve Claude to millions of users while enabling breakthrough AI research.
  2. Design intelligent routing algorithms that optimize request distribution across thousands of accelerators.
  3. Autoscale the compute fleet to dynamically match supply with demand across production, research, and experimental workloads.
  4. Build production-grade deployment pipelines for releasing new models to millions of users.
  5. Integrate new AI accelerator platforms to maintain a hardware-agnostic competitive advantage.

Skills

Required

  • Significant software engineering experience, particularly with distributed systems
  • Performance optimization
  • Distributed systems
  • Large-scale service orchestration
  • Intelligent request routing
  • Python or Rust

Nice to have

  • Familiarity with LLM inference optimization, batching strategies, and multi-accelerator deployments
  • Implementing and deploying machine learning systems at scale
  • Load balancing, request routing, or traffic management systems
  • LLM inference optimization, batching, and caching strategies
  • Kubernetes and cloud infrastructure (AWS, GCP)

What the JD emphasized

  • Maximizing compute efficiency
  • Enabling breakthrough research
  • High-performance inference infrastructure
  • LLM inference optimization
  • Large-scale distributed systems

Other signals

  • Serve Claude to millions of users worldwide