Research Engineer, Machine Learning (RL Velocity)

Anthropic · AI Frontier · London, United Kingdom · AI Research & Engineering

The RL Velocity team owns the efficiency and reliability of the RL Science stack, building and improving the core platform for RL training runs to remove bottlenecks and enable faster iteration. This role focuses on ML infrastructure, distributed systems, and research tooling to improve the velocity and reliability of RL training at scale.

What you'd actually do

  1. Build and improve the RL training infrastructure that researchers depend on day-to-day
  2. Identify and remove bottlenecks across the RL stack: debugging, profiling, and rearchitecting where needed
  3. Partner closely with researchers and adjacent engineering teams (inference, sandboxing, and others) to understand their pain points and ship tooling that makes them faster
  4. Own the reliability and performance of research runs end-to-end
  5. Contribute to design decisions that shape how Anthropic does RL at scale

Skills

Required

  • Software engineering fundamentals
  • Building performant and reliable systems
  • ML infrastructure
  • Distributed systems
  • Research tooling
  • Shipping and iterating quickly
  • High agency
  • Low ego

Nice to have

  • Large-scale distributed training (RL, pre-training, or post-training)
  • JAX
  • PyTorch
  • Operating at the edge of research and infra in a fast-moving environment

What the job description emphasized

  • strong software engineering fundamentals
  • track record of building performant, reliable systems
  • ML infrastructure
  • distributed systems
  • research tooling
  • shipping and iterating quickly
  • large-scale distributed training
  • fast-moving environment

Other signals

  • RL training infrastructure
  • research tooling
  • large-scale distributed training
  • ML infrastructure