Research Engineer, Performance RL

Anthropic · AI Frontier · San Francisco, CA · AI Research & Engineering

This Research Engineer role focuses on reinforcement learning (RL) for code generation and accelerator performance, with the goal of improving model reasoning and coding capabilities. It involves inventing RL environments, running experiments, shaping the research roadmap, and delivering work into training runs, with a strong emphasis on collaboration and on scaling research innovations.

What you'd actually do

  1. Invent, design and implement RL environments and evaluations.
  2. Conduct experiments and shape our research roadmap.
  3. Deliver your work into training runs.
  4. Collaborate with other researchers, engineers, and performance engineering specialists inside and outside Anthropic.

Skills

Required

  • Expertise with accelerators (CUDA, ROCm, Triton, Pallas)
  • ML framework programming (JAX or PyTorch)
  • Experience across the stack – kernels, model code, distributed systems
  • Ability to balance research exploration with engineering implementation
  • Experience with reinforcement learning
  • Experience porting ML workloads between different types of accelerators
  • Familiarity with LLM training methodologies

What the JD emphasized

  • accelerator performance
  • RL environments and evaluations
  • research roadmap
  • training runs
  • accelerators (CUDA, ROCm, Triton, Pallas)
  • ML framework programming (JAX or PyTorch)
  • kernels, model code, distributed systems
  • research exploration with engineering implementation
  • reinforcement learning
  • porting ML workloads between different types of accelerators
  • LLM training methodologies

Other signals

  • Develop systems that enable models to use computers effectively
  • Advance code generation through reinforcement learning
  • Pioneer fundamental RL research for large language models
  • Build scalable RL infrastructure and training methodologies
  • Enhance model reasoning capabilities