Anthropic Fellows Program — Reinforcement Learning

Anthropic · AI Frontier · BC +3 · Remote · AI Research & Engineering

This is a research fellowship program focused on Reinforcement Learning (RL) within AI safety. Fellows will work on empirical projects, potentially using external infrastructure, with the goal of producing public outputs like paper submissions. The program emphasizes mentorship from Anthropic researchers and provides a stipend and compute funding. Key activities include building model-based tools for data quality, understanding generalization, and creating RL environments for capabilities and safety tasks.

What you'd actually do

  1. Building model-based tools to better understand AI training data and improve training data quality
  2. Conducting a research project to better understand generalization
  3. Creating RL environments to improve Claude models at capabilities that are within your domain of expertise
  4. Building RL environments for safety-related tasks
  5. Conducting research and implementing solutions in areas such as RL algorithms

Skills

Required

  • Fluent in Python programming
  • Available to work full-time on the Fellows program

Nice to have

  • Strong technical background in computer science, mathematics, or physics
  • Strong software engineering skills with experience building complex ML systems
  • Experience with training

What the JD emphasized

  • public output
  • paper submission
  • Fluent in Python programming
  • Available to work full-time on the Fellows program

Other signals

  • AI safety
  • reinforcement learning
  • research project
  • public output
  • paper submission