Team Manager, Alignment RL

Anthropic · AI Frontier · AI Research & Engineering

Manager for a team developing and implementing AI alignment techniques, focused on improving model values and behavior on hard-to-evaluate tasks. The role involves driving execution of alignment initiatives, supporting team growth, and ensuring collaboration across research. Key activities include implementing and scaling techniques such as oversight, synthetic data generation, and training models to assist in model training, with the goal of accelerating the deployment of alignment advances into frontier models.

What you'd actually do

  1. Partner with the research lead to develop and execute the team’s roadmap
  2. Build and improve processes for evaluating the effectiveness of the team’s alignment interventions
  3. Coordinate cross-functional collaboration between Alignment Finetuning and partner teams such as T&S, Applied Finetuning, and Alignment Science
  4. Support the development and growth of researchers and engineers working on novel alignment techniques
  5. Drive recruiting efforts to grow the team while maintaining high standards

Skills

Required

  • 5+ years of technical experience in software engineering, ML/AI, or a related field
  • 2+ years of experience managing technical teams
  • Excellent listening and communication skills
  • Ownership of your team's overall output and performance
  • Experience supporting and enabling research teams
  • Ability to build strong relationships across varied stakeholder groups
  • Demonstrated ability to understand and support technical work
  • Deep care about AI safety and alignment

Nice to have

  • Experience with ML/AI projects and an understanding of fundamental concepts
  • Background working with research organizations
  • Experience managing research or exploratory projects
  • Experience with org design and process improvement
  • Experience recruiting for and managing teams through periods of growth
  • Familiarity with reinforcement learning and language models
  • Experience with alignment, reward modeling, or synthetic data

What the JD emphasized

  • alignment techniques
  • model values and behavior
  • hard-to-evaluate tasks
  • oversight
  • synthetic data generation
  • training models to assist in model training
  • accelerate the deployment of alignment advances into frontier models
  • AI safety and alignment
