Research Scientist / Engineer, Agentic Learning (horizons)

Anthropic Anthropic · AI Frontier · AI Research & Engineering

Research Scientist/Engineer focused on developing and implementing novel finetuning techniques for language models to improve alignment properties like moral reasoning, honesty, and character. The role involves creating and maintaining evaluation frameworks, collaborating on production model integration, and automating scaling processes. Requires strong Python skills, ML training/experimentation experience, and analytical skills for interpreting results. Experience with language model finetuning, AI alignment research, and techniques like RLHF is preferred.

What you'd actually do

  1. Develop and implement novel finetuning techniques using synthetic data generation and advanced training pipelines
  2. Use these to train models to have better alignment properties including honesty, character, and harmlessness
  3. Create and maintain evaluation frameworks to measure alignment properties in models
  4. Collaborate across teams to integrate alignment improvements into production models
  5. Develop processes to help automate and scale the work of the team

Skills

Required

  • Python
  • ML model training
  • ML experimentation
  • implementing ML research
  • analytical skills
  • interpreting experimental results
  • ML metrics
  • evaluation frameworks
  • turning research ideas into working code
  • identifying and resolving practical implementation challenges

Nice to have

  • MS/PhD in Computer Science, ML, or related field, or equivalent experience
  • language model finetuning
  • AI alignment research
  • Published work in ML or alignment
  • synthetic data generation
  • RLHF
  • constitutional AI
  • reward modeling
  • designing and implementing novel training approaches
  • model behavior evaluation and improvement

What the JD emphasized

  • novel finetuning techniques
  • alignment properties
  • evaluation frameworks
  • ML model training and experimentation
  • implementing ML research
  • ML metrics and evaluation frameworks
  • turning research ideas into working code
  • language model finetuning
  • AI alignment research
  • synthetic data generation
  • RLHF
  • constitutional AI
  • reward modeling
  • novel training approaches
  • model behavior evaluation and improvement

Other signals

  • Develop and implement novel finetuning techniques
  • train models to have better alignment properties
  • Create and maintain evaluation frameworks to measure alignment properties