Research Engineer, Virtual Collaborator

Anthropic · AI Frontier · AI Research & Engineering

Research Engineer role focused on training Claude for virtual collaborator workflows through reinforcement learning, data pipelines, and real organizational data. The work spans designing RL environments, scaling data creation, integrating enterprise data, developing evaluation systems, and training Claude on document manipulation, all aimed at enterprise AI applications.

What you'd actually do

  1. Designing and implementing reinforcement learning pipelines for virtual collaborator use cases (productivity, organizational navigation, vertical domains)
  2. Building and scaling our data creation platform for generating high-quality, open-ended tasks with domain experts and crowdworkers
  3. Integrating real organizational data to create authentic training environments
  4. Developing robust rubric-based evaluation systems that maintain quality while avoiding reward hacking
  5. Training Claude on advanced document manipulation, including understanding, enhancing, and co-creating documents

Skills

Required

  • Python programming
  • machine learning research experience
  • reinforcement learning
  • fine-tuning
  • pragmatic approach to solving real-world problems
  • balancing research rigor with shipping deadlines
  • collaborating across multiple teams

Nice to have

  • human-in-the-loop training systems
  • crowdsourcing platforms
  • enterprise tools and APIs (Google Workspace, Microsoft Office, Slack, etc.)
  • evaluation frameworks for open-ended tasks
  • domain expertise in finance, legal, or healthcare workflows
  • scalable data pipelines with quality control mechanisms
  • reward modeling
  • preventing reward hacking in RL systems
  • translating product requirements into technical training objectives

What the JD emphasized

  • reinforcement learning
  • fine-tuning
  • reward modeling
  • preventing reward hacking

Other signals

  • training Claude specifically for virtual collaborator workflows
  • design and implement reinforcement learning environments
  • transform Claude into the best virtual collaborator
  • training on everything from navigating internal knowledge to creating financial models