Research Engineer, Virtual Collaborator (Cowork)

Anthropic · AI Frontier · New York, NY +2 · AI Research & Engineering

Research Engineer role focused on training Claude for virtual collaborator workflows, spanning RL environments, data creation, and evaluation systems for enterprise use cases.

What you'd actually do

  1. Training Claude to manipulate documents with good taste: understanding, enhancing, and co-creating them (e.g., Office doc formats, data visualization)
  2. Designing and implementing reinforcement learning pipelines targeted at virtual collaborator use cases (productivity, organizational navigation, vertical domains)
  3. Building and scaling our data creation platform for generating high-quality, open-ended tasks with domain experts and crowdworkers
  4. Developing robust evaluation systems that maintain quality while avoiding reward hacking
  5. Partnering directly with product teams (e.g., Cowork, claude.ai) to ensure training aligns with product features

Skills

Required

  • Python programming
  • Machine Learning
  • Reinforcement Learning
  • Data Creation Platform Development
  • Evaluation Systems Development
  • Collaboration

Nice to have

  • RL environments for realistic tasks
  • Reward modeling
  • Human-in-the-loop training systems
  • Crowdsourcing platforms
  • Enterprise tools and APIs (Google Workspace, Microsoft Office, Slack)
  • Evaluation frameworks for open-ended tasks
  • Finance, legal, or healthcare workflows
  • Scalable data pipelines with quality control
  • Translating product requirements into technical training objectives

What the JD emphasized

  • 5-8 years of strong machine learning experience
  • very experienced Python programmer
  • thrive at the intersection of research and product
  • comfortable with ambiguity
  • balance research rigor with shipping deadlines
  • care about making AI genuinely helpful for everyday enterprise workflows

Other signals

  • training Claude on document manipulation
  • designing and implementing reinforcement learning pipelines
  • building and scaling our data creation platform
  • developing robust evaluation systems
  • partnering directly with product teams