Research Engineer, Machine Learning (Reinforcement Learning)

Anthropic · AI Frontier · New York, NY +1 · AI Research & Engineering

This Research Engineer role focuses on reinforcement learning to advance the capabilities and safety of large language models. It involves implementing novel approaches, contributing to research direction, building agentic models that use tools for tasks such as computer use and autonomous software generation, and improving reasoning abilities. Projects include architecting RL infrastructure, designing training environments and evaluations for RL agents, driving performance improvements, and developing automated testing frameworks.

What you'd actually do

  1. Architect and optimize core reinforcement learning infrastructure, from clean training abstractions to distributed experiment management across GPU clusters. Help scale our systems to handle increasingly complex research workflows.
  2. Design, implement, and test novel training environments, evaluations, and methodologies for reinforcement learning agents that push the state of the art for the next generation of models.
  3. Drive performance improvements across our stack through profiling, optimization, and benchmarking. Implement efficient caching solutions and debug distributed systems to accelerate both training and evaluation workflows.
  4. Collaborate across research and engineering teams to develop automated testing frameworks, design clean APIs, and build scalable infrastructure that accelerates AI research.

Skills

Required

  • Python
  • async/concurrent programming
  • Trio
  • machine learning frameworks (PyTorch, TensorFlow, JAX)
  • industry experience in machine learning research
  • ability to balance research exploration with engineering implementation
  • code quality
  • testing
  • performance
  • systems design
  • communication skills

Nice to have

  • LLM architectures
  • LLM training methodologies
  • reinforcement learning techniques
  • virtualization
  • sandboxed code execution environments
  • Kubernetes
  • distributed systems
  • high-performance computing
  • Rust
  • C++

What the JD emphasized

  • reinforcement learning research
  • agentic models
  • tool use
  • autonomous software generation
  • improving reasoning abilities

Other signals

  • Reinforcement Learning
  • large language models