Research Engineer / Scientist, Tool Use Safety

Anthropic · AI Frontier · AI Research & Engineering

This Research Engineer/Scientist role focuses on advancing the frontier of safe tool use in AI models, specifically addressing prompt injection, data exfiltration, adversarial attacks, and autonomous agent behavior with large tool sets. It involves designing and implementing RL methodologies, building evaluations, and shipping research advances into production models, with a strong emphasis on safety and reliability.

What you'd actually do

  1. Design and implement novel and scalable reinforcement learning methodologies that push the state of the art of tool use safety
  2. Define and pursue research agendas that push the boundaries of what's possible
  3. Build rigorous, realistic evaluations that capture the complexity of real-world tool use safety challenges
  4. Ship research advances that directly impact and protect millions of users
  5. Collaborate with other safety research teams (e.g. Safeguards, Alignment Science), capabilities research teams, and product teams to drive fundamental breakthroughs in safety, and work with them to ship these into production

Skills

Required

  • Python
  • machine learning research
  • applied research
  • quantitative background
  • software engineering skills
  • communication of complex ideas

Nice to have

  • tool use/agentic safety
  • trust & safety
  • security
  • reinforcement learning techniques
  • language model training
  • fine-tuning
  • evaluation
  • AI agents
  • autonomous systems
  • published influential work in ML
  • LLM safety & alignment
  • deep expertise in RL, security, or mathematical foundations
  • shipping features
  • working closely with product teams
  • pair programming
  • collaborative research

What the JD emphasized

  • tool use safety
  • prompt injection
  • data exfiltration
  • adversarial attacks
  • autonomous
  • reinforcement learning methodologies
  • real-world tool use safety challenges
  • ship research advances
  • safety research
  • capabilities research
  • product teams
  • fundamental breakthroughs in safety
  • ship these into production
  • ML stacks
  • pair programming
  • technical discussions
  • team problem-solving
  • safety mission
  • real-world impact
  • research ship in production
  • machine learning research/applied-research experience
  • quantitative background
  • software engineering skills
  • communicate complex ideas clearly
  • hungry to learn and grow
  • tool use/agentic safety
  • trust & safety
  • security
  • reinforcement learning techniques
  • language model training
  • fine-tuning
  • evaluation
  • AI agents
  • autonomous systems
  • published influential work
  • LLM safety & alignment
  • deep expertise
  • shipping features
  • collaborative research

Other signals

  • tool use safety
  • agentic applications
  • prompt injection robustness
  • long-horizon & complex tool use workflows
  • large-scale & dynamic tool sets
  • tool use efficiency
  • autonomous agents