Currently tracking 124 active AI roles, with 106 new openings in the last 4 weeks. Primary focus: Agent · Engineering. Salary range $46k–$850k (avg $405k).
Anthropic has 145 active AI-related job listings. The largest share of these roles, 28% of the total, focuses on agents. Engineering is the most frequent function, with 74 listings, followed by Research with 51. The company is primarily hiring in the United States, with 118 positions, and the United Kingdom, with 22. Frequent tech tags include model_serving, evals, and agent_orchestration, suggesting a focus on deployment and evaluation of AI systems. In the last 30 days, Anthropic posted 16 new AI roles, a 47% decrease compared to the previous 30-day period.
Anthropic currently has 132 active AI-related roles in our index. The most common open titles are: Applied AI Architect, Industries (2); Regional Research Economist, Economic Research (2); Research Engineer, Machine Learning (RL Velocity) (2); Research Engineer, Production Model Post-Training (2); Staff Software Engineer, AI Reliability Engineering (2). Most positions are in Engineering and Research.
Anthropic's active AI hiring is concentrated in: agents (28%), serving infrastructure (17%), post-training (14%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
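As a rough illustration of how such shares could be derived, the sketch below models the seven stages as an enum and computes each stage's share from a list of per-role stages. The names `LifecycleStage` and `stage_shares`, the assumption that each role counts toward exactly one primary stage, and the sample counts are all hypothetical; only the stage names and the three quoted percentages come from the text above.

```python
from collections import Counter
from enum import Enum


class LifecycleStage(Enum):
    """The seven lifecycle stages, in the order listed above."""
    DATA = "data"
    PRE_TRAINING = "pre-training"
    POST_TRAINING = "post-training"
    SERVING_INFRASTRUCTURE = "serving infrastructure"
    AGENTS = "agents"
    EVALUATION = "evaluation"
    APPLICATION = "application"


def stage_shares(primary_stages):
    """Return {stage: rounded percent of roles}, largest share first."""
    counts = Counter(primary_stages)
    total = sum(counts.values())
    return {stage: round(100 * n / total) for stage, n in counts.most_common()}


# Hypothetical sample of 100 roles. The agents, serving infrastructure, and
# post-training counts mirror the quoted shares; the other four are made up.
sample = (
    [LifecycleStage.AGENTS] * 28
    + [LifecycleStage.SERVING_INFRASTRUCTURE] * 17
    + [LifecycleStage.POST_TRAINING] * 14
    + [LifecycleStage.EVALUATION] * 13
    + [LifecycleStage.DATA] * 11
    + [LifecycleStage.PRE_TRAINING] * 9
    + [LifecycleStage.APPLICATION] * 8
)

shares = stage_shares(sample)
for stage, pct in list(shares.items())[:3]:
    print(f"{stage.value} ({pct}%)")
```

If roles were instead counted toward every stage they are tagged with, the shares would no longer sum to 100%, so the single-primary-stage assumption matters here.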
Anthropic is hiring AI talent in: United States (106 roles), United Kingdom (20 roles), Canada (6 roles), Ireland (5 roles).
Job postings at Anthropic most frequently reference: model serving, evals, LLM observability, agent orchestration, inference infra.
In the past 30 days, Anthropic has posted 29 new AI-related roles. That is a +61% change versus the prior 30 days (18 → 29).
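The change figure above follows from the two counts. A minimal sketch of the arithmetic, assuming the standard percent-change formula (the function name `percent_change` is illustrative and not part of the index):

```python
def percent_change(previous: int, current: int) -> float:
    """Relative change from `previous` to `current`, in percent."""
    if previous == 0:
        raise ValueError("percent change is undefined when the previous count is 0")
    return (current - previous) / previous * 100


# Counts quoted above: 18 roles in the prior 30 days, 29 in the past 30 days.
change = percent_change(18, 29)
print(f"{change:+.0f}%")  # (29 - 18) / 18 = 0.611..., printed as +61%
```
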
| Title | Stage | AI score |
|---|---|---|
| Research Engineer, Frontier Red Team (Autonomy): Research Engineer focused on building and evaluating autonomous AI systems and defensive agents to counter adversarial AI, with a focus on cyberphysical risks and AI safety. This role involves creating model organisms, developing defensive agents, and translating findings into policy-relevant demonstrations. | Agent, Eval Gate | 10 |
| Research Scientist, Interpretability: Research Scientist focused on mechanistic interpretability of LLMs, aiming to understand how trained models work by reverse-engineering their parameters and algorithms. The role involves developing methods, designing experiments, creating interpretability features, building infrastructure, and collaborating with other teams. Requires strong scientific research background with some interpretability work, comfort with experimental science, and proficiency in Python. | Post-train | 10 |
| [Expression of Interest] Research Manager, Interpretability: Research Manager for the Interpretability team, focusing on mechanistic interpretability to understand how large language models work internally and ensure AI safety. The role involves partnering with a research lead on direction, project planning, execution, hiring, and people development, translating research ideas into tangible goals, and overseeing their execution. This is a management role, distinct from individual contributor research scientist or engineer roles. | Post-train | 10 |
| Research Engineer, Pretraining Scaling: Research Engineer focused on training production pretrained models at scale, involving performance optimization, debugging, experimental design, and incident response during model launches. The role bridges research and engineering, working across the full training stack. | Pretrain | 10 |
| Research Engineer/Research Scientist, Pre-training: Research Engineer/Scientist focused on pre-training large language models, involving research in model architecture, algorithms, data processing, and optimizer development, as well as optimizing and scaling training infrastructure. | Pretrain | 10 |
| Staff Research Engineer, Discovery Team: Staff Research Engineer focused on building AI systems capable of scientific discovery and long-horizon reasoning, working across the full model stack from training to inference and agentic systems. | Pretrain, Agent | 10 |
| Research Engineer, Machine Learning (Reinforcement Learning): Research Engineer focused on Reinforcement Learning to advance capabilities and safety of large language models. This role involves implementing novel approaches, contributing to research direction, creating agentic models via tool use for tasks like computer use and autonomous software generation, and improving reasoning abilities. Projects include architecting RL infrastructure, designing training environments and evaluations for RL agents, driving performance improvements, and developing automated testing frameworks. | Post-train, Agent | 10 |
| Research Engineer, Domain Scaling: Research Engineer focused on scaling AI models for real-world knowledge work in domains like finance, healthcare, and legal. This role involves owning the end-to-end data strategy, from sourcing tasks to RL training, including designing reward signals, managing external data vendors, and developing QA frameworks to ensure environment quality and prevent reward hacking. It combines applied research with hands-on data work. | Data, Post-train | 9 |
| Staff+ Software Engineer, Inference Runtime: Staff+ Software Engineer for Anthropic's Inference Runtime team, focusing on the accelerator-agnostic core of their AI inference serving stack. The role involves setting technical direction, owning the architecture and roadmap, hands-on coding in Rust/Python, optimizing accelerator usage, and building validation systems. Requires deep systems engineering or ML infrastructure background with experience in performance optimization and large-scale distributed systems. | Serve | 9 |
| Research Engineer, Code RL (Reinforcement Learning): Research Engineer focused on Reinforcement Learning for code generation, aiming to improve models' ability to write, edit, test, debug, and ship software. This role involves designing RL environments, building reward signals, running training experiments, and improving pipeline efficiency, blending research with engineering implementation. | Post-train, Agent | 9 |
| Software Engineer, Safeguards Evals: Software Engineer role focused on building and owning the evaluation infrastructure for an agentic investigation system. This involves designing experiments, constructing high-quality eval datasets, measuring agent performance, analyzing coverage gaps, and productionizing research into release pipelines. The role also involves building tooling for policy experts and constructing RL environments to improve safety investigation capabilities. | Agent, Eval Gate | 9 |
| Product Manager, Claude Code Model Performance: Product Manager for Anthropic's Claude Code Model Performance team, responsible for driving model launches, building agentic evals, and translating research improvements into developer-facing outcomes. Requires experience building agentic evals, a systems thinking approach, and comfort with both research and engineering. | Agent, Eval Gate | 9 |
| Research Scientist, Life Sciences: Research Scientist role focused on improving AI model capabilities for life sciences tasks. This involves building agentic tools, designing evaluation benchmarks, and applying post-training techniques to enhance model performance on scientific workflows like bioinformatics, database queries, and literature synthesis. The role bridges ML, software engineering, and biology to make AI a better research assistant in life sciences. | Post-train, Agent | 9 |
| Technical Program Manager, Discovery: Technical Program Manager on the Discovery team, owning systems and programs that determine research velocity, including compute planning, scientific RL environment health, and vendor pipelines. Requires ML engineering or research background with program leadership experience, technical depth to debug pipelines and analyze RL transcripts, and organizational effectiveness to coordinate across research, infrastructure, product, and data operations. | Post-train, Data | 9 |
| Research Engineer, Search and Knowledge Post-Training: Research Engineer focused on advancing search and knowledge capabilities in LLMs through post-training techniques. The role involves defining research hypotheses, designing experiments, building instrumentation for controlled studies, developing evaluations to distinguish reasoning from pattern matching, and driving optimization rigor. It sits at the intersection of RL, retrieval, and evaluation, aiming to make LLMs trustworthy searchers. | Post-train, Agent | 9 |
| Technical Program Manager, Research: This role is a Technical Program Manager for Anthropic's research organization. The TPM will define and build programs for research teams, focusing on areas like compute, evals, and RL environments. They will drive end-to-end execution of complex research initiatives, establish processes, and ensure operational health of RL environments. The role requires a background in ML research or engineering, experience building technical programs from scratch, and the ability to navigate ambiguity in fast-moving research environments. | Post-train, Data | 9 |
| Research Engineer, Model Evaluations: Research Engineer focused on building and operating the evaluation infrastructure for large language models, ensuring their capabilities, knowledge, and safety properties are rigorously measured and validated at scale. This role involves designing evaluations, building distributed systems for running them, monitoring model health during training, and partnering with researchers to interpret results. | Eval Gate, Post-train | 9 |
| Research Engineer, RL Infrastructure (Knowledge Work): Research Engineer focused on the reliability, observability, and infrastructure of training environments and evaluation systems for AI models, ensuring stability and quality as they scale. The role involves proactive hardening, building tooling for early problem detection, and serving as a dedicated owner for environment health and evaluation integrity. | Eval Gate, Data | 9 |
| Research Engineer, Machine Learning (RL Velocity): Research Engineer focused on building and improving the RL training infrastructure and tooling at Anthropic. The role involves identifying and removing bottlenecks in the RL stack, partnering with researchers and other engineering teams, and owning the reliability and performance of research runs to enable faster iteration and shipping of better models at scale. | Data, Post-train | 9 |
| Research Engineer, Safeguards Labs: Research Engineer focused on AI safety, investigating novel methods for detecting misuse, strengthening model safeguards, and building evaluation methodologies for AI systems, particularly in agentic workflows. The role involves leading research projects, designing offline analyses, developing prototypes, and collaborating with production teams. | Eval Gate, Post-train | 9 |
| Anthropic STEM Fellow: This role is for a STEM Fellow to work alongside Anthropic's research teams for a few months. Fellows will use their domain expertise to evaluate, improve, and apply Claude's capabilities in their field. This involves designing evaluations, identifying data/techniques for capability gaps, and applying Claude to open problems using various strategies and tools. Projects are scoped to ship within the fellowship period. | Eval Gate, Agent | 9 |
| Anthropic Fellows Program — Reinforcement Learning: This is a research fellowship program focused on Reinforcement Learning (RL) within AI safety. Fellows will work on empirical projects, potentially using external infrastructure, with the goal of producing public outputs like paper submissions. The program emphasizes mentorship from Anthropic researchers and provides a stipend and compute funding. Key activities include building model-based tools for data quality, understanding generalization, and creating RL environments for capabilities and safety tasks. | Post-train | 9 |
| Anthropic Fellows Program — AI Safety: This is a research fellowship program focused on AI safety, aiming to foster talent in empirical AI research. Fellows will work on projects aligned with Anthropic's research priorities, using external infrastructure and external models, with the goal of producing public outputs like paper submissions. Key research areas include Scalable Oversight, Adversarial Robustness and AI Control, Model Organisms, Model Internals / Mechanistic Interpretability, and AI Welfare. | Post-train | 9 |
| Research Engineer, Performance RL: Research Engineer focused on Reinforcement Learning for code generation and accelerator performance, aiming to improve model reasoning and coding capabilities. The role involves inventing RL environments, conducting experiments, shaping research roadmaps, and delivering work into training runs, with a strong emphasis on collaboration and scaling research innovations. | Post-train, Data | 9 |
| Security Labs Engineer: This role focuses on executing security R&D projects end-to-end, building novel security infrastructure, and driving successful experiments toward production scale. It involves working with research teams to test security controls, evaluating new security technologies, and documenting results to inform future security architecture. The role spans from initial project scoping to potential production deployment, with a focus on high-assurance environments and AI-assisted security tooling. | Serve, Ship | 9 |
| Research Lead, Training Insights: Research Lead focused on developing and executing strategies for measuring and characterizing model capabilities across training and deployment. This role involves driving original research into new evaluation methodologies, leading a team, and spanning the full lifecycle of model development, from pretraining to deployment. The work includes creating long-horizon evaluations, measuring emerging capabilities, and understanding their development during RL training and post-training. The role also involves cross-organizational collaboration to map evaluation landscapes and identify gaps, shaping the evaluation narrative for model releases, and contributing to the broader research community. | Eval Gate, Post-train | 9 |
| Research Engineer, AI Observability: Research Engineer focused on designing and building AI-based monitoring systems to analyze large unstructured datasets, produce structured insights, and develop agentic integrations for investigation and action. The role involves working across the full stack, from core analysis frameworks to user-facing applications, with a direct impact on measuring and mitigating AI misuse and misalignment. This role is critical for scaling human oversight of AI systems. | Eval Gate, Agent | 9 |
| Research Engineer, Environment Scaling: This role focuses on improving the intelligence of public models by building and managing RL training environments. It involves identifying tasks, designing reward signals, managing external data vendors, and evaluating model performance, combining ML research, data operations, and project management. | Data, Post-train | 9 |
| Prompt Engineer, Agent Prompts & Evals: This role focuses on prompt engineering and evaluation development for AI-first products and features, bridging model capabilities with user experience. It involves designing, testing, and optimizing prompts, building evaluation suites, supporting model launches, and contributing to prompt development frameworks. The role requires strong software engineering skills, LLM and prompt engineering experience, and understanding of evaluation methodologies. | Agent, Eval Gate | 9 |
| Research Scientist, Frontier Red Team (Emerging Risks): Research Scientist focused on understanding and defending against societal risks from advanced AI models, particularly self-improving and autonomous systems. The role involves designing research experiments, building evals, and producing artifacts to communicate model capabilities and inform product/safeguards decisions. Emphasis on emerging risks from AI integration into the economy and society. | Eval Gate, Agent | 9 |
| Model Quality Software Engineer, Claude Code: Staff Software Engineer to set technical direction at the intersection of engineering and research on the Claude Code team. Architect systems, tooling, and evaluation infrastructure to measure, understand, and improve Claude's coding capabilities. Drive architecture, mentor engineers, and influence the direction of Claude Code. | Eval Gate, Agent | 9 |
| Research Engineer / Scientist, Frontier Red Team (Cyber): Research Engineer/Scientist focused on AI-enabled cybersecurity, developing tools and frameworks for autonomous vulnerability discovery, remediation, malware detection, and pentesting. Designs and runs experiments to evaluate AI cyber capabilities and builds infrastructure for AI systems operating in security environments. Translates findings into demonstrations for policymakers and collaborates with external experts. Senior candidates will set research strategy and own the technical roadmap. | Agent, Eval Gate | 9 |
| Applied AI Engineer, Startups: Applied AI Engineer role focused on advising and partnering with AI-native startups to build on the Claude Developer Platform. Responsibilities include technical guidance, developing evaluation frameworks, designing scalable architectures, and creating technical resources to help startups succeed with Claude. Requires production experience with LLM-powered applications, agent architectures, and evaluation frameworks. | Agent | 9 |
| Research Engineer / Research Scientist, Vision: Research Engineer/Scientist focused on vision and spatial reasoning for LLMs, working on pretraining, RL, and runtime techniques like agentic harnesses. Involves developing and evaluating multimodal capabilities, creating benchmarks, and partnering with product teams to improve Claude models. | Post-train, Agent | 9 |
| Research Engineer/Research Scientist, Audio: Research Engineer/Scientist focused on audio AI, working on training audio models, developing novel architectures, and optimizing inference for speech and audio understanding and generation systems. | Post-train, Serve | 9 |
| Research Engineer, Universes: Research Engineer role focused on building next-generation agentic environments for training AI models. This role involves implementing novel approaches, contributing to research direction, designing training environments and methodologies, and building evaluations for capable and safe agentic AI. It blends research and engineering, with a focus on reinforcement learning and complex, long-horizon agentic tasks. | Agent, Post-train | 9 |
| Senior Research Scientist, Reward Models: Senior Research Scientist focused on reward models for LLMs, involving novel architectures, RLHF, LLM-based evaluation, and mitigating reward hacking. Aims to improve model alignment with human values and translate research into production systems. | Post-train, Eval Gate | 9 |
| Research Engineer, Reward Models Platform: Research Engineer focused on building platforms and infrastructure to automate and accelerate the reward model development and evaluation workflows for ML researchers at Anthropic. The role involves creating tools for rubric development, human feedback analysis, reward robustness evaluation, and detecting reward hacks, with the goal of enabling rapid iteration and improving reward signal quality for training AI models. | Post-train | 9 |
| Anthropic Fellows Program — AI Security: This is a research fellowship program focused on AI safety and security, aiming to produce public outputs like paper submissions. Fellows will use external infrastructure and open-source models, working on empirical projects with mentorship from Anthropic researchers. | Post-train | 9 |
| Anthropic Fellows Program: Anthropic's Fellows Program offers a 4-month full-time research opportunity focused on AI safety and related areas. Fellows will use external infrastructure and open-source models to conduct empirical projects, aiming for public outputs like paper submissions, with mentorship from Anthropic researchers. The program is designed to foster AI research and engineering talent, regardless of previous experience, and emphasizes safety, interpretability, and steerability of AI systems. | Pretrain | 9 |
| Research Engineer, Cybersecurity Reinforcement Learning: Research Engineer role focused on applying reinforcement learning to cybersecurity tasks like secure coding and vulnerability remediation, blending research and engineering to train safe AI models. Requires cybersecurity domain expertise and ML/software engineering skills. | Post-train, Data | 9 |
| Research Engineer, Interpretability: Research Engineer focused on building and maintaining specialized infrastructure for interpretability research in AI systems. This involves developing tools for model analysis, optimizing training and inference pipelines, and ensuring reliability for safety audits, with a strong emphasis on understanding and controlling model behavior. | Post-train, Serve | 9 |
| Research Engineer, Virtual Collaborator (Cowork): Research Engineer focused on training Claude for virtual collaborator workflows, involving RL environments, data creation, and evaluation systems for enterprise use cases. | Post-train, Data | 9 |
| Machine Learning Systems Engineer, RL Engineering: ML Systems Engineer focused on Reinforcement Learning Engineering to build, maintain, and improve the algorithms and infrastructure for training AI models like Claude using RLHF and other advanced techniques. The role emphasizes improving system performance, robustness, and usability to accelerate research breakthroughs in AI capabilities and safety. | Post-train | 9 |
| Machine Learning Systems Engineer, Research Tools: Machine Learning Systems Engineer focused on developing and optimizing encodings and tokenization systems for Anthropic's Finetuning workflows. This role acts as a bridge between Pretraining and Finetuning teams, building infrastructure crucial for model learning and data interpretation, impacting research progress and efficiency. | Data, Post-train | 9 |
| Research Engineer / Research Scientist, Tokens: Research Engineer/Scientist role focused on building large-scale ML systems, touching all parts of code and infrastructure, from cluster reliability and job efficiency to running scientific experiments and improving dev tooling. The role involves optimizing ML systems, comparing model variants, scaling training jobs, and designing fault tolerance strategies, with a focus on safe, steerable, and trustworthy AI. | Pretrain, Serve | 9 |
| ML/Research Engineer, Safeguards: ML/Research Engineer focused on detecting and mitigating misuse of AI systems, building classifiers, monitoring for harms, evaluating agentic product safety, and conducting research on red-teaming and adversarial robustness. | Agent, Data | 9 |
| Privacy Research Engineer, Safeguards: Research Engineer focused on privacy for large language models, developing and auditing privacy-preserving training algorithms and techniques, and ensuring responsible data handling. | Data, Post-train | 9 |
| Performance Engineer, GPU: This role focuses on optimizing GPU performance and systems engineering for large language models, specifically improving utilization and efficiency for inference and training at scale. It involves deep work in GPU programming, custom kernel development, and distributed systems. | Serve, Pretrain | 9 |
| ML Infrastructure Engineer, Safeguards: ML Infrastructure Engineer focused on building and scaling critical infrastructure for AI safety systems, including real-time and batch classifier/safety evaluations, monitoring, and optimizing inference for safety-critical applications. | Eval Gate, Serve | 9 |