AI Frontier · AI lab
Currently tracking 124 active AI roles, with 106 new openings in the last 4 weeks. Primary focus: Agent · Engineering. Salary range $46k–$850k (avg $405k).
Anthropic currently has 132 active AI-related roles in our index. The most common open titles are: Applied AI Architect, Industries (2); Regional Research Economist, Economic Research (2); Research Engineer, Machine Learning (RL Velocity) (2); Research Engineer, Production Model Post-Training (2); Staff Software Engineer, AI Reliability Engineering (2). Most positions are in Engineering and Research.
Anthropic's active AI hiring is concentrated in: agents (28%), serving infrastructure (17%), post-training (14%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Anthropic is hiring AI talent in: United States (106 roles), United Kingdom (20 roles), Canada (6 roles), Ireland (5 roles).
Job postings at Anthropic most frequently reference: model serving, evals, llm observability, agent orchestration, inference infra.
In the past 30 days, Anthropic has posted 29 new AI-related roles. That is a +61% change versus the prior 30 days (18 → 29).
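The month-over-month figure above can be reproduced with a simple percent-change calculation. The following is a minimal sketch, assuming the index computes the change as (current - previous) / previous; the function name `percent_change` is hypothetical and not taken from the index itself.

```python
def percent_change(previous: int, current: int) -> float:
    """Return the percent change from `previous` to `current`.

    Assumption: change is measured relative to the prior period's count.
    """
    if previous == 0:
        raise ValueError("percent change is undefined when the prior count is 0")
    return (current - previous) / previous * 100.0


# Counts quoted in the text: 18 postings in the prior 30 days, 29 in the latest 30.
change = percent_change(18, 29)
print(f"{change:+.0f}%")  # +61%
```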
Anthropic has 145 active AI-related job listings. The majority of these roles are focused on agents, comprising 28% of the total. Engineering is the most frequent function, with 74 listings, followed by Research with 51. The company is primarily hiring in the United States, with 118 positions, and the United Kingdom, with 22. Frequent tech tags include model_serving, evals, and agent_orchestration, suggesting a focus on deployment and evaluation of AI systems. In the last 30 days, Anthropic posted 16 new AI roles, a 47% decrease compared to the previous 30-day period.
| Title | Stage | AI score |
|---|---|---|
| Research Engineer, Science of Scaling Research Engineer/Scientist on the Science of Scaling team, focused on developing next-generation large language models. The role involves research at the intersection of cutting-edge research and practical engineering, contributing to safe, steerable, and trustworthy AI systems. Responsibilities include research into the science of converting compute into intelligence, leading research projects, designing and analyzing experiments, optimizing training infrastructure, and developing dev tooling. Requires significant software engineering experience, proficiency in Python and deep learning frameworks, and a results-oriented approach. Strong candidates may have experience with JAX, reinforcement learning, high-performance ML systems, accelerators, Kubernetes, OS internals, transformer architectures, large-scale ETL, and distributed training at scale. | Pretrain | 10 |
| Research Engineer / Research Scientist, Pre-training | Pretrain | 10 |
| Research Engineer, Pretraining Research Engineer focused on pretraining large language models, involving research into model architecture, algorithms, data processing, and optimizers, along with scaling training infrastructure and analyzing experiments. The role contributes to the entire stack from low-level optimizations to high-level model design. | Pretrain | 10 |
| Research Engineer, Machine Learning (Reinforcement Learning) Research Engineer focused on Reinforcement Learning to advance capabilities and safety of large language models. This role involves implementing novel approaches, contributing to research direction, and creating agentic models for tasks like computer use and autonomous software generation, while also improving reasoning abilities and developing prototypes. Key responsibilities include architecting RL infrastructure, designing training environments and methodologies, driving performance improvements, and collaborating across teams. | Agent, Post-train | 10 |
| Research Engineer, Frontier Red Team (Autonomy) Research Engineer focused on building and evaluating autonomous AI systems and defensive agents to counter adversarial AI, with a focus on cyberphysical risks and AI safety. This role involves creating model organisms, developing defensive agents, and translating findings into policy-relevant demonstrations. | Agent, Eval Gate | 10 |
| Research Scientist, Interpretability Research Scientist focused on mechanistic interpretability of LLMs, aiming to understand how trained models work by reverse-engineering their parameters and algorithms. The role involves developing methods, designing experiments, creating interpretability features, building infrastructure, and collaborating with other teams. Requires strong scientific research background with some interpretability work, comfort with experimental science, and proficiency in Python. | Post-train | 10 |
| [Expression of Interest] Research Manager, Interpretability Research Manager for the Interpretability team, focusing on mechanistic interpretability to understand how large language models work internally and ensure AI safety. The role involves partnering with a research lead on direction, project planning, execution, hiring, and people development, translating research ideas into tangible goals, and overseeing their execution. This is a management role, distinct from individual contributor research scientist or engineer roles. | Post-train | 10 |
| Research Engineer, Pretraining Scaling Research Engineer focused on training production pretrained models at scale, involving performance optimization, debugging, experimental design, and incident response during model launches. The role bridges research and engineering, working across the full training stack. | Pretrain | 10 |
| Anthropic AI Safety Fellow, Canada This is a fellowship program focused on AI safety research, aiming to bridge industry engineering expertise with research skills. Fellows will work on empirical projects using external infrastructure, with the goal of producing public outputs like paper submissions. The program offers mentorship, funding, and compute resources. | Post-train | 10 |
| Anthropic AI Safety Fellow, UK This is a research fellowship focused on AI safety, aiming to produce empirical research outputs like paper submissions. Fellows will use external infrastructure and work on projects aligned with Anthropic's research priorities, receiving mentorship and resources. | Pretrain | 10 |
| Anthropic AI Safety Fellow, US This is a fellowship program focused on AI safety research, aiming to bridge industry engineering expertise with research skills. Fellows will work on empirical projects using external infrastructure, with the goal of producing public outputs like paper submissions. The program offers mentorship, funding, and compute resources. | Post-train | 10 |
| Research Engineer/Research Scientist, Pre-training Research Engineer/Scientist focused on pre-training large language models, involving research in model architecture, algorithms, data processing, and optimizer development, as well as optimizing and scaling training infrastructure. | Pretrain | 10 |
| Research Engineer / Research Scientist, Pre-training Research Engineer/Scientist focused on pre-training large language models, with an emphasis on multimodal capabilities. The role involves research, implementation, experiment design, and optimizing training infrastructure for next-generation AI systems. | Pretrain | 10 |
| Staff Research Engineer, Discovery Team Staff Research Engineer focused on building AI systems capable of scientific discovery and long-horizon reasoning, working across the full model stack from training to inference and agentic systems. | Pretrain, Agent | 10 |
| Research Engineer, Machine Learning (Reinforcement Learning) Research Engineer focused on Reinforcement Learning to advance capabilities and safety of large language models. This role involves implementing novel approaches, contributing to research direction, creating agentic models via tool use for tasks like computer use and autonomous software generation, and improving reasoning abilities. Projects include architecting RL infrastructure, designing training environments and evaluations for RL agents, driving performance improvements, and developing automated testing frameworks. | Post-train, Agent | 10 |
| Research Scientist / Research Engineer, Pre-training Research Engineer role focused on pre-training large language models, involving research into model architecture, algorithms, data processing, and optimizer development, alongside optimizing and scaling training infrastructure. Requires advanced degree, strong software engineering skills, and expertise in Python and deep learning frameworks. | Pretrain | 10 |
| Research Engineer, Interpretability Research Engineer focused on mechanistic interpretability to understand and improve the safety of large language models. This involves implementing and analyzing experiments, optimizing research workflows, building tools for experimentation, and developing infrastructure to support model safety improvements. | Post-train | 10 |
| Research Manager, Interpretability Manager for the Interpretability team focused on mechanistic interpretability of large language models, aiming to understand how they work internally for AI safety. | Post-train | 10 |
| Research Scientist, Interpretability Research Scientist focused on mechanistic interpretability of LLMs, aiming to understand how neural network parameters map to algorithms for safety and steerability. Involves developing methods, running experiments, building infrastructure, and communicating results. | Post-train | 10 |
| Research Engineer, Domain Scaling Research Engineer focused on scaling AI models for real-world knowledge work in domains like finance, healthcare, and legal. This role involves owning the end-to-end data strategy, from sourcing tasks to RL training, including designing reward signals, managing external data vendors, and developing QA frameworks to ensure environment quality and prevent reward hacking. It combines applied research with hands-on data work. | Data, Post-train | 9 |
| Staff+ Software Engineer, Inference Runtime Staff+ Software Engineer for Anthropic's Inference Runtime team, focusing on the accelerator-agnostic core of their AI inference serving stack. The role involves setting technical direction, owning the architecture and roadmap, hands-on coding in Rust/Python, optimizing accelerator usage, and building validation systems. Requires deep systems engineering or ML infrastructure background with experience in performance optimization and large-scale distributed systems. | Serve | 9 |
| Research Engineer, Code RL (Reinforcement Learning) Research Engineer focused on Reinforcement Learning for code generation, aiming to improve models' ability to write, edit, test, debug, and ship software. This role involves designing RL environments, building reward signals, running training experiments, and improving pipeline efficiency, blending research with engineering implementation. | Post-train, Agent | 9 |
| Software Engineer, Safeguards Evals Software Engineer role focused on building and owning the evaluation infrastructure for an agentic investigation system. This involves designing experiments, constructing high-quality eval datasets, measuring agent performance, analyzing coverage gaps, and productionizing research into release pipelines. The role also involves building tooling for policy experts and constructing RL environments to improve safety investigation capabilities. | Agent, Eval Gate | 9 |
| Product Manager, Claude Code Model Performance Product Manager for Anthropic's Claude Code Model Performance team, responsible for driving model launches, building agentic evals, and translating research improvements into developer-facing outcomes. Requires experience building agentic evals, a systems thinking approach, and comfort with both research and engineering. | Agent, Eval Gate | 9 |
| Research Scientist, Life Sciences Research Scientist role focused on improving AI model capabilities for life sciences tasks. This involves building agentic tools, designing evaluation benchmarks, and applying post-training techniques to enhance model performance on scientific workflows like bioinformatics, database queries, and literature synthesis. The role bridges ML, software engineering, and biology to make AI a better research assistant in life sciences. | Post-train, Agent | 9 |
| Technical Program Manager, Discovery Technical Program Manager on the Discovery team, owning systems and programs that determine research velocity, including compute planning, scientific RL environment health, and vendor pipelines. Requires ML engineering or research background with program leadership experience, technical depth to debug pipelines and analyze RL transcripts, and organizational effectiveness to coordinate across research, infrastructure, product, and data operations. | Post-train, Data | 9 |
| Research Engineer, Search and Knowledge Post-Training Research Engineer focused on advancing search and knowledge capabilities in LLMs through post-training techniques. The role involves defining research hypotheses, designing experiments, building instrumentation for controlled studies, developing evaluations to distinguish reasoning from pattern matching, and driving optimization rigor. It sits at the intersection of RL, retrieval, and evaluation, aiming to make LLMs trustworthy searchers. | Post-train, Agent | 9 |
| Technical Program Manager, Research This role is a Technical Program Manager for Anthropic's research organization. The TPM will define and build programs for research teams, focusing on areas like compute, evals, and RL environments. They will drive end-to-end execution of complex research initiatives, establish processes, and ensure operational health of RL environments. The role requires a background in ML research or engineering, experience building technical programs from scratch, and the ability to navigate ambiguity in fast-moving research environments. | Post-train, Data | 9 |
| Research Engineer, Model Evaluations Research Engineer focused on building and operating the evaluation infrastructure for large language models, ensuring their capabilities, knowledge, and safety properties are rigorously measured and validated at scale. This role involves designing evaluations, building distributed systems for running them, monitoring model health during training, and partnering with researchers to interpret results. | Eval Gate, Post-train | 9 |
| Research Engineer, RL Infrastructure (Knowledge Work) Research Engineer focused on the reliability, observability, and infrastructure of training environments and evaluation systems for AI models, ensuring stability and quality as they scale. The role involves proactive hardening, building tooling for early problem detection, and serving as a dedicated owner for environment health and evaluation integrity. | Eval Gate, Data | 9 |
| Research Engineer, Machine Learning (RL Velocity) The RL Velocity team owns the efficiency and reliability of the RL Science stack, building and improving the core platform for RL training runs to remove bottlenecks and enable faster iteration. This role focuses on ML infrastructure, distributed systems, and research tooling to improve the velocity and reliability of RL training at scale. | Data, Post-train | 9 |
| Research Engineer, Machine Learning (RL Velocity) Research Engineer focused on building and improving the RL training infrastructure and tooling at Anthropic. The role involves identifying and removing bottlenecks in the RL stack, partnering with researchers and other engineering teams, and owning the reliability and performance of research runs to enable faster iteration and shipping of better models at scale. | Data, Post-train | 9 |
| Research Engineer, Safeguards Labs Research Engineer focused on AI safety, investigating novel methods for detecting misuse, strengthening model safeguards, and building evaluation methodologies for AI systems, particularly in agentic workflows. The role involves leading research projects, designing offline analyses, developing prototypes, and collaborating with production teams. | Eval Gate, Post-train | 9 |
| Anthropic STEM Fellow This role is for a STEM Fellow to work alongside Anthropic's research teams for a few months. Fellows will use their domain expertise to evaluate, improve, and apply Claude's capabilities in their field. This involves designing evaluations, identifying data/techniques for capability gaps, and applying Claude to open problems using various strategies and tools. Projects are scoped to ship within the fellowship period. | Eval Gate, Agent | 9 |
| Manager of Forward Deployed Engineering Manager of Forward Deployed Engineering at Anthropic, responsible for leading a team that embeds with strategic customers to ship production AI applications and agent deployments built on Claude. This player-coach role involves hiring, developing, and mentoring engineers, overseeing customer engagements, reviewing technical architectures, and collaborating with cross-functional teams to drive AI transformation for enterprise clients. | Agent | 9 |
| Anthropic Fellows Program — Reinforcement Learning This is a research fellowship program focused on Reinforcement Learning (RL) within AI safety. Fellows will work on empirical projects, potentially using external infrastructure, with the goal of producing public outputs like paper submissions. The program emphasizes mentorship from Anthropic researchers and provides a stipend and compute funding. Key activities include building model-based tools for data quality, understanding generalization, and creating RL environments for capabilities and safety tasks. | Post-train | 9 |
| Anthropic Fellows Program — AI Safety This is a research fellowship program focused on AI safety, aiming to foster talent in empirical AI research. Fellows will work on projects aligned with Anthropic's research priorities, using external infrastructure and external models, with the goal of producing public outputs like paper submissions. Key research areas include Scalable Oversight, Adversarial Robustness and AI Control, Model Organisms, Model Internals / Mechanistic Interpretability, and AI Welfare. | Post-train | 9 |
| Research Engineer, Performance RL Research Engineer focused on Reinforcement Learning for code generation and accelerator performance, aiming to improve model reasoning and coding capabilities. The role involves inventing RL environments, conducting experiments, shaping research roadmaps, and delivering work into training runs, with a strong emphasis on collaboration and scaling research innovations. | Post-train, Data | 9 |
| Security Labs Engineer This role focuses on executing security R&D projects end-to-end, building novel security infrastructure, and driving successful experiments toward production scale. It involves working with research teams to test security controls, evaluating new security technologies, and documenting results to inform future security architecture. The role spans from initial project scoping to potential production deployment, with a focus on high-assurance environments and AI-assisted security tooling. | Serve, Ship | 9 |
| Research Lead, Training Insights Research Lead focused on developing and executing strategies for measuring and characterizing model capabilities across training and deployment. This role involves driving original research into new evaluation methodologies, leading a team, and spanning the full lifecycle of model development, from pretraining to deployment. The work includes creating long-horizon evaluations, measuring emerging capabilities, and understanding their development during RL training and post-training. The role also involves cross-organizational collaboration to map evaluation landscapes and identify gaps, shaping the evaluation narrative for model releases, and contributing to the broader research community. | Eval Gate, Post-train | 9 |
| Research Engineer, AI Observability Research Engineer focused on designing and building AI-based monitoring systems to analyze large unstructured datasets, produce structured insights, and develop agentic integrations for investigation and action. The role involves working across the full stack, from core analysis frameworks to user-facing applications, with a direct impact on measuring and mitigating AI misuse and misalignment. This role is critical for scaling human oversight of AI systems. | Eval Gate, Agent | 9 |
| Forward Deployed Engineer The Forward Deployed Engineer (FDE) role at Anthropic focuses on embedding with strategic customers to drive the adoption of advanced AI applications. This role involves building production applications using Claude models within customer systems, delivering technical artifacts like sub-agents and agent skills, and providing deployment support. The FDE will work closely with post-sales, product, and engineering teams, combining engineering expertise with customer-facing skills to solve complex business challenges and represent Anthropic's mission. | Agent | 9 |
| Research Engineer, Environment Scaling This role focuses on improving the intelligence of public models by building and managing RL training environments. It involves identifying tasks, designing reward signals, managing external data vendors, and evaluating model performance, combining ML research, data operations, and project management. | Data, Post-train | 9 |
| Research Engineer, Production Model Post-Training Research Engineer focused on post-training of production LLMs, implementing and optimizing techniques like Constitutional AI and RLHF to enhance model capabilities, alignment, and safety. Involves research, pipeline development, evaluation, and debugging at scale. | Post-train | 9 |
| Prompt Engineer, Agent Prompts & Evals This role focuses on prompt engineering and evaluation development for AI-first products and features, bridging model capabilities with user experience. It involves designing, testing, and optimizing prompts, building evaluation suites, supporting model launches, and contributing to prompt development frameworks. The role requires strong software engineering skills, LLM and prompt engineering experience, and understanding of evaluation methodologies. | Agent, Eval Gate | 9 |
| Research Scientist, Frontier Red Team (Emerging Risks) Research Scientist focused on understanding and defending against societal risks from advanced AI models, particularly self-improving and autonomous systems. The role involves designing research experiments, building evals, and producing artifacts to communicate model capabilities and inform product/safeguards decisions. Emphasis on emerging risks from AI integration into the economy and society. | Eval Gate, Agent | 9 |
| Model Quality Software Engineer, Claude Code Staff Software Engineer to set technical direction at the intersection of engineering and research on the Claude Code team. Architect systems, tooling, and evaluation infrastructure to measure, understand, and improve Claude's coding capabilities. Drive architecture, mentor engineers, and influence the direction of Claude Code. | Eval Gate, Agent | 9 |
| Research Engineer / Scientist, Frontier Red Team (Cyber) Research Engineer/Scientist focused on AI-enabled cybersecurity, developing tools and frameworks for autonomous vulnerability discovery, remediation, malware detection, and pentesting. Designs and runs experiments to evaluate AI cyber capabilities and builds infrastructure for AI systems operating in security environments. Translates findings into demonstrations for policymakers and collaborates with external experts. Senior candidates will set research strategy and own the technical roadmap. | Agent, Eval Gate | 9 |
| Research Engineer, Frontier Red Team (Hardware Lead) Research Engineer focused on leading hardware research for frontier AI safety, specifically interfacing LLMs with robotics and cyberphysical systems. The role involves designing and building systems, developing evaluations, creating training environments, and demonstrating capabilities to inform policy and build defenses against advanced AI risks. | Agent, Eval Gate | 9 |
| Applied AI Engineer, Startups Applied AI Engineer role focused on advising and partnering with AI-native startups to build on the Claude Developer Platform. Responsibilities include technical guidance, developing evaluation frameworks, designing scalable architectures, and creating technical resources to help startups succeed with Claude. Requires production experience with LLM-powered applications, agent architectures, and evaluation frameworks. | Agent | 9 |