Mechanistic analysis of how neural networks compute: circuit tracing, attribution, and behavior decomposition. It is the research field underpinning safety arguments about model internals. Primary AI lifecycle stage: evaluation.
82 active AI roles across 31 companies in our index reference Interpretability as of today. New postings fell 27% in the last 30 days versus the prior 30 (30 → 22).
The companies with the most active Interpretability listings are Microsoft (9 roles), OpenAI (8 roles), Amazon (7 roles), Meta (7 roles), and Anthropic (6 roles).
Interpretability primarily belongs to the evaluation stage of the AI lifecycle. In current hiring, Interpretability roles concentrate at the post-training (51%) and agents (17%) stages.
The sectors with the most active Interpretability hiring are Big Tech, AI Frontier, and Banking.
29 AI roles tagged interpretability.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Anthropic | Research Scientist, Interpretability | AI Frontier | 10 | LLM observability |
| Anthropic | [Expression of Interest] Research Manager, Interpretability | AI Frontier | 10 | |
| Anthropic | Anthropic AI Safety Fellow, Canada | AI Frontier | 10 | Frontier research |
| Anthropic | Anthropic AI Safety Fellow, UK | AI Frontier | 10 | Frontier research · Evals · Guardrails · RLHF |
| Anthropic | Anthropic AI Safety Fellow, US | AI Frontier | 10 | Frontier research · Evals · Guardrails · RL post-training |
| OpenAI | Researcher, Interpretability | AI Frontier | 10 | Frontier research |
| Anthropic | Research Engineer, Interpretability | AI Frontier | 10 | |
| Anthropic | Research Manager, Interpretability | AI Frontier | 10 | LLM observability |
| Anthropic | Research Scientist, Interpretability | AI Frontier | 10 | |
| OpenAI | Researcher, Alignment Training | AI Frontier | 9 | Synthetic data · RL post-training · Evals · Frontier research |
| OpenAI | Researcher, Alignment Science | AI Frontier | 9 | RL post-training · Evals · LLM observability · Guardrails |
| Anthropic | Anthropic Fellows Program — AI Safety | AI Frontier | 9 | Evals · Guardrails · RL post-training |
| OpenAI | Researcher, Safety & Privacy | AI Frontier | 9 | Evals · Guardrails · LLM observability |
| OpenAI | Threat Modeler, Preparedness | AI Frontier | 9 | Evals · Guardrails · Agent research |
| Stability AI | Multimodal Generative AI Researcher | AI Frontier | 9 | Fine-tuning · Multimodal · Vision · LLM observability · RAG · Agent research · Frontier research · Synthetic data · Model serving · Inference infra |
| OpenAI | Research Engineer, Privacy | AI Frontier | 9 | Evals |
| Lila Sciences | Scientist/Sr. Scientist, AI Safety | AI Frontier | 9 | Evals · Guardrails · Model serving · Frontier research |
| Anthropic | Anthropic Fellows Program | AI Frontier | 9 | Frontier research |
| Anthropic | Research Engineer, Interpretability | AI Frontier | 9 | Model serving · Inference infra · Fine-tuning |
| Anthropic | Privacy Research Engineer, Safeguards | AI Frontier | 9 | Fine-tuning · RL post-training · Evals |
| Character AI | Research Engineer, AI Safety & Alignment | AI Frontier | 9 | Evals · RL post-training · Fine-tuning · Guardrails · LLM observability |
| OpenAI | Technical Lead, Safety Research | AI Frontier | 9 | RL post-training · Evals · Guardrails · Frontier research |
| Anthropic | Research Engineer / Scientist, Model Welfare | AI Frontier | 9 | Evals |
| Anthropic | Research Engineer / Scientist, Alignment Science | AI Frontier | 9 | RL post-training · Evals · Agent research · Guardrails · LLM observability · Frontier research · Agent orchestration · RL robotics |
| OpenAI | Researcher, Safety Oversight | AI Frontier | 9 | Evals · Guardrails · RL post-training · Agent research |
| Anthropic | Research Engineer / Scientist, Safeguards | AI Frontier | 9 | RL post-training · Agent research · Evals · Guardrails · Agent orchestration · RL robotics |
| OpenAI | Researcher, Alignment | AI Frontier | 9 | RL post-training · Evals · Guardrails |
| Anthropic | Research Scientist, Societal Impacts | AI Frontier | 8 | Evals · Fine-tuning · Guardrails · LLM observability |
| xAI | Model Behavior Tutor - Epistemic Rigor & Truthfulness | AI Frontier | 8 | Evals · Guardrails · Agent research |