Evals is the practice of designing benchmarks and automated scoring systems to measure model quality, safety, or capability, typically blending classical metrics, LLM-as-judge, and human review. Primary AI lifecycle stage: evaluation.
As of today, 2,040 active AI roles across 208 companies in our index reference Evals.
The companies with the most active Evals listings are Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), and JPMorgan Chase (70 roles).
Evals belongs primarily to the evaluation stage of the AI lifecycle. In current hiring, however, Evals roles concentrate at the agents (57%) and evaluation (12%) stages.
The sectors with the most active Evals hiring are Big Tech, Enterprise, and AI Frontier.
578 AI roles tagged evals.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Adobe | Staff Agentic ML Engineer - Photoshop | Enterprise | 9 | Agent orchestration · Tool use · Fine-tuning · Model serving · Multimodal · Vision · RL post-training · Reward modeling |
| Elastic | Principal Data Scientist - Agent Builder | Enterprise | 9 | Agent orchestration · RAG · LLM observability · Search & ranking · Model serving |
| Elastic | Principal Data Scientist - Agent Builder | Enterprise | 9 | Agent orchestration · RAG · LLM observability · Search & ranking · Model serving |
| Elastic | Principal Data Scientist - Agent Builder | Enterprise | 9 | Agent orchestration · RAG · LLM observability · Search & ranking · Model serving |
| Elastic | Principal Data Scientist - Agent Builder | Enterprise | 9 | Agent orchestration · RAG · LLM observability · Search & ranking · Model serving |
| Elastic | Principal Data Scientist - Agent Builder | Enterprise | 9 | Agent orchestration · RAG · LLM observability · Search & ranking · Model serving |
| Rubrik | Senior Machine Learning Engineer | Enterprise | 9 | Fine-tuning · RL post-training · Model serving · Inference infra · Synthetic data · Guardrails · LLM observability |
| CrowdStrike | Data Scientist, Agentic Systems (Remote) | Enterprise | 9 | Agent orchestration · Agent research · Tool use · Guardrails · Fine-tuning · RL post-training · LLM observability · Model serving |
| CrowdStrike | AI Engineer (Remote, IND) | Enterprise | 9 | Agent orchestration · Agent research · RAG · Vector DB · Fine-tuning · Model serving · LLM observability · Guardrails · Tool use |
| CrowdStrike | Lead AI Engineer, GTM Applications (Remote) | Enterprise | 9 | Agent orchestration · Agent research · RAG · Vector DB · LLM observability · Guardrails · Model serving · Inference infra |
| Oracle | Senior Principal AI Agent / ML Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Inference infra · Model serving |
| Oracle | Principal AI Agent / ML Software Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving |
| CrowdStrike | Director, Model Post-Training and Agentic Research (Remote) | Enterprise | 9 | RL post-training · Agent orchestration · Tool use · RLHF · Reward modeling · Agent research |
| CrowdStrike | Director, AI Alignment and Interpretability (Remote) | Enterprise | 9 | Interpretability · LLM observability · Guardrails |
| Axon | Senior Agentic AI Research Scientist | Enterprise | 9 | Agent orchestration · Multimodal · Vision · RAG · Vector DB · Inference infra · Model serving |
| Oracle | Applied Scientist 3 | Enterprise | 9 | Synthetic data · Multimodal · Fine-tuning · LLM observability |
| CrowdStrike | Sr. AI/LLM Threat Researcher, Agentic Systems - AI Detection and Response (Hybrid) | Enterprise | 9 | Agent orchestration · Agent research · Guardrails · LLM observability · RAG · Tool use |
| Salesforce | Principal Architect | Enterprise | 9 | Agent orchestration · Multi-agent · Guardrails · LLM observability · RAG · Fine-tuning · Model serving · Agent research |
| Elastic | Principal Data Scientist - Agent Builder | Enterprise | 9 | Agent orchestration · RAG · LLM observability · Model serving · Search & ranking · Vector DB |
| Oracle | Principal AI Agent / ML Software Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Model serving · Inference infra · Agent research |
| Oracle | Senior Principal AI Agent / ML Software Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Model serving · Inference infra · Agent research |
| Oracle | Principal AI Agent / ML Software Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Inference infra · Model serving |
| Oracle | Senior Principal AI Agent / ML Software Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Inference infra · Model serving · Agent research |
| Oracle | Principal AI Agent / ML Software Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Model serving · Inference infra · Agent research |
| Oracle | Principal AI Agent / ML Software Engineer (OCI) | Enterprise | 9 | Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Inference infra · Model serving |
| Okta | Staff Product Security Engineer | Enterprise | 9 | Agent orchestration · Agent research · Guardrails · LLM observability · Tool use |
| Adobe | Applied Scientist 5.5 | Enterprise | 9 | Fine-tuning · Inference infra · Model serving · Multimodal |
| Box | Machine Learning Engineer III, Core Agents | Enterprise | 9 | Agent orchestration · RAG · LLM observability · Search & ranking |
| Oracle | Snr Director, Applied Science | Enterprise | 9 | Multimodal · Agent orchestration · Model serving · Inference infra · RAG · Guardrails · LLM observability · Vision · Audio & speech |
| Handshake | AI Red Teamer, CBRNE | Enterprise | 9 | Guardrails · Agent research |