Evals: designing benchmarks and automated scoring systems to measure model quality, safety, or capability, typically by blending classical metrics, LLM-as-judge, and human review. Primary AI lifecycle stage: evaluation.
2,040 active AI roles across 208 companies in our index reference Evals as of today.
The companies with the most active Evals listings are: Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), JPMorgan Chase (70 roles).
Although Evals belongs primarily to the evaluation stage of the AI lifecycle, current hiring for Evals roles concentrates in the agents stage (57%), followed by the evaluation stage (12%).
The sectors with the most active Evals hiring are: Big Tech, Enterprise, AI Frontier.
11 AI roles tagged Evals:
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| AT&T | Director Cybersecurity - AI/ML/Automation (Cyber Threat Analytics) | Telecom | 8 | Agent orchestration · Tool use · Guardrails · LLM observability · Fine-tuning · Inference infra · Model serving |
| T-Mobile | Sr Engineer, Machine Learning Engineering | Telecom | 8 | Agent orchestration · RAG · Fine-tuning · Model serving · LLM observability · Multimodal |
| T-Mobile | Sr. AI Engineer | Telecom | 8 | Agent orchestration · Tool use · Fine-tuning · RAG · LLM observability · Guardrails |
| AT&T | Principal AI - Software Engineer | Telecom | 8 | Agent orchestration · RAG · Vector DB · LLM observability |
| Verizon | Spec-Product Dev/Mgt | Telecom | 7 | Agent orchestration · RAG · Fine-tuning · LLM observability · Guardrails |
| AT&T | Sr Associate Cybersecurity - AI Security | Telecom | 7 | Guardrails |
| AT&T | Lead Cybersecurity - AI Security Engineer | Telecom | 7 | Guardrails · LLM observability |
| Verizon | Distinguished Network Security Engineer | Telecom | 7 | Agent orchestration · Tool use · Guardrails · LLM observability |
| AT&T | Sr Specialist Quality/M&P/Process - AI Training Manager | Telecom | 7 | Agent orchestration · Guardrails · LLM observability · Fine-tuning |
| AT&T | Lead Technical Product Manager – Applied AI | Telecom | 7 | Agent orchestration · Tool use · LLM observability · RAG · Multimodal |
| T-Mobile | Sr Analyst, Compliance | Telecom | 5 | Agent orchestration · Guardrails |