Designing benchmarks and automated scoring systems to measure model quality, safety, or capability — typically blending classical metrics, LLM-as-judge, and human review.
Primary AI lifecycle stage: evaluation.
As of today, 2,040 active AI roles across 208 companies in our index reference Evals. Hiring concentrates at the agents (57%) and evaluation (12%) stages. Most common sectors: Big Tech, Enterprise, AI Frontier.
The companies with the most active Evals listings are Amazon (188 roles), Google (153), OpenAI (95), Microsoft (73), and JPMorgan Chase (70).
12 AI roles tagged evals.
| Company | Title | Location | AI score | Other tags |
|---|---|---|---|---|
| Smartsheet | Sr Principal Data Scientist | Seattle | 8 | Agent orchestration · Tool use · Guardrails · RAG · Fine-tuning · Model serving · Recommender systems · LLM observability |
| Smartsheet | Senior Software Engineer II - Applied AI (Remote Eligible) | Seattle | 8 | RAG · LLM observability · Agent orchestration · Model serving |
| Smartsheet | Senior Forward Deployed AI Engineer (Remote Eligible in the UK) | Seattle | 8 | Agent orchestration · Tool use · RAG |
| Smartsheet | Senior Forward Deployed AI Engineer (Remote Eligible in Germany) | Seattle | 8 | Agent orchestration · Tool use · RAG · LLM observability |
| Smartsheet | Senior Forward Deployed AI Engineer (Remote Eligible) | Seattle | 8 | Agent orchestration · Tool use · RAG |
| Qualtrics | Senior Software Engineer - Experience Agents | Seattle | 7 | Agent orchestration · LLM observability · Model serving · RAG |
| Smartsheet | Senior Security Engineer II, Application Security (Remote Eligible) | Seattle | 7 | Agent orchestration · Guardrails · LLM observability |
| Smartsheet | Senior Product Manager - Applied AI (Remote Eligible) | Seattle | 7 | Agent orchestration · RAG |
| Smartsheet | Senior Software Engineer II - Applied AI and Evaluations (Remote Eligible) | Seattle | 7 | LLM observability · RAG · Agent orchestration · Fine-tuning |
| Smartsheet | Senior Manager, Engineering - Observability Platform (Remote Eligible) | Seattle | 5 | LLM observability · Agent orchestration · Model serving |
| Redfin | Director, Quality Engineering | Seattle | 5 | Agent orchestration · Guardrails · LLM observability · Tool use |
| Smartsheet | Senior AI & Data Governance Engineer-II (Hybrid in Bangalore) | Seattle | 5 | Guardrails · LLM observability · RAG |