Evals

Designing benchmarks and automated scoring systems to measure model quality, safety, or capability — typically blending classical metrics, LLM-as-judge, and human review.
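To make the "blending" in this definition concrete, here is a minimal sketch, assuming each signal has already been normalized to the range 0 to 1. The function names, the default weights, and the idea of renormalizing when no human rating exists are illustrative assumptions, not a description of how any listed employer scores models. The LLM-as-judge score is taken as an input here; producing it would require a model call that is out of scope for this sketch.

```python
from typing import Optional, Tuple


def exact_match(prediction: str, reference: str) -> float:
    """Classical metric: 1.0 if the normalized strings match, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def blended_score(
    classical: float,
    judge: float,
    human: Optional[float] = None,
    weights: Tuple[float, float, float] = (0.4, 0.4, 0.2),  # hypothetical weights
) -> float:
    """Weighted average of the available signals, each assumed to lie in [0, 1].

    If no human rating is available, the remaining weights are renormalized
    so the result still lies in [0, 1].
    """
    parts = [(classical, weights[0]), (judge, weights[1])]
    if human is not None:
        parts.append((human, weights[2]))
    total_weight = sum(w for _, w in parts)
    return sum(score * w for score, w in parts) / total_weight


if __name__ == "__main__":
    classical = exact_match(" Paris ", "paris")
    # The judge score of 0.5 is a placeholder for an LLM-as-judge result.
    print(blended_score(classical, judge=0.5))
    print(blended_score(classical, judge=0.5, human=0.0))
```

In practice the weights would be tuned against whatever the team treats as ground truth, and human review is often applied only to a sample, which is why the sketch allows it to be missing.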

Primary AI lifecycle stage: evaluation.

As of today, 2,040 active AI roles across 208 companies in our index reference Evals. Hiring concentrates at the agents (57%) and evaluation (12%) stages. Most common sectors: Big Tech, Enterprise, AI Frontier.

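Top hiring: Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), JPMorgan Chase (70 roles).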

57 AI roles tagged evals.

Company	Title	Sector	AI score	Other tags
Target	Principal AI Engineer - Advanced AI (Machine Learning, Python, Deep Learning)	Retail	9	Agent orchestration · LLM observability · Model serving · Inference infra
Walmart	Senior Data Scientist: Associate AI Experience	Retail	9	Agent orchestration · Tool use · RAG · Vector DB · Fine-tuning · Model serving · LLM observability
Walmart	Principal Data Scientist: Associate AI experience	Retail	9	Agent orchestration · Tool use · RAG · Vector DB · Fine-tuning · Model serving · Multi-agent
Walmart	Distinguished Data Scientist: Associate AI Experience	Retail	9	Agent orchestration · Tool use · RAG · Vector DB · Fine-tuning · Model serving · Multi-agent
Walmart	Principal, Data Scientist	Retail	9	Agent orchestration · Inference infra · Model serving · RAG · Vector DB · LLM observability · Guardrails
Walmart	(USA) Distinguished, Data Scientist	Retail	9	Agent orchestration · Agent research · Tool use · RAG · Vector DB · Fine-tuning · Model serving · Multimodal
Walmart	Staff, Data Scientist – Conversational AI	Retail	9	Agent orchestration · Tool use · Guardrails · RAG · Fine-tuning · Model serving · LLM observability
Walmart	Principal, Data Scientist	Retail	9	Agent orchestration · Tool use · RAG · Vector DB · Model serving · Inference infra
Walmart	(USA) Staff, Data Scientist	Retail	9	Agent orchestration · Tool use · RAG · Agent research
Target	Lead Engineer- Advanced AI	Retail	8	Agent orchestration · Tool use · RAG · LLM observability · Model serving · Inference infra

Frequently asked questions

What is Evals in AI?

Designing benchmarks and automated scoring systems to measure model quality, safety, or capability, typically blending classical metrics, LLM-as-judge, and human review. Primary AI lifecycle stage: evaluation.

How many AI roles reference Evals right now?

As of today, 2,040 active AI roles across 208 companies in our index reference Evals.

Which companies are hiring for Evals roles?

The companies with the most active Evals listings are Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), and JPMorgan Chase (70 roles).

What AI lifecycle stage does Evals belong to?

Evals primarily belongs to the evaluation stage of the AI lifecycle. In current hiring, Evals roles concentrate at the agents (57%) and evaluation (12%) stages.

What sectors invest most in Evals?

The sectors with the most active Evals hiring are Big Tech, Enterprise, and AI Frontier.