Evals

Designing benchmarks and automated scoring systems to measure model quality, safety, or capability — typically blending classical metrics, LLM-as-judge, and human review.
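As a rough illustration of how those three ingredients can fit together, here is a minimal sketch, not a reference implementation. Everything in it is an assumption made for illustration: the `EvalCase` fields, the `stub_judge` function (a token-overlap stand-in for a real LLM-as-judge call), the 50/50 weighting, and the score band that routes a case to human review.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class EvalCase:
    """One benchmark item (hypothetical schema)."""
    prompt: str
    reference: str
    model_output: str


def exact_match(case: EvalCase) -> float:
    """Classical metric: 1.0 if output equals reference after trimming and lowercasing."""
    return float(case.model_output.strip().lower() == case.reference.strip().lower())


def stub_judge(case: EvalCase) -> float:
    """Hypothetical stand-in for an LLM-as-judge call.

    Returns the Jaccard overlap of lowercase tokens; a real judge would
    prompt a model with a rubric and parse a score instead.
    """
    out = set(case.model_output.lower().split())
    ref = set(case.reference.lower().split())
    if not out or not ref:
        return 0.0
    return len(out & ref) / len(out | ref)


def score_case(
    case: EvalCase,
    judge: Callable[[EvalCase], float] = stub_judge,
    metric_weight: float = 0.5,                   # assumed weighting
    review_band: Tuple[float, float] = (0.3, 0.7),  # assumed ambiguity band
) -> Dict[str, object]:
    """Blend the classical metric with the judge score and flag ambiguous cases."""
    metric = exact_match(case)
    judged = judge(case)
    blended = metric_weight * metric + (1.0 - metric_weight) * judged
    low, high = review_band
    return {
        "metric": metric,
        "judge": judged,
        "blended": blended,
        # Scores in the middle band are the ones sent to a human reviewer.
        "needs_human_review": low <= blended <= high,
    }


if __name__ == "__main__":
    case = EvalCase("Capital of France?", "Paris", "The capital is Paris")
    print(score_case(case))
```

The design choice worth noting is the review band: clearly good and clearly bad outputs are scored automatically, and only middling scores consume human reviewer time.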

Primary AI lifecycle stage: evaluation.

As of today, 2,040 active AI roles across 208 companies in our index reference Evals. Hiring concentrates at the agents (57%) and evaluation (12%) stages. Most common sectors: Big Tech, Enterprise, AI Frontier.

42 AI roles tagged evals.

Company	Title	Sector	AI score	Other tags
Disney	Lead Machine Learning Engineer	Media	9	Agent orchestration · Agent research · Multimodal · RAG · LLM observability · Guardrails · Model serving · Inference infra
Warner Bros Discovery	Manager, Machine Learning Engineering	Media	8	Model serving
Disney	Lead Product Manager, AI Platform	Media	8	Agent orchestration · RAG · LLM observability · Model serving
Disney	Director, Decision Science AI/ML Engineering & Ops	Media	8	Model serving · Inference infra · LLM observability · Guardrails
Comcast	Engineer 4 - Machine Learning	Media	8	Agent orchestration · LLM observability · Model serving · Fine-tuning · Guardrails
Disney	Omni-Channel Analytics Mgr	Media	8	Agent orchestration · Agent research · Fine-tuning · Guardrails · LLM observability
Disney	Manager - Applied AI	Media	8	Agent orchestration · Tool use · RAG · LLM observability · Model serving
Disney	Staff GenAI/ML Engineer (Emerging Tech & AI Automation) Project Hire	Media	8	Agent orchestration · RAG · Fine-tuning · Model serving · Vector DB · LLM observability
The Trade Desk	Staff Product Manager, Agentic AI	Media	8	Agent orchestration · LLM observability · Guardrails
Comcast	Software Engineer - Agentic AI	Media	8	Agent orchestration · Tool use · LLM observability · RAG · Agent research
Comcast	Comcast AI Research Intern	Media	8	Fine-tuning · RL post-training · Synthetic data · Agent research
Comcast	Principal Engineer - Agentic AI	Media	8	Agent orchestration · Agent research · LLM observability · Tool use
Disney	Sr Data Scientist	Media	8	Multimodal · Fine-tuning · RAG · Inference infra · Model serving · Vector DB
Comcast	Software Engineering Manager, AI Agents	Media	8	Agent orchestration · Tool use · Guardrails · LLM observability · Model serving
Comcast	Engineer 3, Agentic AI	Media	8	Agent orchestration · Tool use · LLM observability · Agent research · Code gen
Disney	Staff GenAI/ML Engineer (Emerging Tech & AI Automation) Project Hire	Media	8	Agent orchestration · RAG · Vector DB · Fine-tuning · Model serving · LLM observability
Comcast	Machine Learning Engineer 4	Media	8	Agent orchestration · LLM observability · Guardrails · Model serving · Inference infra
Comcast	Sr. Software Engineer - Agentic AI	Media	8	Agent orchestration · Agent research · LLM observability · RAG
Comcast	Agent Evaluation Engineer	Media	8	Agent orchestration · LLM observability · Guardrails
Warner Bros Discovery	Sr. Staff, Data Science & Applied AI	Media	8	Agent orchestration · RAG · Guardrails · LLM observability · Model serving
Disney	Lead Data Scientist, Ad Research	Media	8	Agent orchestration · Multimodal · Vision
Disney	Senior Machine Learning Engineer, Ad Platforms	Media	8	Agent orchestration · Multimodal · Fine-tuning · Model serving · Audio & speech
Disney	Lead Machine Learning Engineer, Ad Platforms	Media	8	Recommender systems · Search & ranking · Fine-tuning · RAG · LLM observability · Multimodal · Vision
Disney	VP, Analytics Engineering & DnA Operations	Media	7	Agent orchestration · Guardrails · LLM observability
Comcast	Development Engineer in Test (SDET) – ML & LLM Systems	Media	7	LLM observability · Fine-tuning · Model serving
Comcast	Software Development Engineer in Test (SDET) – ML & LLM Systems	Media	7	LLM observability · Fine-tuning · Model serving
Comcast	Quality Engineering Lead – Agent Evaluation & AI Platforms	Media	7	Agent orchestration · Guardrails · LLM observability
Comcast	Agentic AI Test Engineer	Media	7	Agent orchestration · LLM observability
Comcast	Sr. Python Engineer, Agentic AI	Media	7	Agent orchestration · Tool use · Guardrails · LLM observability
Warner Bros Discovery	Manager, Machine Learning Engineer	Media	7	Model serving

Frequently asked questions

What is Evals in AI?
Designing benchmarks and automated scoring systems to measure model quality, safety, or capability — typically blending classical metrics, LLM-as-judge, and human review. Primary AI lifecycle stage: evaluation.
How many AI roles reference Evals right now?
2,040 active AI roles across 208 companies in our index reference Evals as of today.
Which companies are hiring for Evals roles?
The companies with the most active Evals listings are: Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), JPMorgan Chase (70 roles).
What AI lifecycle stage does Evals belong to?
Evals primarily belongs to the evaluation stage of the AI lifecycle. In current hiring, however, roles that reference Evals concentrate in the agents (57%) and evaluation (12%) stages.
What sectors invest most in Evals?
The sectors with the most active Evals hiring are: Big Tech, Enterprise, AI Frontier.