Evals

Designing benchmarks and automated scoring systems to measure model quality, safety, or capability — typically blending classical metrics, LLM-as-judge, and human review.

Primary AI lifecycle stage: evaluation.

As of today, 2,040 active AI roles across 208 companies in our index reference Evals. Hiring concentrates at the agents (57%) and evaluation (12%) stages. Most common sectors: Big Tech, Enterprise, AI Frontier.

Roles by function: Engineering (2,406), Research (518), Product (411).

578 AI roles tagged evals (filtered to sector: Enterprise).

Company	Title	Sector	AI score	Other tags
Adobe	Staff Agentic ML Engineer - Photoshop	Enterprise	9	Agent orchestration · Tool use · Fine-tuning · Model serving · Multimodal · Vision · RL post-training · Reward modeling
Elastic	Principal Data Scientist - Agent Builder	Enterprise	9	Agent orchestration · RAG · LLM observability · Search & ranking · Model serving
Elastic	Principal Data Scientist - Agent Builder	Enterprise	9	Agent orchestration · RAG · LLM observability · Search & ranking · Model serving
Elastic	Principal Data Scientist - Agent Builder	Enterprise	9	Agent orchestration · RAG · LLM observability · Search & ranking · Model serving
Elastic	Principal Data Scientist - Agent Builder	Enterprise	9	Agent orchestration · RAG · LLM observability · Search & ranking · Model serving
Elastic	Principal Data Scientist - Agent Builder	Enterprise	9	Agent orchestration · RAG · LLM observability · Search & ranking · Model serving
Rubrik	Senior Machine Learning Engineer	Enterprise	9	Fine-tuning · RL post-training · Model serving · Inference infra · Synthetic data · Guardrails · LLM observability
CrowdStrike	Data Scientist, Agentic Systems (Remote)	Enterprise	9	Agent orchestration · Agent research · Tool use · Guardrails · Fine-tuning · RL post-training · LLM observability · Model serving
CrowdStrike	AI Engineer(Remote, IND)	Enterprise	9	Agent orchestration · Agent research · RAG · Vector DB · Fine-tuning · Model serving · LLM observability · Guardrails · Tool use
CrowdStrike	Lead AI Engineer, GTM Applications (Remote)	Enterprise	9	Agent orchestration · Agent research · RAG · Vector DB · LLM observability · Guardrails · Model serving · Inference infra
Oracle	Senior Principal AI Agent / ML Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Inference infra · Model serving
Oracle	Principal AI Agent / ML Software Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving
CrowdStrike	Director, Model Post-Training and Agentic Research (Remote)	Enterprise	9	RL post-training · Agent orchestration · Tool use · RLHF · Reward modeling · Agent research
CrowdStrike	Director, AI Alignment and Interpretability (Remote)	Enterprise	9	Interpretability · LLM observability · Guardrails
Axon	Senior Agentic AI Research Scientist	Enterprise	9	Agent orchestration · Multimodal · Vision · RAG · Vector DB · Inference infra · Model serving
Oracle	Applied Scientist 3	Enterprise	9	Synthetic data · Multimodal · Fine-tuning · LLM observability
CrowdStrike	Sr. AI/LLM Threat Researcher, Agentic Systems - AI Detection and Response (Hybrid)	Enterprise	9	Agent orchestration · Agent research · Guardrails · LLM observability · RAG · Tool use
Salesforce	Principal Architect	Enterprise	9	Agent orchestration · Multi-agent · Guardrails · LLM observability · RAG · Fine-tuning · Model serving · Agent research
Elastic	Principal Data Scientist - Agent Builder	Enterprise	9	Agent orchestration · RAG · LLM observability · Model serving · Search & ranking · Vector DB
Oracle	Principal AI Agent / ML Software Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Model serving · Inference infra · Agent research
Oracle	Senior Principal AI Agent / ML Software Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Model serving · Inference infra · Agent research
Oracle	Principal AI Agent / ML Software Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Inference infra · Model serving
Oracle	Senior Principal AI Agent / ML Software Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Inference infra · Model serving · Agent research
Oracle	Principal AI Agent / ML Software Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Model serving · Inference infra · Agent research
Oracle	Principal AI Agent / ML Software Engineer (OCI)	Enterprise	9	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Vector DB · Inference infra · Model serving
Okta	Staff Product Security Engineer	Enterprise	9	Agent orchestration · Agent research · Guardrails · LLM observability · Tool use
Adobe	Applied Scientist 5.5	Enterprise	9	Fine-tuning · Inference infra · Model serving · Multimodal
Box	Machine Learning Engineer III, Core Agents	Enterprise	9	Agent orchestration · RAG · LLM observability · Search & ranking
Oracle	Snr Director, Applied Science	Enterprise	9	Multimodal · Agent orchestration · Model serving · Inference infra · RAG · Guardrails · LLM observability · Vision · Audio & speech
Handshake	AI Red Teamer, CBRNE	Enterprise	9	Guardrails · Agent research

Frequently asked questions

What is Evals in AI?
Evals is the practice of designing benchmarks and automated scoring systems to measure model quality, safety, or capability, typically blending classical metrics, LLM-as-judge scoring, and human review. Its primary AI lifecycle stage is evaluation.

How many AI roles reference Evals right now?
As of today, 2,040 active AI roles across 208 companies in our index reference Evals.

Which companies are hiring for Evals roles?
The companies with the most active Evals listings are Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), and JPMorgan Chase (70 roles).

What AI lifecycle stage does Evals belong to?
Evals primarily belongs to the evaluation stage of the AI lifecycle. In current hiring, however, Evals roles concentrate at the agents (57%) and evaluation (12%) stages.

What sectors invest most in Evals?
The sectors with the most active Evals hiring are Big Tech, Enterprise, and AI Frontier.