Evals

Designing benchmarks and automated scoring systems to measure model quality, safety, or capability — typically blending classical metrics, LLM-as-judge, and human review.

Primary AI lifecycle stage: evaluation.

As of today, 2,040 active AI roles across 208 companies in our index reference Evals. Hiring concentrates at the agents (57%) and evaluation (12%) stages. Most common sectors: Big Tech, Enterprise, AI Frontier.

Filtered by sector: Fintech. 125 AI roles tagged Evals.

Company	Title	Sector	AI score	Other tags
Block	Principal Engineer, AI Systems	Fintech	9	Agent orchestration · Tool use · RAG · Fine-tuning · LLM observability · Agent research
Ramp	Senior Growth Operator, Partner	Fintech	9	Agent orchestration · Tool use · Guardrails
SoFi	Staff AI Engineer	Fintech	9	Agent orchestration · Tool use · LLM observability · RAG
Upstart	Principal Engineer, LLM	Fintech	9	Inference infra · Model serving · RAG · Vector DB · LLM observability
Ramp	Agentic Operator, Growth Marketing	Fintech	9	Agent orchestration · Tool use · Guardrails · RAG · Fine-tuning · LLM observability
Gusto	Sr. Staff AI/ML Engineer	Fintech	9	Agent orchestration · RAG · Model serving · LLM observability · Guardrails
PayPal	Staff Machine Learning Engineer	Fintech	8	Agent orchestration · Tool use · Guardrails · Fine-tuning · Recommender systems · Agent research
Plaid	Engineering Manager, AI Applications	Fintech	8	Agent orchestration · Fine-tuning · RAG · Vector DB · LLM observability · RLHF · Agent research
Plaid	Staff Software Engineer - Instant Access	Fintech	8	Agent orchestration · LLM observability · Guardrails · Model serving · Code gen
Block	Staff Applied Machine Learning Engineer - Fraud & Abuse	Fintech	8	Model serving · Inference infra · Agent orchestration
Block	Senior Machine Learning Engineer, Model Risk Management	Fintech	8	Agent orchestration · LLM observability
Robinhood	Senior Software Engineer	Fintech	8	Model serving · Inference infra · Fine-tuning
Ripple	Staff Software Engineer, GenAI Platform	Fintech	8	Agent orchestration · Tool use · Guardrails · LLM observability · RAG · Inference infra · Model serving
Robinhood	Senior Engineering Manager, Agentic AI	Fintech	8	Agent orchestration · Tool use · LLM observability · Model serving
Gusto	AI Solutions Architect	Fintech	8	Agent orchestration · Tool use · Guardrails · LLM observability
Visa	Senior Manager- (Agentic AI Development/Solution/Client facing)	Fintech	8	Agent orchestration · Tool use · RAG · Fine-tuning · LLM observability
Betterment	Sr. Staff Engineer	Fintech	8	Agent orchestration · LLM observability · Guardrails · Model serving
Visa	Staff ML Scientist	Fintech	8	Model serving · Fine-tuning
Visa	Senior Machine Learning Scientist	Fintech	8	Fine-tuning · Model serving
Visa	Client Consulting Analyst	Fintech	8	Agent orchestration · RAG · Guardrails · Fine-tuning
SoFi	Director, AI Platforms	Fintech	8	Model serving · Inference infra · Agent orchestration · RAG · LLM observability · Guardrails
Robinhood	Staff Product Manager, Cortex	Fintech	8	Agent orchestration · RAG · Guardrails · LLM observability
Stripe	Machine Learning Engineer, Support Experience	Fintech	8	Agent orchestration · Tool use · RAG · Fine-tuning · LLM observability
Polymarket	AI Ops Specialist	Fintech	8	Agent orchestration · Tool use · RAG · LLM observability
Mercury	Senior Software Engineer - AI Engineering	Fintech	8	Agent orchestration · RAG · Guardrails · LLM observability · Model serving · Inference infra
PayPal	Sr Machine Learning Engineer	Fintech	8	Agent orchestration · Tool use · Agent research · LLM observability · RAG · Fine-tuning · Model serving
Carta	Senior Machine Learning Engineer II	Fintech	8	Agent orchestration · LLM observability
Upstart	Principal Product Manager, Agentic Platform	Fintech	8	Agent orchestration · Tool use · RAG · Guardrails · LLM observability
PayPal	Sr Machine Learning Engineer — Agentic Systems	Fintech	8	Agent orchestration · Tool use · Guardrails · RAG · Fine-tuning · Model serving
SoFi	Principal Product Manager, AI SDLC	Fintech	8	Agent orchestration · Tool use · LLM observability · Fine-tuning · Model serving

Frequently asked questions

What is Evals in AI?

Evals means designing benchmarks and automated scoring systems to measure model quality, safety, or capability, typically blending classical metrics, LLM-as-judge, and human review. Primary AI lifecycle stage: evaluation.

How many AI roles reference Evals right now?

As of today, 2,040 active AI roles across 208 companies in our index reference Evals.

Which companies are hiring for Evals roles?

The companies with the most active Evals listings are Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), and JPMorgan Chase (70 roles).

What AI lifecycle stage does Evals belong to?

Evals primarily belongs to the evaluation stage of the AI lifecycle. In current hiring, however, Evals roles concentrate at the agents stage (57%), followed by the evaluation stage (12%).

What sectors invest most in Evals?

The sectors with the most active Evals hiring are Big Tech, Enterprise, and AI Frontier.