1,136 active AI roles across 160 companies mention LLM Evaluation & Grading. Category: LLM & Foundation Models.
New AI postings mentioning LLM Evaluation & Grading per week: 809 total over 12 weeks.
LLM Evaluation & Grading is a skill in the "LLM & Foundation Models" category. It currently appears in 1,136 active AI roles across 160 companies in our index.
The top employers with active AI roles mentioning LLM Evaluation & Grading are: Google (161), Amazon (105), Capital One (89), OpenAI (50), JPMorgan Chase (34).
Over the last 12 weeks, 809 new AI postings mentioned LLM Evaluation & Grading. Demand is rising: postings in the most recent four weeks are up 237% compared with the earliest four weeks of the window.
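For readers who want to reproduce this kind of growth figure, here is a minimal sketch of a first-four-weeks versus last-four-weeks comparison. The function name `window_growth_pct` and the weekly counts are hypothetical, chosen only for illustration; they are not the series behind this page. Whether the index compares sums or averages is not stated, and the sketch assumes sums (for equal-length windows the two give the same percentage).

```python
def window_growth_pct(weekly_counts, span=4):
    """Percent change from the earliest `span` weeks to the latest `span` weeks."""
    if len(weekly_counts) < 2 * span:
        raise ValueError("need at least 2 * span weeks of data")
    earliest = sum(weekly_counts[:span])   # first four weeks of the window
    latest = sum(weekly_counts[-span:])    # most recent four weeks
    if earliest == 0:
        raise ValueError("earliest window has no postings; growth is undefined")
    return (latest - earliest) / earliest * 100


# Hypothetical 12-week series, NOT the real data behind the 237% figure.
hypothetical_weeks = [10, 12, 8, 10, 15, 20, 25, 30, 30, 35, 35, 40]
growth = window_growth_pct(hypothetical_weeks)
print(f"{growth:+.0f}%")  # earliest = 40, latest = 140, so this prints +250%
```

Read this way, a 237% increase means the most recent four weeks had roughly 3.4 times as many postings as the earliest four.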
Roles requiring LLM Evaluation & Grading are concentrated in three stages: agents (49%), serving infrastructure (15%), and application (12%). These stages belong to a seven-stage AI lifecycle that runs from data preparation through to shipped product.
Job postings that mention LLM Evaluation & Grading most often also require: Machine Learning, Large Language Models (LLMs), Agentic Systems, Python, Production ML Systems.
11 AI roles requiring this skill.
| Company | Title | Sector | AI score | Stage |
|---|---|---|---|---|
| Anthropic | Product Manager, Claude Code Model Performance | AI Frontier | 9 | L4 |
| Cohere | Product Manager, Safety Research | AI Frontier | 9 | L6 |
| OpenAI | Agentic Risk Analyst | AI Frontier | 8 | L4 |
| Anthropic | Applied AI Architect Lead, EMEA Commercial | AI Frontier | 8 | L6 |
| Sierra | Product Manager, Agent Studio | AI Frontier | 8 | L4 |
| Sierra | Product Manager, Ghostwriter | AI Frontier | 8 | L4 |
| OpenAI | Data Scientist, Safety Systems | AI Frontier | 8 | L5 |
| Harvey | Senior Product Operations Manager, Evaluation | AI Frontier | 7 | L5 |
| Sierra | Agent Experience Designer, Voice (Multilingual) | AI Frontier | 7 | L6 |
| OpenAI | AI Deployment Manager - NYC | AI Frontier | 7 | L6 |
| Anthropic | Applied AI Architect, Enterprise Tech | AI Frontier | 7 | L4 |