Designing benchmarks and automated scoring systems to measure model quality, safety, or capability — typically blending classical metrics, LLM-as-judge, and human review.
Primary AI lifecycle stage: evaluation.
As of today, 2,040 active AI roles across 208 companies in our index reference Evals. Hiring concentrates at the agents (57%) and evaluation (12%) stages. Most common sectors: Big Tech, Enterprise, AI Frontier.
The companies with the most active Evals listings are Amazon (188 roles), Google (153 roles), OpenAI (95 roles), Microsoft (73 roles), and JPMorgan Chase (70 roles).
69 AI roles tagged evals.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Wayve | Engineering Internship, Enrichment and Curation | Robotics | 9 | Embodied AI · Multimodal · Vision · Pretraining · Fine-tuning |
| Wayve | Tech Lead, Autonomy Performance - Robotaxi | Robotics | 9 | Embodied AI · Model serving · Inference infra |
| Apptronik | Staff MLOps Engineer | Robotics | 9 | Model serving · Inference infra |
| Wayve | Machine Learning Engineer, App SW | Robotics | 9 | Embodied AI · Model serving · Synthetic data |
| Wayve | Director AV Product Engineering | Robotics | 9 | Embodied AI · Model serving · Fine-tuning |
| Wayve | Staff Machine Learning Engineer, AV Core | Robotics | 9 | Embodied AI · Multimodal · Vision · Fine-tuning · Model serving · Interpretability · Pretraining · Agent orchestration |
| Wayve | Applied Scientist, Wayve Labs | Robotics | 9 | Embodied AI · Multimodal · Vision · Frontier research · Model serving |
| Wayve | Tech Lead, ML Engineer - AV Product engineering | Robotics | 9 | Embodied AI · Multimodal · Vision · Model serving · Fine-tuning |
| Wayve | Principal Machine Learning Engineer, App SW | Robotics | 9 | Embodied AI · Model serving · Inference infra · Synthetic data · Fine-tuning |
| Wayve | Machine Learning Engineer, AV Engineering | Robotics | 9 | Embodied AI · Fine-tuning · Synthetic data |
| Wayve | Machine Learning Engineer, App SW | Robotics | 9 | Embodied AI · Model serving · Synthetic data |
| Wayve | Machine Learning Engineer | Robotics | 9 | Embodied AI · Model serving · Inference infra · Synthetic data · Fine-tuning |
| Figure AI | AI Engineer, Post-Training - Helix Team | Robotics | 9 | Fine-tuning · Guardrails · RL post-training · Embodied AI |
| Applied Intuition | Research Engineer - AI/RL Infrastructure | Robotics | 9 | Training infra · Multimodal |
| Wayve | Director of Platform Management for Simulation, Evaluation & Validation | Robotics | 8 | Embodied AI · Synthetic data |
| Nuro | Technical Lead, Evaluation Infrastructure | Robotics | 8 | LLM observability |
| Wayve | Autonomy Systems & Release Engineer | Robotics | 8 | Embodied AI · Model serving |
| Applied Intuition | Engineer Manager - ML Data and Evaluation, Self-Driving Systems | Robotics | 8 | Multimodal |
| Waabi | Senior / Staff Applied Scientist | Robotics | 8 | |
| Applied Intuition | Engineering Manager - ML, Self-Driving Systems | Robotics | 8 | Model serving · Inference infra · Fine-tuning |
| Agility Robotics | Staff AI Engineer, Perception | Robotics | 8 | Vision · Model serving · Inference infra |
| Applied Intuition | Sensor Sim - ML Engineer | Robotics | 8 | Agent orchestration |
| Applied Intuition | Software Engineer - AI Engineering | Robotics | 8 | Agent orchestration · Tool use · Inference infra · Model serving · LLM observability |
| Agility Robotics | AI Intern | Robotics | 8 | RL robotics · Embodied AI · Fine-tuning |
| Applied Intuition | Senior Autonomy Engineer | Robotics | 8 | Model serving · Inference infra · Fine-tuning |
| Nuro | Technical Lead Manager, Autonomy Evaluation and Intelligence | Robotics | 8 | Agent research · Embodied AI · Agent orchestration · Model serving |
| Applied Intuition | Software Engineer - E2E Autonomy | Robotics | 8 | Inference infra · Model serving |
| Joby Aviation | Flight Research Senior Machine Learning Engineer | Robotics | 8 | Fine-tuning · Inference infra · Model serving |
| Joby Aviation | Staff Flight Research Machine Learning Engineer | Robotics | 8 | Model serving · Inference infra |
| Joby Aviation | Senior AI Engineer | Robotics | 8 | Agent orchestration · Model serving · Inference infra · RAG · Vector DB · LLM observability · Guardrails |