2,097 AI roles tagged evals.

| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Scale AI | Manager, Machine Learning Research Scientist, GenAI | Data AI | 9 | RL post-training · Agent research · Frontier research · Multimodal |
| Anthropic | Research Engineer, Model Evaluations | AI Frontier | 9 | LLM observability · Fine-tuning · Model serving · Agent research |
| OpenAI | Software Engineer, Distributed Data Systems - Robotics | AI Frontier | 9 | Multimodal |
| Meta | Research Scientist Manager, MetaAI Assistant Measurement | Big Tech | 9 | LLM observability · Multimodal |
| Anthropic | Research Engineer, Model Evaluations | AI Frontier | 9 | LLM observability · Fine-tuning · Model serving · Agent research · Guardrails · RL post-training |
| Scale AI | Evals Engineer, Applied AI | Data AI | 9 | LLM observability · Agent research · Fine-tuning · RL post-training |
| Anthropic | Research Engineer, Production Model Post-Training, London | AI Frontier | 9 | RL post-training · Fine-tuning · Model serving |
| Cohere | Forward Deployed Engineer, Prompt Specialist | AI Frontier | 9 | Agent orchestration · Tool use · RAG |
| Scale AI | Staff Machine Learning Research Scientist, LLM Evals | Data AI | 9 | Frontier research · LLM observability · Fine-tuning |
| Zillow | AI Applied Scientist - PhD Intern, Generative Computer Vision | Consumer | 9 | Vision · Multimodal · Fine-tuning |
| Anthropic | Research Engineer, Virtual Collaborator (Cowork) | AI Frontier | 9 | RL post-training · Reward modeling · Synthetic data |
| xAI | Member of Technical Staff - Mid-training | AI Frontier | 9 | Synthetic data · Multimodal · RL post-training |
| Scale AI | Machine Learning Research Engineer, Agent Data Foundation - Enterprise GenAI | Data AI | 9 | Synthetic data · RL post-training · Agent orchestration · Fine-tuning |
| Scale AI | Engineering Manager, AgentOps | Data AI | 9 | Agent orchestration · Agent research · Guardrails · RL post-training |
| OpenAI | Researcher, Pretraining Safety | AI Frontier | 9 | Pretraining · Frontier research · Model serving |
| Fireworks AI | Member of Technical Staff, Evals & Post-Training Product | Data AI | 9 | Fine-tuning |
| Zillow | AI Applied Scientist - PhD Intern, Foundational IQ | Consumer | 9 | Fine-tuning · Multimodal · Agent orchestration |
| Zillow | AI Applied Scientist - PhD Intern, 3D Computer Vision | Consumer | 9 | Vision · Multimodal · Fine-tuning |
| OpenAI | Offensive Security Engineer, Agent Products | AI Frontier | 9 | Agent orchestration · Tool use · Guardrails · Model serving · Inference infra |
| Gusto | Sr. Staff AI/ML Engineer | Fintech | 9 | Agent orchestration · RAG · Model serving · LLM observability · Guardrails |
| Cohere | Senior Research Scientist, Model Evaluation | AI Frontier | 9 | LLM observability · Fine-tuning |
| Wayve | Machine Learning Engineer | Robotics | 9 | Embodied AI · Model serving · Inference infra · Synthetic data · Fine-tuning |
| Anthropic | ML/Research Engineer, Safeguards | AI Frontier | 9 | Agent orchestration · Guardrails · Synthetic data · Agent research |
| Anthropic | Research Operations & Strategy Lead - Coding & Cybersecurity Data | AI Frontier | 9 | Agent research · Agent orchestration · Fine-tuning · RL post-training |
| Anthropic | Data Operations Manager - Computer Use & Tool Use | AI Frontier | 9 | Agent orchestration · RL post-training · Tool use · Agent research |
| Anthropic | Privacy Research Engineer, Safeguards | AI Frontier | 9 | Fine-tuning · RL post-training · Interpretability |
| Character AI | Research Engineer, AI Safety & Alignment | AI Frontier | 9 | Interpretability · RL post-training · Fine-tuning · Guardrails · LLM observability |
| OpenAI | Technical Lead, Safety Research | AI Frontier | 9 | RL post-training · Guardrails · Frontier research · Interpretability |
| OpenAI | Data Scientist, Codex | AI Frontier | 9 | Agent orchestration · Code gen |
| Anthropic | Research Engineer, Pretraining Scaling - London | AI Frontier | 9 | Pretraining · Model serving · Inference infra · LLM observability |