24 AI roles tagged reward_modeling.

| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Cognition | Research, Post-Training | Coding AI | 9 | RL post-training · RLHF · Evals · Agent research · Agent orchestration |
| xAI | Member of Technical Staff - Post-Training and RL | AI Frontier | 9 | RL post-training · RLHF · Fine-tuning |
| Black Forest Labs | Member of Technical Staff - Post Training | Multimodal | 9 | Fine-tuning · RL post-training · Evals · Multimodal · Distillation · Model serving |
| Snorkel AI | Research Scientist - RL Training | Data AI | 9 | RL post-training · Fine-tuning · Synthetic data |
| Zillow | Principal Applied Scientist, Agentic AI | Consumer | 9 | RL post-training · RLHF · Fine-tuning · Guardrails · Agent orchestration · Evals · Multimodal · Vector DB |
| Crusoe | Senior Staff Software Engineer, AI Model LifeCycle | Data AI | 9 | Fine-tuning · Pretraining · RL post-training · Multimodal · LLM observability |
| Baseten | Post-Training Applied Researcher | Data AI | 9 | Fine-tuning · RL post-training · Agent orchestration · Tool use · Evals · Synthetic data · Model serving |
| OpenAI | Research Engineer/Scientist - Human Alignment, Consumer Devices | AI Frontier | 9 | RL post-training · Multimodal · Evals |
| Canva | Senior Research Scientist - Reinforcement Learning, MoEs | Enterprise | 9 | RL post-training · RLHF · Agent orchestration · Tool use · Multimodal · Model serving · Frontier research · Evals |
| Canva | Senior Research Scientist - Reinforcement Learning, MoEs | Enterprise | 9 | RL post-training · Frontier research · Agent orchestration · Multimodal · Model serving · Fine-tuning · Evals · Agent research |
| Anthropic | Research Engineer, Environment Scaling | AI Frontier | 9 | RL post-training · Fine-tuning · Synthetic data · Evals |
| Anthropic | Senior Research Scientist, Reward Models | AI Frontier | 9 | RL post-training · Evals · LLM observability · Frontier research |
| Anthropic | Research Engineer, Reward Models Platform | AI Frontier | 9 | RL post-training · Fine-tuning · Evals · LLM observability |
| Anthropic | Research Engineer, Virtual Collaborator (Cowork) | AI Frontier | 9 | RL post-training · Synthetic data · Evals |
| Anthropic | Research Engineer, Reward Models | AI Frontier | 9 | RL post-training · Fine-tuning · LLM observability |
| Scale AI | Machine Learning Research Scientist, Post-Training | Data AI | 9 | RL post-training · Fine-tuning · Multimodal · Evals · Frontier research |
| Anthropic | Team Manager, Alignment RL | AI Frontier | 9 | RL post-training · Synthetic data |
| Adobe | Senior Applied Scientist | Enterprise | 8 | Fine-tuning · RL post-training · Model serving · Multimodal · Vision |
| Walmart | Senior, Data Scientist | Retail | 8 | Evals · Vision · Multimodal · Fine-tuning · RLHF |
| Whatnot | Software Engineer, Trust & Risk | Consumer | 7 | Agent orchestration · Evals · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving · Recommender systems · Search & ranking · Vision · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI |
| Cohere | Data Annotation Specialist, Modern Standard Arabic (MSA) | AI Frontier | 5 | RLHF |
| AT&T | Full-Stack Software Engineer | Telecom | 5 | Agent orchestration · Tool use · LLM observability · RAG · Vector DB · Fine-tuning · Model serving · Recommender systems · Search & ranking · Vision · Multimodal · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI · Code gen |
| Replit | Product Lead, Growth Marketing | Enterprise | 5 | Agent orchestration · RAG · Vector DB · Fine-tuning · Inference infra · Model serving · Recommender systems · Search & ranking · Multimodal · Evals · Guardrails · LLM observability · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI |
| Cohere | Data Annotation Specialist, Simplified Chinese / Mandarin | AI Frontier | 5 | Synthetic data · RLHF · Evals |