10 AI roles tagged `reward_modeling`.

| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| xAI | Member of Technical Staff - Post-Training and RL | AI Frontier | 9 | RL post-training · RLHF · Fine-tuning |
| Black Forest Labs | Member of Technical Staff - Post Training | Multimodal | 9 | Fine-tuning · RL post-training · Evals · Multimodal · Distillation · Model serving |
| Zillow | Principal Applied Scientist, Agentic AI | Consumer | 9 | RL post-training · RLHF · Fine-tuning · Guardrails · Agent orchestration · Evals · Multimodal · Vector DB |
| Crusoe | Senior Staff Software Engineer, AI Model LifeCycle | Data AI | 9 | Fine-tuning · Pretraining · RL post-training · Multimodal · LLM observability |
| Anthropic | Research Engineer, Reward Models Platform | AI Frontier | 9 | RL post-training · Fine-tuning · Evals · LLM observability |
| Anthropic | Team Manager, Alignment RL | AI Frontier | 9 | RL post-training · Synthetic data |
| Whatnot | Software Engineer, Trust & Risk | Consumer | 7 | Agent orchestration · Evals · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving · Recommender systems · Search & ranking · Vision · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI |
| Cohere | Data Annotation Specialist, Modern Standard Arabic (MSA) | AI Frontier | 5 | RLHF |
| AT&T | Full-Stack Software Engineer | Telecom | 5 | Agent orchestration · Tool use · LLM observability · RAG · Vector DB · Fine-tuning · Model serving · Recommender systems · Search & ranking · Vision · Multimodal · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI · Code gen |
| Cohere | Data Annotation Specialist, Simplified Chinese / Mandarin | AI Frontier | 5 | Synthetic data · RLHF · Evals |