Learning a scalar reward function — often from human or AI preference data — that scores LLM outputs during reinforcement-learning fine-tuning. Primary AI lifecycle stage: post-training.
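As a rough illustration of the definition above, the sketch below fits a scalar reward function to pairwise preference data using the Bradley-Terry style loss that is common in this setting. Everything in it is an assumption made for illustration: the linear reward over hand-made feature vectors, the toy `pairs` data, and the helper names (`reward`, `preference_loss`, `train`). Production reward models typically put a scalar head on an LLM instead.

```python
import math

def reward(weights, features):
    # Hypothetical linear reward: one scalar score per output.
    return sum(w * f for w, f in zip(weights, features))

def preference_loss(weights, chosen, rejected):
    # Bradley-Terry style loss: -log(sigmoid(r_chosen - r_rejected)),
    # written as a numerically stable softplus of the negative margin.
    margin = reward(weights, chosen) - reward(weights, rejected)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def train(pairs, dim, lr=0.1, epochs=200):
    # Plain stochastic gradient descent on the pairwise loss.
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # d(loss)/d(weights) = -(1 - sigmoid(margin)) * (chosen - rejected)
            scale = 1.0 - sigmoid(margin)
            for i in range(dim):
                weights[i] += lr * scale * (chosen[i] - rejected[i])
    return weights

# Toy preference data (assumed): each pair is (features of the preferred
# output, features of the rejected output).
pairs = [
    ([1.0, 0.2], [0.1, 0.9]),
    ([0.8, 0.1], [0.3, 0.7]),
    ([0.9, 0.4], [0.2, 0.8]),
]

initial_loss = sum(preference_loss([0.0, 0.0], c, r) for c, r in pairs)
weights = train(pairs, dim=2)
final_loss = sum(preference_loss(weights, c, r) for c, r in pairs)
```

After training, the learned `weights` should score each preferred output above its rejected counterpart; during RL fine-tuning, a score of this kind is what the policy is optimized against.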
31 active AI roles across 19 companies in our index reference Reward modeling as of today. New postings rose 36% in the last 30 days versus the prior 30 (11 → 15).
The companies with the most active Reward modeling listings are Amazon (9 roles), Adobe (2 roles), Cohere (2 roles), Deloitte (2 roles), and OpenAI (2 roles).
In current hiring, Reward modeling roles concentrate at the post-training (55%) and agents (26%) stages of the AI lifecycle.
The sectors with the most active Reward modeling hiring are Big Tech, AI Frontier, and Enterprise.
3 AI roles tagged reward_modeling.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Verizon | Director of Digital Customer Experience & AI Innovation | Telecom | 7 | LLM observability · Agent orchestration · Guardrails · RAG · Vector DB · Fine-tuning · Model serving · Recommender systems · Search & ranking · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI |
| AT&T | Senior Full Stack/AI Engineer | Telecom | 5 | LLM observability · Agent orchestration · RAG · Vector DB · Fine-tuning · Model serving · Recommender systems · Search & ranking · Vision · Multimodal · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI · Code gen |
| AT&T | Full-Stack Software Engineer | Telecom | 5 | Agent orchestration · Tool use · LLM observability · RAG · Vector DB · Fine-tuning · Model serving · Recommender systems · Search & ranking · Vision · Multimodal · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · RLHF · RL robotics · Embodied AI · Code gen |