New AI postings mentioning Reward Modeling per week — 81 total over 12 weeks.
106 AI roles requiring this skill.
| Company | Title | Sector | AI score | Stage |
|---|---|---|---|---|
| OpenAI | Researcher, Context - Agent Post-Training | AI Frontier | 10 | L2 |
| OpenAI | Researcher, Connectors - Agent Post-Training | AI Frontier | 10 | L2 |
| OpenAI | Researcher, Computer Use - Agent Post-Training | AI Frontier | 10 | L2 |
| OpenAI | Researcher, Interpretability | AI Frontier | 10 | L2 |
| Anthropic | Research Engineer, Domain Scaling | AI Frontier | 9 | L0 |
| Amazon | Senior Applied Scientist, Safe Locomotion, Compass | Big Tech | 9 | L4 |
| CrowdStrike | Data Scientist, Agentic Systems (Remote) | Enterprise | 9 | L4 |
| CrowdStrike | Director, Model Post-Training and Agentic Research (Remote) | Enterprise | 9 | L2 |
| OpenAI | Researcher, Agent Post-Training, Personality | AI Frontier | 9 | L2 |
| Deloitte | Research Engineer — Post-Training & Small Language Models (SLMs), Healthcare AI | Consulting | 9 | L2 |
| Amazon | Senior Applied Scientist, Selling Partner Support Engagement | Big Tech | 9 | L4 |
| Amazon | Applied Scientist, Selling Partner Support Engagement | Big Tech | 9 | L4 |
| NVIDIA | Senior Quantum Applied Research Scientist, Calibration and Decoding | Semiconductors | 9 | L2 |
| OpenAI | Researcher: Agent Post-Training, API & Power-Users | AI Frontier | 9 | L2 |
| Apple | Machine Learning Engineer, Proactive | Big Tech | 9 | L2 |
| Anthropic | Research Scientist, Life Sciences | AI Frontier | 9 | L2 |
| ByteDance | Tech Lead / Principal Engineer, Creator Agent Algorithm Infrastructure | Big Tech | 9 | L4 |
| Snorkel AI | Research Scientist - Frontier Benchmarks | Data AI | 9 | L0 |
| Wayve | Machine Learning Engineer, App SW | Robotics | 9 | L6 |
| Autodesk | Principal AI Research Scientist Post-Training Alignment | Enterprise | 9 | L2 |
| Autodesk | Principal AI Research Scientist Post-Training · Alignment · Reinforcement Learning Autodesk AI Lab: London · San Francisco · Toronto · Remote (US/CA/EU | Enterprise | 9 | L2 |
| OpenAI | Software Engineer, RL Training Infra | AI Frontier | 9 | L2 |
| OpenAI | Researcher, Artifacts - Agent Post-Training | AI Frontier | 9 | L2 |
| Amazon | Sr. Applied Scientist, AWS Just-Walk-Out Science Team | Big Tech | 9 | L4 |
| Amazon | Applied Scientist II, AWS Just-Walk-Out Science Team | Big Tech | 9 | L4 |
| Autodesk | Research Lead / Principal Scientist & Manager Post-Training · Alignment · Reinforcement Learning Autodesk AI Lab: Toronto · Remote (CA) | Enterprise | 9 | L2 |
| Amazon | Applied Scientist - Agentic AI, Amazon Fulfillment Technology | Big Tech | 9 | L4 |
| OpenAI | AI Deployment Engineer - Startups | AI Frontier | 9 | L4 |
| Cognition | Research, Post-Training | Coding AI | 9 | L2 |
| OpenAI | Researcher, Alignment Training | AI Frontier | 9 | L2 |
Reward Modeling is a skill in the "ML Ops & Evaluation" category. It currently appears in 106 active AI roles across 32 companies in our index.
The top employers with active AI roles mentioning Reward Modeling are: Amazon (26), OpenAI (24), Anthropic (5), NVIDIA (4), Microsoft (4).
Over the last 12 weeks, 81 new AI postings mentioned Reward Modeling. Demand is rising: postings in the most recent four weeks were up 273% compared with the earliest four weeks of the window.
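For readers who want to check a figure like this themselves, here is a minimal sketch of one plausible way to compute it, assuming the comparison is between the summed postings of the first four and last four weeks. The page does not state the exact method, and the weekly counts below are hypothetical, chosen only so that they total 81 and reproduce the quoted percentage. The function name `window_growth_pct` is also an illustration, not part of any real API.

```python
def window_growth_pct(weekly_counts, window=4):
    """Percent change of the last `window` weeks vs. the first `window` weeks.

    Assumption: growth is measured on summed counts per window; the
    index may use a different definition (e.g. averages or medians).
    """
    if len(weekly_counts) < 2 * window:
        raise ValueError("need at least two non-overlapping windows")
    early = sum(weekly_counts[:window])   # baseline: earliest weeks
    late = sum(weekly_counts[-window:])   # comparison: most recent weeks
    if early == 0:
        raise ValueError("baseline window is zero; percent change is undefined")
    return (late - early) / early * 100


# HYPOTHETICAL weekly counts (not the index's real data): 12 weeks, 81 total.
hypothetical_weeks = [2, 3, 3, 3, 6, 7, 8, 8, 9, 10, 11, 11]

growth = window_growth_pct(hypothetical_weeks)
print(f"{sum(hypothetical_weeks)} postings, {growth:.0f}% growth")
# prints "81 postings, 273% growth"
```

With a small baseline (11 postings in this made-up series), a few dozen extra postings produce a very large percentage, so a figure like this is best read alongside the absolute counts.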
Roles requiring Reward Modeling are concentrated in: agents (38%), post-training (34%), application (14%). These stages follow a seven-stage AI lifecycle from data preparation through to shipped product.
Job postings that mention Reward Modeling most often also require: Machine Learning, Reinforcement Learning (RL), LLM Evaluation & Grading, Large Language Models (LLMs), Agentic Systems.