18 AI roles tagged reward_modeling.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Amazon | Applied Scientist, Conversational Assistant Modeling and Learning | Big Tech | 9 | Agent orchestration · Tool use · Evals · Fine-tuning · Inference infra · Model serving · Multimodal · RL post-training |
| Amazon | Senior Applied Scientist, Selling Partner Support Engagement | Big Tech | 9 | Agent orchestration · RL post-training · Agent research · LLM observability · Tool use · Evals |
| Amazon | Applied Scientist, Selling Partner Support Engagement | Big Tech | 9 | Agent orchestration · RL post-training · Evals · LLM observability · Model serving |
| | Research Scientist, Frontier Health, DeepMind | Big Tech | 9 | Agent orchestration · Multimodal · RL post-training · Evals · Tool use · Vision · Audio & speech |
| Apple | Senior Machine Learning Manager, Search & Knowledge Platform | Big Tech | 9 | Fine-tuning · RL post-training · LLM observability · Model serving · Inference infra · RAG |
| Apple | Machine Learning Engineer | Big Tech | 9 | Evals · LLM observability · Guardrails · Fine-tuning · Model serving · RAG · Vector DB · Multimodal · RL post-training · RLHF |
| | Staff Software Engineer, Generative AI, Core ML | Big Tech | 9 | Agent orchestration · Tool use · Evals · Fine-tuning · RL post-training · Synthetic data · Agent research · Multimodal |
| Amazon | Senior Applied Scientist, Alexa Sensitive Content Intelligence (ASCI) | Big Tech | 9 | LLM observability · Guardrails · Evals · Agent orchestration · Agent research · RL post-training |
| Meta | AI Research Scientist - Safety Alignment Team | Big Tech | 9 | RL post-training · Fine-tuning · Multimodal · LLM observability · Evals · Guardrails · Frontier research · Interpretability · Audio & speech · Vision |
| Amazon | Sr. Principal Scientist, Amazon Health Science & Analytics | Big Tech | 9 | Frontier research · Pretraining · Multimodal · RL post-training · RAG · Inference infra · Model serving · Evals |
| Amazon | Sr. Applied Scientist, C360 | Big Tech | 8 | Fine-tuning · RL post-training · Multimodal · Recommender systems · Frontier research · Evals |
| Amazon | Senior Applied Scientist, C360 | Big Tech | 8 | Fine-tuning · RL post-training · Frontier research · Recommender systems |
| Amazon | Software Development Engineer, Sponsored Products and Brands | Big Tech | 8 | Agent orchestration · Tool use · Fine-tuning · Inference infra · Model serving · RAG · LLM observability · Guardrails · RL post-training |
| Amazon | Applied Scientist II, Alexa International Team | Big Tech | 8 | LLM observability · Fine-tuning · RLHF · Evals · Multimodal · Audio & speech · Frontier research |
| Amazon | Senior Software Development Engineer, Stores Foundational AI - Rufus | Big Tech | 8 | Fine-tuning · RL post-training · Model serving · Inference infra · Agent orchestration · Tool use · LLM observability |
| Amazon | AI Principal Product Manager-Technical, Alexa Responsible AI | Big Tech | 8 | Evals · Guardrails · RLHF · LLM observability |
| Microsoft | Member of Technical Staff - Post Training - MAI Superintelligence Team | Big Tech | 8 | RL post-training · Fine-tuning · Evals |
| Netflix | Engineering Manager, Core Applications - AI for Member Systems | Big Tech | 7 | Recommender systems · Fine-tuning · RL post-training · LLM observability |
Learning a scalar reward function — often from human or AI preference data — that scores LLM outputs during reinforcement-learning fine-tuning.
Primary AI lifecycle stage: post-training.
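To make the definition above concrete, here is a minimal sketch of one common way a scalar reward function is learned from preference pairs: a Bradley-Terry style pairwise loss. This is an illustration, not something the listings specify. The linear reward over hand-made feature vectors and the names `reward`, `pairwise_preference_loss`, and `sgd_step` are all hypothetical; real systems typically score outputs with a neural network.

```python
import math


def reward(weights, features):
    """Hypothetical linear reward model: a scalar score for one output."""
    return sum(w * x for w, x in zip(weights, features))


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry negative log-likelihood that 'chosen' beats 'rejected'.

    Equals -log(sigmoid(margin)), computed stably as softplus(-margin).
    """
    margin = reward_chosen - reward_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def sgd_step(weights, chosen, rejected, lr=0.1):
    """One gradient-descent step on the pairwise loss for a single pair."""
    margin = reward(weights, chosen) - reward(weights, rejected)
    # Gradient of the loss w.r.t. the margin is -(1 - sigmoid(margin)).
    scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
    return [w + lr * scale * (c - r) for w, c, r in zip(weights, chosen, rejected)]


# Assumed toy data: feature vectors for a preferred and a dispreferred output.
chosen, rejected = [1.0, 0.0], [0.0, 1.0]
weights = [0.0, 0.0]
loss_before = pairwise_preference_loss(reward(weights, chosen), reward(weights, rejected))
for _ in range(50):
    weights = sgd_step(weights, chosen, rejected)
loss_after = pairwise_preference_loss(reward(weights, chosen), reward(weights, rejected))
```

After training, `reward(weights, chosen)` exceeds `reward(weights, rejected)`, so the learned scalar can rank outputs. In reinforcement-learning fine-tuning, a score of this kind would serve as the reward signal.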
As of today, 31 active AI roles across 19 companies in our index reference reward modeling. Hiring concentrates at the post-training (55%) and agents (26%) stages. The most common sectors are Big Tech, AI Frontier, and Enterprise. New postings rose 36% in the last 30 days compared with the prior 30 (11 → 15).
The companies with the most active reward modeling listings are Amazon (9 roles), Adobe (2 roles), Cohere (2 roles), Deloitte (2 roles), and OpenAI (2 roles).