16 AI roles tagged RLHF.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Apple | Senior Applied Researcher | Big Tech | 9 | RAG · Fine-tuning · Evals · Guardrails · Agent orchestration · Multimodal |
| Apple | Machine Learning Engineer | Big Tech | 9 | Evals · LLM observability · Guardrails · Fine-tuning · Model serving · RAG · Vector DB · Multimodal · RL post-training · Reward modeling |
| Amazon | Sr. Applied Science Manager, Agentic AI Ads, Sponsored Products and Brands | Big Tech | 9 | Agent orchestration · Agent research · Tool use · RAG · Evals · LLM observability |
| ByteDance | Senior Research Scientist, Intelligent Editing (Multimodality) | Big Tech | 9 | Multimodal · Vision · Fine-tuning |
| Amazon | Senior Applied Scientist, Sponsored Products and Brands | Big Tech | 8 | Fine-tuning · Recommender systems · Search & ranking · RAG |
| Amazon | Senior Deep Learning Architect, Generative AI Innovation Center | Big Tech | 8 | Model serving · Fine-tuning · RAG |
| Amazon | Applied Scientist, Sponsored Products Off-Search Homepage Team | Big Tech | 8 | Recommender systems · Search & ranking · Model serving · Fine-tuning · RAG |
| Amazon | Applied Scientist II, Alexa International Team | Big Tech | 8 | LLM observability · Fine-tuning · Reward modeling · Evals · Multimodal · Audio & speech · Frontier research |
| Amazon | AI Principal Product Manager-Technical, Alexa Responsible AI | Big Tech | 8 | Evals · Guardrails · Reward modeling · LLM observability |
| Microsoft | Applied Scientist II / Senior Applied Scientist - Responsible AI (CoreAI) | Big Tech | 8 | Fine-tuning · Evals · Guardrails · Agent orchestration · Agent research |
| Amazon | Principal Applied Scientist, Sponsored Products and Brands | Big Tech | 8 | Fine-tuning · Recommender systems · Search & ranking · Model serving · RAG |
| | Staff Software Engineer, Agentic Data and Evals | Big Tech | 7 | Evals · Fine-tuning · Synthetic data · Model serving |
| Apple | Senior Manager - AI Data Operations | Big Tech | 7 | Evals · Synthetic data |
| Meta | Research Data Program Manager | Big Tech | 7 | Evals · Multimodal · Agent orchestration |
| Meta | Research Data Program Manager, MSL | Big Tech | 7 | Multimodal · Evals · Guardrails · Synthetic data |
| Apple | Senior Manager of Quality Assurance, AIML Data Operations | Big Tech | 5 | |
Reinforcement Learning from Human Feedback: training a reward model on human preferences, then optimizing the LLM against it. The original recipe behind ChatGPT-style helpfulness tuning.
Primary AI lifecycle stage: post-training.
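The reward-model step in the definition above can be illustrated with a small sketch. It assumes one common formulation, a Bradley-Terry style pairwise loss, in which the reward model is penalized when it scores the human-preferred response below the rejected one. The function names and the scalar reward inputs are hypothetical and chosen only for illustration. The second stage, optimizing the LLM against the learned reward, is not shown.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Hypothetical helper: the inputs stand in for the scalar scores a reward
    model would assign to the human-preferred and the rejected response.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) is the same as softplus(-margin)
    return softplus(-margin)


# The loss shrinks as the preferred response is scored further above the other.
agrees_with_humans = preference_loss(2.0, 0.0)
disagrees_with_humans = preference_loss(0.0, 2.0)
```

In a real training setup this loss would be averaged over a batch of labeled comparison pairs and minimized by gradient descent on the reward model's parameters.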
As of today, 43 active AI roles across 28 companies in our index reference RLHF. Hiring concentrates at the post-training (37%) and agents (26%) stages. Most common sectors: Big Tech, Consumer, AI Frontier.
The companies with the most active RLHF listings are: Capital One (5 roles), Amazon (4 roles), Apple (3 roles), Cohere (3 roles), Pinterest (3 roles).