RLHF

Reinforcement Learning from Human Feedback: training a reward model on human preferences, then optimizing the LLM against it. The original recipe behind ChatGPT-style helpfulness tuning.
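The first step in that definition, fitting a reward model to human preferences, is often framed as a pairwise comparison: the model should score the response a human preferred above the one they rejected. The sketch below shows one common form of that objective (a Bradley-Terry style logistic loss). The function name and the scalar-reward setup are illustrative assumptions, not something this page specifies, and the second step (optimizing the LLM against the learned reward) is not shown.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Hypothetical helper for illustration. The loss is small when the reward
    model scores the human-preferred response well above the rejected one.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# A reward model that agrees with the human label incurs a lower loss
# than one that disagrees.
agrees = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
```

In practice the two rewards would come from a learned model scoring full responses, and the loss would be averaged over many labeled comparison pairs.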

Primary AI lifecycle stage: post-training.

As of today, 43 active AI roles across 28 companies in our index reference RLHF. Hiring concentrates at the post-training (37%) and agents (26%) stages. Most common sectors: Big Tech, Consumer, AI Frontier.

Top hiring: Capital One (5 roles), Amazon (4 roles), Apple (3 roles), Cohere (3 roles), Pinterest (3 roles).

Roles by sector: Big Tech (16), AI Frontier (13), Enterprise (12), Consumer (8), Banking (7), Telecom (5), Data AI (4), Fintech (2), Vertical AI (1), Semiconductors (1), Retail (1), Pharma (1), Media (1), Healthcare (1), Coding AI (1).

Roles by function: Engineering (38), Research (29), Product (7).

Filtered by sector: AI Frontier

13 AI roles tagged rlhf.

Company	Title	Sector	AI score	Other tags
Anthropic	Anthropic AI Safety Fellow, UK	AI Frontier	10	Frontier research · Interpretability · Evals · Guardrails
xAI	Member of Technical Staff - Post-Training and RL	AI Frontier	9	RL post-training · Reward modeling · Fine-tuning
Anthropic	Anthropic Fellows Program — Reinforcement Learning	AI Frontier	9	RL post-training
Character AI	Research Engineer, Multimodal	AI Frontier	9	Fine-tuning · Multimodal · Vision · Audio & speech · Model serving · Inference infra · Synthetic data
Cohere	Research Engineer	AI Frontier	9	Frontier research · Fine-tuning · Evals · Model serving · Agent orchestration
Anthropic	Research Manager, Production Model Training	AI Frontier	9	Fine-tuning · Evals
Anthropic	Data Operations Manager, Human Data	AI Frontier	8	Synthetic data · Evals
xAI	Model Behavior Tutor - Style, Taste & Aesthetics	AI Frontier	7	Fine-tuning
xAI	Model Behavior Tutor - Wit & Conversation	AI Frontier	7	Evals · Fine-tuning
Anthropic	Data Operations Manager	AI Frontier	7	Agent orchestration · Tool use · Synthetic data
Cohere	Data Annotation Specialist, Modern Standard Arabic (MSA)	AI Frontier	5	Reward modeling
Mistral AI	Data Annotation Quality Specialist	AI Frontier	5	Synthetic data
Cohere	Data Annotation Specialist, Simplified Chinese / Mandarin	AI Frontier	5	Synthetic data · Reward modeling · Evals

Frequently asked questions

What is RLHF in AI?

Reinforcement Learning from Human Feedback: training a reward model on human preferences, then optimizing the LLM against it. The original recipe behind ChatGPT-style helpfulness tuning. Primary AI lifecycle stage: post-training.

How many AI roles reference RLHF right now?

43 active AI roles across 28 companies in our index reference RLHF as of today.

Which companies are hiring for RLHF roles?

The companies with the most active RLHF listings are: Capital One (5 roles), Amazon (4 roles), Apple (3 roles), Cohere (3 roles), Pinterest (3 roles).

What AI lifecycle stage does RLHF belong to?

RLHF primarily belongs to the post-training stage of the AI lifecycle. In current hiring, RLHF roles concentrate at: post-training (37%), agents (26%).

What sectors invest most in RLHF?

The sectors with the most active RLHF hiring are: Big Tech, Consumer, AI Frontier.