35 AI roles tagged rlhf.

| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Cognition | Research, Post-Training | Coding AI | 9 | RL post-training · Reward modeling · Evals · Agent research · Agent orchestration |
| xAI | Member of Technical Staff - Post-Training and RL | AI Frontier | 9 | RL post-training · Reward modeling · Fine-tuning |
| GE Healthcare | AI Algorithm and Development Software Engineer | Healthcare | 9 | Fine-tuning · Tool use · Agent orchestration · Agent research · Evals · LLM observability · Multimodal |
| Anthropic | Anthropic Fellows Program — Reinforcement Learning | AI Frontier | 9 | RL post-training |
| Weights & Biases | VP of Product, Research and Training Infrastructure | Data AI | 9 | Frontier research · Pretraining · RL post-training · Inference infra · Model serving |
| Zillow | Principal Applied Scientist, Agentic AI | Consumer | 9 | RL post-training · Reward modeling · Fine-tuning · Guardrails · Agent orchestration · Evals · Multimodal · Vector DB |
| Character AI | Research Engineer, Multimodal | AI Frontier | 9 | Fine-tuning · Multimodal · Vision · Audio & speech · Model serving · Inference infra · Synthetic data |
| Capital One | Applied Researcher I (AI Foundations) | Banking | 9 | Pretraining · Fine-tuning · Frontier research · Vector DB |
| Capital One | Applied Researcher II | Banking | 9 | Fine-tuning · Frontier research · Vector DB · Pretraining |
| Canva | Senior Research Scientist - Reinforcement Learning, MoEs | Enterprise | 9 | RL post-training · Reward modeling · Agent orchestration · Tool use · Multimodal · Model serving · Frontier research · Evals |
| Cohere | Research Engineer | AI Frontier | 9 | Frontier research · Fine-tuning · Evals · Model serving · Agent orchestration |
| Datadog | AI Research Engineer - Datadog AI Research (DAIR) | Enterprise | 9 | Multimodal · Frontier research · RL post-training · Agent orchestration · Model serving · Inference infra · Evals · Synthetic data |
| Moveworks | Senior Machine Learning Engineer II, NLU & Agentic AI | Enterprise | 9 | Agent orchestration · Agent research · Fine-tuning · Evals · Multimodal · Model serving · LLM observability |
| Capital One | Applied Researcher I | Banking | 8 | Fine-tuning · Frontier research · Vector DB |
| Canva | Senior Machine Learning Engineer - Multimodal Data | Enterprise | 8 | Multimodal · Agent orchestration · Fine-tuning · Synthetic data · LLM observability |
| | Staff Product Manager, AI Safety | Consumer | 8 | Evals · Guardrails · LLM observability · Multimodal · Agent research |
| Plaid | Staff Software Engineer - AI Applications | Fintech | 8 | Agent orchestration · RAG · Vector DB · Fine-tuning · LLM observability · Agent research |
| Capital One | Applied Researcher I | Banking | 8 | Fine-tuning · Frontier research · Interpretability · Vector DB · Recommender systems · Model serving |
| Capital One | Applied Researcher II (AI Foundations) | Banking | 8 | Pretraining · Fine-tuning · Vector DB |
| Capital One | Applied Researcher I (AI Foundations) | Banking | 8 | Pretraining · Fine-tuning · Vector DB · Frontier research · Interpretability |
| Walmart | Senior, Data Scientist | Retail | 8 | Evals · Vision · Multimodal · Fine-tuning · Reward modeling |
| Handshake | AI Tutor, Electrochemistry & Functional Materials Specialist (contract), Handshake AI | Enterprise | 7 | Evals · Guardrails |
| xAI | Model Behavior Tutor - Style, Taste & Aesthetics | AI Frontier | 7 | Fine-tuning |
| xAI | Model Behavior Tutor - Wit & Conversation | AI Frontier | 7 | Evals · Fine-tuning |
| Whatnot | Software Engineer, Trust & Risk | Consumer | 7 | Agent orchestration · Evals · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving · Recommender systems · Search & ranking · Vision · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · Reward modeling · RL robotics · Embodied AI |
| Scale AI | Forward Deployed Engineer, GenAI | Data AI | 7 | |
| Scale AI | Senior Software Engineer, GenAI | Data AI | 7 | |
| Cohere | Data Annotation Specialist, Modern Standard Arabic (MSA) | AI Frontier | 5 | Reward modeling |
| Mistral AI | Data Annotation Quality Specialist | AI Frontier | 5 | Synthetic data |