40 AI roles tagged rlhf.
| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| Anthropic | Anthropic AI Safety Fellow, UK | AI Frontier | 10 | Frontier research · Interpretability · Evals · Guardrails |
| Cognition | Research, Post-Training | Coding AI | 9 | RL post-training · Reward modeling · Evals · Agent research · Agent orchestration |
| xAI | Member of Technical Staff - Post-Training and RL | AI Frontier | 9 | RL post-training · Reward modeling · Fine-tuning |
| GE Healthcare | AI Algorithm and Development Software Engineer | Healthcare | 9 | Fine-tuning · Tool use · Agent orchestration · Agent research · Evals · LLM observability · Multimodal |
| Anthropic | Anthropic Fellows Program — Reinforcement Learning | AI Frontier | 9 | RL post-training |
| NVIDIA | Senior Deep Learning Scientist, Multimodal Conversational AI | Semiconductors | 9 | Multimodal · Fine-tuning · Vision · Audio & speech · Embodied AI · Agent orchestration · Tool use |
| Weights & Biases | VP of Product, Research and Training Infrastructure | Data AI | 9 | Frontier research · Pretraining · RL post-training · Inference infra · Model serving |
| Zillow | Principal Applied Scientist, Agentic AI | Consumer | 9 | RL post-training · Reward modeling · Fine-tuning · Guardrails · Agent orchestration · Evals · Multimodal · Vector DB |
| Character AI | Research Engineer, Multimodal | AI Frontier | 9 | Fine-tuning · Multimodal · Vision · Audio & speech · Model serving · Inference infra · Synthetic data |
| Capital One | Applied Researcher I (AI Foundations) | Banking | 9 | Pretraining · Fine-tuning · Frontier research · Vector DB |
| Capital One | Applied Researcher II | Banking | 9 | Fine-tuning · Frontier research · Vector DB · Pretraining |
| Canva | Senior Research Scientist - Reinforcement Learning, MoEs | Enterprise | 9 | RL post-training · Reward modeling · Agent orchestration · Tool use · Multimodal · Model serving · Frontier research · Evals |
| Cohere | Research Engineer | AI Frontier | 9 | Frontier research · Fine-tuning · Evals · Model serving · Agent orchestration |
| Datadog | AI Research Engineer - Datadog AI Research (DAIR) | Enterprise | 9 | Multimodal · Frontier research · RL post-training · Agent orchestration · Model serving · Inference infra · Evals · Synthetic data |
| Anthropic | Research Manager, Production Model Training | AI Frontier | 9 | Fine-tuning · Evals |
| Moveworks | Senior Machine Learning Engineer II, NLU & Agentic AI | Enterprise | 9 | Agent orchestration · Agent research · Fine-tuning · Evals · Multimodal · Model serving · LLM observability |
| Moveworks | Senior Machine Learning Engineer II, NLU & Agentic AI | Enterprise | 9 | Agent orchestration · Agent research · Fine-tuning · Evals · Multimodal · Model serving · LLM observability |
| Capital One | Applied Researcher I | Banking | 8 | Fine-tuning · Frontier research · Vector DB |
| ServiceNow | Staff Machine Learning Engineer, Agentic AI Systems - Moveworks | Enterprise | 8 | Agent orchestration · Tool use · Evals · Fine-tuning · Model serving · Agent research · LLM observability · Multimodal |
| Canva | Senior Machine Learning Engineer - Multimodal Data | Enterprise | 8 | Multimodal · Agent orchestration · Fine-tuning · Synthetic data · LLM observability |
| | Staff Product Manager, AI Safety | Consumer | 8 | Evals · Guardrails · LLM observability · Multimodal · Agent research |
| Plaid | Staff Software Engineer - AI Applications | Fintech | 8 | Agent orchestration · RAG · Vector DB · Fine-tuning · LLM observability · Agent research |
| Capital One | Applied Researcher I | Banking | 8 | Fine-tuning · Frontier research · Interpretability · Vector DB · Recommender systems · Model serving |
| Capital One | Applied Researcher II (AI Foundations) | Banking | 8 | Pretraining · Fine-tuning · Vector DB |
| Capital One | Applied Researcher I (AI Foundations) | Banking | 8 | Pretraining · Fine-tuning · Vector DB · Frontier research · Interpretability |
| Walmart | Senior, Data Scientist | Retail | 8 | Evals · Vision · Multimodal · Fine-tuning · Reward modeling |
| Handshake | AI Tutor, Electrochemistry & Functional Materials Specialist (contract), Handshake AI | Enterprise | 7 | Evals · Guardrails |
| xAI | Model Behavior Tutor - Style, Taste & Aesthetics | AI Frontier | 7 | Fine-tuning |
| xAI | Model Behavior Tutor - Wit & Conversation | AI Frontier | 7 | Evals · Fine-tuning |
| Whatnot | Software Engineer, Trust & Risk | Consumer | 7 | Agent orchestration · Evals · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving · Recommender systems · Search & ranking · Vision · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · Reward modeling · RL robotics · Embodied AI |