19 AI roles tagged rlhf.

| Company | Title | Sector | AI score | Other tags |
|---|---|---|---|---|
| xAI | Member of Technical Staff - Post-Training and RL | AI Frontier | 9 | RL post-training · Reward modeling · Fine-tuning |
| GE Healthcare | AI Algorithm and Development Software Engineer | Healthcare | 9 | Fine-tuning · Tool use · Agent orchestration · Agent research · Evals · LLM observability · Multimodal |
| NVIDIA | Senior Deep Learning Scientist, Multimodal Conversational AI | Semiconductors | 9 | Multimodal · Fine-tuning · Vision · Audio & speech · Embodied AI · Agent orchestration · Tool use |
| Zillow | Principal Applied Scientist, Agentic AI | Consumer | 9 | RL post-training · Reward modeling · Fine-tuning · Guardrails · Agent orchestration · Evals · Multimodal · Vector DB |
| Moveworks | Senior Machine Learning Engineer II, NLU & Agentic AI | Enterprise | 9 | Agent orchestration · Agent research · Fine-tuning · Evals · Multimodal · Model serving · LLM observability |
| Moveworks | Senior Machine Learning Engineer II, NLU & Agentic AI | Enterprise | 9 | Agent orchestration · Agent research · Fine-tuning · Evals · Multimodal · Model serving · LLM observability |
| ServiceNow | Staff Machine Learning Engineer, Agentic AI Systems - Moveworks | Enterprise | 8 | Agent orchestration · Tool use · Evals · Fine-tuning · Model serving · Agent research · LLM observability · Multimodal |
| Canva | Senior Machine Learning Engineer - Multimodal Data | Enterprise | 8 | Multimodal · Agent orchestration · Fine-tuning · Synthetic data · LLM observability |
| Plaid | Staff Software Engineer - AI Applications | Fintech | 8 | Agent orchestration · RAG · Vector DB · Fine-tuning · LLM observability · Agent research |
| xAI | Model Behavior Tutor - Style, Taste & Aesthetics | AI Frontier | 7 | Fine-tuning |
| xAI | Model Behavior Tutor - Wit & Conversation | AI Frontier | 7 | Evals · Fine-tuning |
| Whatnot | Software Engineer, Trust & Risk | Consumer | 7 | Agent orchestration · Evals · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Inference infra · Model serving · Recommender systems · Search & ranking · Vision · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · Reward modeling · RL robotics · Embodied AI |
| Anthropic | Data Operations Manager | AI Frontier | 7 | Agent orchestration · Tool use · Synthetic data |
| Scale AI | Forward Deployed Engineer, GenAI | Data AI | 7 | |
| Scale AI | Senior Software Engineer, GenAI | Data AI | 7 | |
| Cohere | Data Annotation Specialist, Modern Standard Arabic (MSA) | AI Frontier | 5 | Reward modeling |
| Mistral AI | Data Annotation Quality Specialist | AI Frontier | 5 | Synthetic data |
| AT&T | Full-Stack Software Engineer | Telecom | 5 | Agent orchestration · Tool use · LLM observability · RAG · Vector DB · Fine-tuning · Model serving · Recommender systems · Search & ranking · Vision · Multimodal · Audio & speech · Frontier research · Interpretability · Synthetic data · Agent research · RL post-training · Reward modeling · RL robotics · Embodied AI · Code gen |
| Cohere | Data Annotation Specialist, Simplified Chinese / Mandarin | AI Frontier | 5 | Synthetic data · Reward modeling · Evals |