Evals

Function: Engineering 1466 · Research 384 · Product 247

2097 AI roles tagged evals.

Company	Title	Sector	AI score	Other tags
Datadog	Senior Software Engineer (AI)	Enterprise	7	Model serving · Inference infra · Guardrails · Agent orchestration · Tool use · RAG · LLM observability
Amazon	Research Engineer, AWS Agentic AI	Big Tech	7	Agent orchestration · RAG · Model serving · Inference infra
OpenAI	Software Engineer, Integrity Foundations - London	AI Frontier	7	Agent orchestration · Guardrails · LLM observability
Uber	Sr. ML Engineer	Consumer	7	RAG · Recommender systems · Search & ranking · LLM observability
Pinterest	Director of Data Science, Trust & Safety, Signals and Content Understanding	Consumer	7	Recommender systems · Search & ranking · Guardrails · LLM observability · RAG · Vector DB · Fine-tuning · Model serving · Multimodal
Weights & Biases	Senior Software Engineer II, Applied Training	Data AI	7	Inference infra · Model serving · Agent orchestration
Weights & Biases	Staff Software Engineer, Applied Training	Data AI	7	Inference infra · Model serving · Agent orchestration · Agent research
Anthropic	Head of Solutions Architects, Applied AI (Korea)	AI Frontier	7	LLM observability · Model serving
Apple	Software Engineer, Machine Learning & AI	Big Tech	7	Agent orchestration · RAG · Fine-tuning · Inference infra · Model serving
Anthropic	Applied AI Architect	AI Frontier	7	LLM observability · Agent orchestration · Model serving
Anthropic	Applied AI Architect, Federal Civilian	AI Frontier	7	LLM observability · Model serving · RAG · Agent orchestration
Meta	Research Data Program Manager, MSL	Big Tech	7	RLHF · Multimodal · Guardrails · Synthetic data
Microsoft	Research Intern - Multi-Modal Sensing & Secure AI Devices	Big Tech	7	Multimodal · Vision · Audio & speech · Agent research
Anthropic	Software Engineer, Safeguards Infrastructure	AI Frontier	7	LLM observability · Guardrails · Agent orchestration · Tool use
Wiz	Application Security Product Analyst	Enterprise	7	Agent orchestration
Amazon	Language Engineer II, Operations Team	Big Tech	7	LLM observability · Fine-tuning
Microsoft	Engineering Lead - Extreme Retrieval	Big Tech	7	RAG · Recommender systems · Search & ranking · Model serving
SoFi	Senior Staff Software Engineer, Agentic Test Platform	Fintech	7	Agent orchestration · RAG · Vector DB · Inference infra · Model serving
JPMorgan Chase	Product Director - Fraud	Banking	7	Agent orchestration · LLM observability · RAG · Vector DB · Fine-tuning · Model serving · Guardrails
Figma	Software Engineer, Code Platform	Enterprise	7	Agent orchestration · Code gen · Fine-tuning · Inference infra · Model serving
Capital One	Manager, Data Scientist - Model Risk Office	Banking	7	LLM observability · Vector DB · RAG · Fine-tuning
Anthropic	Applied AI Architect, Enterprise Tech	AI Frontier	7	LLM observability · Model serving · RAG · Agent orchestration
Block	Growth Marketing Manager, AI Creative & Content Systems	Fintech	7	Agent orchestration · Tool use
Scale AI	GenAI Strategic Projects Lead, Public Sector	Data AI	7	Fine-tuning · RL post-training · LLM observability
Replit	Field Engineer	Enterprise	7	Agent orchestration · Guardrails · Model serving · RAG
LangChain	Senior Backend Software Engineer, AI Observability & Evals Platform (LangSmith)	Data AI	7	LLM observability · Model serving · Inference infra
xAI	Model Behavior Tutor - Social Cognition & EQ	AI Frontier	7	RL post-training · Synthetic data
xAI	Model Behavior Tutor - Wit & Conversation	AI Frontier	7	RLHF · Fine-tuning
Microsoft	Senior Software Engineer - Copilot CLI	Big Tech	7	Agent orchestration · Tool use
Baseten	Software Engineer, Model Performance Systems	Data AI	7	Inference infra · Model serving