AI Hire Signal

Tracking AI hiring across 200+ US tech companies. Stage, salary, and stack signals on every role — refreshed weekly.

© 2026 AI Hire Signal · Not affiliated with companies shown

Currently tracking 82 active AI roles, up 61% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $139k–$393k (avg $256k).

Hiring: 82 / 85
Momentum (4w): ↑ +14 (+61%) · 37 opens in the last 4 weeks vs 23 in the prior 4 weeks
Salary range: $139k–$393k · avg $256k · USD, disclosed roles only
Tracked since: Jun '23 · last role posted yesterday
Hiring velocity (new roles per week):
Jun 5: 1 · 12: 1
Aug 14: 1
Feb 12: 1
Apr 29: 2
Sep 9: 1
Oct 7: 1 · 21: 1
Nov 11: 1
Jan 6: 2 · 13: 1 · 27: 1
Feb 10: 2
Mar 3: 2 · 10: 1
Apr 21: 1
May 12: 1 · 26: 1
Jun 2: 1
Jul 21: 1 · 28: 1
Aug 4: 3 · 11: 2 · 18: 3
Sep 1: 2 · 8: 1 · 15: 1 · 22: 1 · 29: 1
Oct 6: 2 · 13: 5 · 27: 9
Nov 3: 1 · 10: 2 · 17: 5
Dec 1: 5 · 8: 1 · 22: 1 · 29: 2
Jan 5: 1 · 12: 7 · 19: 4 · 26: 7
Feb 2: 5 · 9: 5 · 16: 7 · 23: 2
Mar 2: 8 · 9: 5 · 16: 6 · 23: 12 · 30: 3
Apr 6: 5 · 13: 3 · 20: 5 · 27: 11
May 4: 15 · 11: 6
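The momentum figures can be checked against the weekly chart data. A minimal Python sketch, assuming each tooltip count pairs with the week label that follows it, so the last eight bars (Mar 23 through May 11) carry the counts 12, 3, 5, 3, 5, 11, 15, 6:

```python
# Weekly new-role counts for the most recent eight weeks of the
# hiring-velocity chart (Mar 23 .. May 11), oldest first.
weekly = [12, 3, 5, 3, 5, 11, 15, 6]

last_4 = sum(weekly[-4:])     # opens in the last 4 weeks
prior_4 = sum(weekly[-8:-4])  # opens in the prior 4 weeks
delta = last_4 - prior_4
pct = round(100 * delta / prior_4)

print(last_4, prior_4, delta, pct)  # 37 23 14 61
```

This reproduces the headline stats: 37 opens in the last 4 weeks, 23 in the prior 4, a delta of +14, and +61% momentum.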
Scale AI
Data AI · Data labeling
HQ: San Francisco, US
Founded: 2016
Size: 1,500+
Website: scale.com
Products: Scale Data Engine
Competitors: Databricks (Data AI)

Jobs (13)

82 AI · 179 total active
Filter applied: Stage = Eval Gate
Show: Active only · AI only (score ≥ 7)
Stage: All · Data 12 · Pretrain 1 · Post-train 11 · Serve 9 · Agent 38 · Eval Gate 13 · Ship 26
Function: All · Engineering 79 · Product 69 · Research 31
Country: All · United States 124 · United Kingdom 18 · Mexico 8 · Qatar 6 · Argentina 2 · India 2 · Saudi Arabia 2 · Hungary 1
Sort: AI score · Recent · Title
Title · Stage · Function · Location · First seen · AI score
Research Scientist, Frontier Risk Evaluations
Research Scientist role focused on designing and building evaluation measures, harnesses, and datasets for frontier AI systems, with a focus on identifying and mitigating risks. The role involves collaboration with external agencies and publishing findings, bridging AI research and policy.
Eval Gate · Agent · Research · San Francisco, CA · 7w ago · AI score 9
Research Scientist, AI Controls and Monitoring
Research Scientist role focused on designing methods, systems, and experiments for AI controls and monitoring, ensuring advanced AI models and agents remain aligned with intended goals, even in high-stakes or adversarial environments. This includes developing monitoring techniques, researching layered control mechanisms, designing red-team simulations, and collaborating with policymakers.
Eval Gate · Post-train · Research · San Francisco, CA · 8w ago · AI score 9
Staff Machine Learning Research Scientist, LLM Evals
Scale AI is seeking a Staff Machine Learning Research Scientist to lead the development of novel evaluation methodologies, metrics, and benchmarks for large language models (LLMs). This role focuses on defining and measuring the capabilities and limitations of frontier LLMs, driving research that informs internal roadmaps and the broader community. Responsibilities include researching existing evaluation techniques, designing new benchmarks, implementing scalable evaluation pipelines, publishing findings, and mentoring junior researchers. The ideal candidate has 5+ years of experience in LLMs/NLP, a strong publication record, and experience leading research teams.
Eval Gate · Post-train · Research · San Francisco, CA · Nov '25 · AI score 9
Tech Lead/Manager, Machine Learning Research Scientist - LLM Evals
Scale AI is seeking a Tech Lead/Manager for their LLM Evals Research team. This role involves leading a team to develop and implement novel evaluation methodologies, metrics, and benchmarks for large language models, focusing on areas like instruction following, factuality, robustness, and fairness. The position requires research into LLM evaluation techniques, communication with clients and internal teams, implementation of scalable evaluation pipelines, and publishing research findings. The ideal candidate has extensive experience in LLMs, NLP, and Transformer modeling, with a proven track record of research impact and team leadership.
Eval Gate · Post-train · Research · San Francisco, CA · Aug '23 · AI score 9
SWE Fellow - Human Frontier Collective (Canada)
This role focuses on evaluating and interpreting advanced generative AI systems, designing datasets for rigorous evaluation, and contributing to research publications. It involves collaborating with AI labs and platforms to improve model accuracy and reasoning, placing it squarely at the evaluation-gate stage.
Eval Gate · Research · Remote · 2w ago · AI score 8
SWE Fellow - Human Frontier Collective (UK)
This role is for a Software Engineer Fellow focused on evaluating and interpreting advanced generative AI systems within the Human Frontier Collective program. The fellow will design datasets, evaluate AI models, and contribute to research publications, aiming to enhance AI accuracy and reasoning.
Eval Gate · Research · Remote · 2w ago · AI score 8
SWE Fellow - Human Frontier Collective (US)
This role focuses on evaluating advanced generative AI systems by designing domain-specific problems and datasets, providing expert insights to enhance model performance, and contributing to research publications. It's a research-oriented role within a fellowship program aimed at shaping the future of AI.
Eval Gate · Research · Remote · 2w ago · AI score 8
Senior Machine Learning Engineer - Model Evaluations, Public Sector
This role focuses on building and scaling automated evaluation pipelines for AI systems, including LLMs and agentic models, to ensure their reliability, safety, and effectiveness in mission-critical government environments. It involves designing test datasets, benchmarks, and frameworks for various metrics, including LLM-judge evaluations, agent testing, and stress tests.
Eval Gate · Agent · Engineering · Washington, DC · Nov '25 · AI score 8
STEM Fellow - Human Frontier Collective (UK)
This role focuses on evaluating and interpreting advanced generative AI systems by designing domain-specific problems and datasets, providing expert insights to enhance model performance, and contributing to research publications. It's a research-oriented fellowship with a focus on AI evaluation.
Eval Gate · Research · Remote · Oct '25 · AI score 8
STEM Fellow - Human Frontier Collective (US)
This role focuses on evaluating and interpreting advanced generative AI systems by designing domain-specific problems and datasets, providing expert insights to enhance model performance, and contributing to research publications. It involves collaboration with AI labs and Scale's research team.
Eval Gate · Post-train · Research · Remote · Jun '25 · AI score 8
Product Manager, Public Sector GenAI Test & Evaluation (T&E)
Product Manager for GenAI Test & Evaluation (T&E) in the Public Sector team at Scale AI. This role focuses on defining the vision and roadmap for evaluation capabilities, owning the T&E tech stack to measure and improve agentic applications. Requires strong engineering depth, experience with evaluation systems, problem distillation, ambiguity management, cross-functional leadership, and operational execution. Experience with GenAI implementation, public sector work, and security clearance are preferred.
Eval Gate · Product · Washington, DC · 3w ago · AI score 7
Medical Fellow - Human Frontier Collective (UK)
This role involves a medical professional (MD, DO) collaborating with AI labs to evaluate generative AI systems, focusing on safety, accuracy, and reasoning frameworks within healthcare. The goal is to apply clinical expertise to shape AI decision-making and contribute to research publications. The role is a 6-month independent contractor position with potential for extension.
Eval Gate · Post-train · Research · Remote · Oct '25 · AI score 7
Medical Fellow - Human Frontier Collective (US)
This role is for a Medical Fellow to collaborate on high-impact projects evaluating and interpreting advanced generative AI systems in healthcare. The fellow will design clinical scenarios, evaluate model safety and accuracy, shape reasoning frameworks, and contribute to research publications. Prior AI experience is not required but is a plus.
Eval Gate · Research · Remote · Jul '25 · AI score 7