AI Hire Signal

Tracking AI hiring across 200+ US tech companies. Stage, salary, and stack signals on every role — refreshed weekly.

© 2026 AI Hire Signal · Not affiliated with companies shown

Currently tracking 995 active AI roles, up 64% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $65k–$465k (avg $196k).

Hiring: 995 / 995 active
Momentum (4w): +403 (+64%) · 1033 opens in the last 4 weeks vs 630 in the prior 4
Salary: $65k–$465k, avg $196k (USD, disclosed roles only)
Tracked since: Oct '24 · last role seen today
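The momentum figure is plain window-over-window arithmetic. A minimal sketch in Python, using the four most recent weekly counts from the velocity chart against the four weeks before them (the variable names are illustrative, not from the site):

```python
# Momentum over trailing 4-week windows, as reported in the summary cards.
# Weekly new-role counts taken from the velocity chart.
last_4w = [251, 237, 304, 241]   # four most recent weeks
prior_4w = [136, 136, 164, 194]  # the four weeks before those

opens_last = sum(last_4w)                  # 1033 opens last 4w
opens_prior = sum(prior_4w)                # 630 opens prior 4w
delta = opens_last - opens_prior           # +403
pct = round(100 * delta / opens_prior)     # +64 (%)
print(f"+{delta} ({pct:+d}%)")             # prints "+403 (+64%)"
```

The percentage is computed against the prior window, so a flat week count would read as 0%, and doubling would read as +100%.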
Hiring velocity by week, oldest first. Format: day of month (new roles that week). Weeks from Oct '24 through May '26:

Oct '24: 7 (2)
Feb '25: 3 (1)
Mar '25: 10 (1) · 17 (1) · 24 (1) · 31 (2)
Apr '25: 14 (1) · 28 (4)
May '25: 12 (2) · 19 (1) · 26 (1)
Jun '25: 2 (3) · 9 (1) · 16 (4) · 23 (2) · 30 (2)
Jul '25: 14 (2) · 21 (12) · 28 (3)
Aug '25: 4 (4) · 11 (5) · 18 (2) · 25 (3)
Sep '25: 1 (11) · 8 (4) · 15 (9) · 22 (4) · 29 (8)
Oct '25: 6 (7) · 13 (9) · 20 (8) · 27 (14)
Nov '25: 3 (13) · 10 (20) · 17 (14) · 24 (20)
Dec '25: 1 (21) · 8 (14) · 15 (19) · 22 (12) · 29 (8)
Jan '26: 5 (29) · 12 (22) · 19 (25) · 26 (67)
Feb '26: 2 (64) · 9 (71) · 16 (52) · 23 (80)
Mar '26: 2 (110) · 9 (135) · 16 (129) · 23 (136) · 30 (136)
Apr '26: 6 (164) · 13 (194) · 20 (251) · 27 (237)
May '26: 4 (304) · 11 (241)

Jobs (20 shown)

995 AI · 2722 total active
Active filters: Stage = Eval Gate · Function = Engineering
Show: Active only · AI only (score ≥ 7)
Stage: All · Data 53 · Pretrain 9 · Post-train 93 · Serve 124 · Agent 437 · Eval Gate 25 · Ship 254
Function: All · Engineering 778 · Research 175 · Product 42
Country: All · United States 653 · Canada 48 · United Kingdom 18 · India 17 · Spain 13 · Australia 11 · Romania 7 · Belgium 6 · Germany 6 · Poland 6 · Taiwan 6 · China 5 · Japan 5 · Singapore 5 · Brazil 4 · Mexico 4 · France 3 · Netherlands 3 · Switzerland 3 · Philippines 2 · Vietnam 2 · Egypt 1 · Estonia 1 · Italy 1 · South Korea 1 · Sweden 1 · Thailand 1
Sort: AI score · Recent · Title

Columns: Title · Stage · Function · Location · First seen · AI score
Applied Science Manager, Sponsored Products and Brands
Manager for a Continuous Model Evaluation and Learning workstream within Amazon Ads' Sponsored Products and Brands team. The role involves leading a team of applied scientists and engineers to build and ship an evaluation and remediation framework for an agentic brand-intelligence system. This includes designing evaluation metrics, developing optimization engines for prompts and synthetic data, and ensuring offline-to-online consistency for quality improvements. The goal is to enable autonomous detect-diagnose-remediate loops to scale quality across brand skills.
Eval Gate · Agent · Engineering · NY +1 · 1w ago · score 8
Data Scientist, AWS Quick Data
The Data Scientist will focus on developing evaluation and benchmarking datasets for generative AI capabilities within the Amazon Quick Suite enterprise AI platform. This includes leveraging LLMs for synthetic data generation, creating ground truth datasets, leading human annotation initiatives, and contributing to Responsible AI efforts to ensure enterprise-readiness, safety, and effectiveness of AI at scale.
Eval Gate · Data · Engineering · Santa Clara, CA · 1w ago · score 8
Data Scientist, AWS Quick Data
The Data Scientist II will focus on developing evaluation and benchmarking datasets for enterprise AI features, specifically for Amazon Quick Suite. This involves leveraging Generative AI techniques, LLMs for synthetic data generation, and LLM-as-a-judge settings to assess model performance, ensure data quality, and contribute to Responsible AI initiatives. The role also includes building scalable data pipelines and tools for continuous evaluation.
Eval Gate · Data · Engineering · Santa Clara, CA · 4w ago · score 8
Data Scientist, AWS Quick Data
The Data Scientist will focus on developing evaluation and benchmarking datasets for generative AI capabilities within the Amazon Quick Suite enterprise AI platform. This includes leveraging LLMs for synthetic data generation, creating ground truth datasets, leading human annotation initiatives, and contributing to Responsible AI efforts to ensure enterprise-readiness, safety, and effectiveness of AI at scale.
Eval Gate · Data · Engineering · Santa Clara, CA · 6w ago · score 8
Sr. Software Development Engineer, Automated Reasoning Group
Senior Software Development Engineer role focused on applying Automated Reasoning to verify Generative AI outputs, specifically addressing hallucinations within AWS services. The role involves designing and building new services and capabilities at scale, contributing to the evolution of the Automated Reasoning Checks (ARc) service, and making automated reasoning more accessible within AWS.
Eval Gate · Engineering · NY +1 · 2w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 3w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 3w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 3w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 3w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 5w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 5w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 5w ago · score 7
Senior Applied Scientist, Fauna
Senior Applied Scientist role focused on developing evaluation frameworks and data collection protocols for robotic capabilities. The role involves designing how to measure, stress-test, and improve robot behavior, building infrastructure for teleoperation, evaluation, and learning, and analyzing results to identify performance gaps. It requires expertise in robotics, ML, and human-in-the-loop systems, with a focus on turning capability goals into measurable evaluation systems.
Eval Gate · Agent · Engineering · NY +1 · 5w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 5w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 5w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
This role focuses on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. The primary responsibilities involve benchmarking AI models, evaluating audits performed by a core auditing team, improving audit consistency, and enforcing quality standards. The goal is to scale AI model evaluation coverage and ensure high-quality outcomes for sellers.
Eval Gate · Engineering · IN, TS, Hyderabad · 5w ago · score 7
AI Benchmarking Specialist, SP Support - German, International Seller Growth
This role focuses on evaluating AI systems, specifically LLMs, by designing and executing benchmarking and audit activities. The core responsibilities include assessing model quality, compliance, robustness, and fairness, as well as handling annotations for training and measuring AI models. The role also involves preparing audit reports and ensuring data quality.
Eval Gate · Data · Engineering · IN, KA, Bengaluru · 5w ago · score 7
AI Benchmarking Lead, Performance Benchmarking Evaluation
The AI Benchmarking Lead will focus on ensuring the quality and reliability of AI model evaluations for Amazon's Seller Assistant copilot. This role involves benchmarking AI models, evaluating audit processes, improving audit consistency, and enforcing quality standards to support the scaling of the product to a wider seller base.
Eval Gate · Engineering · IN, TS, Hyderabad · 6w ago · score 7
Software Development Manager, Agentic AI - AgentCore
This role is for a Software Development Manager on the Agentic AI organization's Evaluations & Optimization team at AWS. The manager will lead a team of engineers to build systems for assessing the quality, performance, and reliability of GenAI and agentic systems, as well as optimization solutions. The work involves deep learning, distributed systems, and evaluation science, focusing on building infrastructure and tooling for evaluation workflows.
Eval Gate · Agent · Engineering · Seattle, WA · 6w ago · score 7
Applied Science Manager, Artificial General Intelligence, Quality Automation
Applied Science Manager for AGI team focusing on quality automation, auditing, and evaluation of LLMs and multimodal systems. Leads a team of scientists to develop quality strategies, auditing frameworks, and research new methodologies to ensure data integrity and model performance. Manages team development, cross-functional communication, and drives research into data impact and utility measurement for AI models.
Eval Gate · Post-train · Engineering · Bellevue, WA · 8w ago · score 7