Currently tracking 1110 active AI roles, down 16% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $65k–$465k (avg $194k).
Amazon has 1472 active AI-related job listings. The company is heavily focused on roles within the "agents" stage, which accounts for 38% of its AI hiring, followed by "application" at 26%. Engineering is the dominant function, with 1172 positions. Over the last 30 days, Amazon has added 667 new AI roles, representing a 74% increase compared to the previous 30-day period. Frequent tech tags include agent_orchestration, model_serving, and multimodal.
Amazon currently has 1573 active AI-related roles in our index. The most common open titles are: ML Data Associate-II (9); 2026 Applied Scientist Intern, Amazon University Talent Acquisition (8); AI Data Associate (Dutch), Artificial General Intelligence Data Services (8); Software Development Engineer, AWS (8); Senior Delivery Consultant - Data, Professional Services, AWSI HCLS (7). Most positions are in Engineering and Research.
Amazon's active AI hiring is concentrated in: agents (41%), application (26%), serving infrastructure (13%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Amazon is hiring AI talent in: United States (1023 roles), Canada (59 roles), United Kingdom (47 roles), India (23 roles).
Job postings at Amazon most frequently mention: Machine Learning, Generative AI, Large Language Models (LLMs), Software Engineering, Agentic Systems.
In the past 30 days, Amazon has posted 696 new AI-related roles.
| Title | Stage | AI score |
|---|---|---|
| Senior Applied Scientist, Amazon AWS Agentic AI, AWS AI Fundamental Research: This role focuses on leading the design and development of agentic evaluation frameworks and training evaluation/critic models to assess the quality and effectiveness of AI agents. The scientist will define methodologies, create benchmarks, build automated systems, and conduct research to advance agent and evaluation science. The role involves end-to-end ownership from research to production deployment, collaborating with engineering to deliver these capabilities as managed AWS services. It also includes mentoring junior scientists and contributing to the research community. | Eval Gate, Post-train | 9 |
| AI Language Engineer, Alexa for Shopping: AI Language Engineer for Amazon's Conversational Shopping team, focusing on developing and implementing LLM-assisted evaluation tools and processes to improve AI-driven shopping experiences. The role involves creating automated verification scripts, annotation guidelines, and quality metrics, collaborating with cross-functional teams to ensure high-quality editorial data and product outcomes. | Eval Gate, Data | 8 |
| Senior Applied Scientist, Fauna: This role focuses on developing evaluation frameworks and data collection protocols for robotic capabilities, operating at the intersection of robotics, machine learning, and human-in-the-loop systems. The scientist will design how to measure, stress-test, and improve robot behavior, build infrastructure connecting teleoperation, evaluation, and learning, and lead technical projects. | Eval Gate, Agent | 8 |
| Senior Applied Scientist, Fauna: This role focuses on developing evaluation frameworks and data collection protocols for robotic capabilities, operating at the intersection of robotics, machine learning, and human-in-the-loop systems. The scientist will design how to measure, stress-test, and improve robot behavior, build infrastructure connecting teleoperation, evaluation, and learning, and lead technical projects. | Eval Gate, Agent | 8 |
| AI Principal Product Manager-Technical, Alexa Responsible AI: The AI Principal PMT for Alexa Responsible AI will define the standard for how Alexa earns and keeps customer trust. This role owns the product discipline of Responsible AI, defining customer experiences for safety guardrails, trust signals, and evaluation frameworks. The PMT will set product vision and strategy, lead cross-functional alignment across Applied Science, Engineering, Legal, Policy, and UX, and ensure the full responsible product experience including safety, privacy, and security. The role requires technical depth in LLMs and AI safety, understanding how models fail and writing requirements for safety model development and evaluation system design. The PMT will also mentor other PMs and influence Responsible AI scaling across Alexa. | Eval Gate, Post-train | 8 |
| Data Scientist, Network Fabric Engineering: Data Scientist role focused on defining and driving the data science strategy for network operations automation, including agentic systems. The role involves defining metrics, building risk and reliability models, and evaluating the performance of automation and AI systems to improve network availability. It emphasizes statistical rigor and evidence-based decision-making within a team of network and software engineers. | Eval Gate, Agent | 7 |
| Applied Scientist, Silicon and Systems Group Edge AI: Research Scientist role focused on developing novel evaluation methods for multimodal language models and agents for consumer devices. This involves creating and validating automated evaluation techniques, analyzing datasets to understand model gaps, and collaborating with training teams. The role emphasizes hardware-software integration for efficient model training and deployment on edge devices. | Eval Gate, Post-train | 7 |
| Applied Scientist, Fauna: This role focuses on developing evaluation frameworks and data collection protocols for robotic capabilities, bridging robotics, ML, and human-in-the-loop systems. The scientist will design evaluation methodologies, create data collection protocols, build teleoperation workflows, and analyze results to improve robot behavior and dataset generation. | Eval Gate, Data | 7 |
| Applied SCI III - AMZ007408, AWS Science of Security: This role focuses on the design, development, evaluation, and deployment of formal reasoning systems for security, privacy, and data protection in cloud environments. It involves applying formal verification and automated theorem proving, leading research in AI security, evaluating threats to Generative AI, and developing safeguards. The role also requires building and implementing scalable software solutions for AI systems, data privacy, security, or automated reasoning, with experience in compiler development, static program analysis, or formal/symbolic AI systems. | Eval Gate, Agent | 7 |
| Software Development Engineer Test, Alexa Global quality: Software Development Engineer in Test focused on quality assurance automation and framework creation for Alexa's global, multilingual, and multimodal experiences. The role involves building agentic automation tooling for end-to-end quality evaluation, including synthetic test generation, LLM-as-a-Judge, and visual/cultural validation using ML. | Eval Gate, Agent | 7 |
| Manager, Program Management, Alexa Sensitive Content Intelligence (ASCI): Manager, Program Management for Alexa Sensitive Content Intelligence (ASCI) team, focusing on shaping how Alexa protects customers from harmful content using generative AI and responsible AI guardrails. The role involves strategic leadership, cross-functional program delivery, and team building, with a strong emphasis on data and LLM fluency, defining and executing roadmaps for responsible AI, and ensuring program execution through metrics and mechanisms. | Eval Gate, Agent | 7 |
| Applied Scientist, AWS Automated Reasoning: Applied Scientist role focused on automated reasoning, privacy, and sovereignty within AWS. The role involves solving complex problems, designing and implementing solutions, and providing cross-organizational technical influence. Requires a PhD or Master's with significant applied research experience in areas like SAT, SMT, theorem proving, symbolic simulation, program analysis, or type systems. Experience with specific programming languages like O'Caml, Dafny, Haskell, Lean, or Rust is preferred. | Eval Gate | 7 |
| Applied Scientist II, Amazon, Amazon: This role focuses on designing and developing evaluation frameworks for LLMs in e-commerce, curating training data to improve model performance, and potentially applying reinforcement learning. It involves working with scientists and engineers to innovate on customer-facing shopping experiences. | Eval Gate, Post-train | 7 |
| AI Benchmarking Specialist - Chinese, International Seller Growth: This role focuses on evaluating AI systems, specifically LLMs, by designing and executing benchmarking and audit activities to assess model quality, compliance, robustness, and fairness. It involves annotation for training, measuring, and improving AI models, preparing audit reports, and ensuring data quality. The role supports the Seller AI team in developing Gen-AI/LLM powered tools for sellers. | Eval Gate | 5 |
| Sr.Quality Assurance Engineer (European Language Expert), Alexa Global Quality Team: This role focuses on ensuring the quality of Alexa's multilingual experience across text, voice, and visual modalities. It involves architecting AI-powered evaluation frameworks, building agentic automation tooling for test generation and evaluation, and driving quality metrics for a global customer base. | Eval Gate, Agent | 5 |
| Sr.Quality Assurance Engineer (European Language Expert), Alexa Global Quality Team: This role focuses on ensuring the quality of Alexa+'s multilingual experience by driving quality across various modalities (text, voice, visual). It involves architecting AI-powered evaluation frameworks, building agentic automation tooling for test generation and evaluation, and performing manual and automated testing. The role requires fluency in one or more European languages and emphasizes quality across speech, visual, conversational, cultural, and linguistic dimensions. | Eval Gate, Agent | 5 |
| Software Development Engineer in Test, Alexa Global Quality - India: This role focuses on building and evolving agentic automation tooling for end-to-end quality evaluation of Alexa's multi-locale experience. It involves creating scalable, AI-powered solutions for validation across speech, visual, conversational quality, and cultural/linguistic dimensions, including synthetic test generation and LLM-as-a-Judge evaluations. | Eval Gate, Agent | 5 |