Apple has 261 active AI-related job listings. The largest share of these roles focuses on agents (24% of the total), followed by application (22%) and serving infrastructure (21%). Engineering is the primary function for these positions, and the United States is the dominant hiring country. Frequent tech tags include model serving, inference infrastructure, and LLM observability. Over the last 30 days, Apple has posted 111 new AI roles, a 61% increase over the previous 30-day period.
Currently tracking 171 active AI roles, down 37% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $120k–$487k (avg $235k).
Apple currently has 233 active AI-related roles in our index. The most common open titles are: Machine Learning Engineer (4), AIML - Sr Data Scientist, Evaluation (2), Advanced Manufacturing Engineer(iPhone) - Smart Manufacturing (2), Machine Learning Engineer, Apple Services Engineering (2), Machine Learning Software Engineer (2). Most positions are in Engineering and Research.
Apple's active AI hiring is concentrated in: agents (30%), application (21%), serving infrastructure (14%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Apple is hiring AI talent in: United States (182 roles), China (17 roles), India (10 roles), United Kingdom (7 roles).
Job postings at Apple most frequently mention: Machine Learning, Python, Data Science, Large Language Models (LLMs), Statistics.
In the past 30 days, Apple has posted 80 new AI-related roles.
| Title | Stage | AI score |
|---|---|---|
| Evaluation & Insights Machine Learning Engineer: This role focuses on evaluating and improving AI systems by analyzing AI outputs, developing evaluation frameworks, and translating findings into actionable improvements. It involves assessing model behavior, identifying edge cases, and ensuring AI systems are reliable, safe, and aligned with human expectations. The role also involves building MLOps and automation for evaluation pipelines and collaborating with various teams to refine model performance. | Eval Gate, Post-train | 9 |
| Machine Learning Engineer, ML/GenAI Evaluation: Machine Learning Engineer focused on evaluating ML and GenAI models for Wallet, Payments, and Commerce features. This role defines evaluation criteria, metrics frameworks, and quality standards, designs adversarial test strategies, and owns the model quality sign-off process to ensure models meet high standards for accuracy, robustness, fairness, and reliability before shipping to hundreds of millions of users. Responsibilities include building test sets, developing robustness testing methodologies, owning fairness evaluation end-to-end, evaluating generative model outputs, and synthesizing results for product decisions. | Eval Gate | 8 |
| AIML - ML Engineer, Responsible AI: ML Engineer focused on Responsible AI, developing models, tools, and metrics for assessing and evaluating the safety, robustness, and uncertainty of generative models (vision and language). This includes interpreting model failures, building human annotation and red teaming pipelines, and prototyping/implementing/evaluating new ML models for red teaming LLMs. | Eval Gate, Post-train | 8 |
| Staff Applied Scientist, AI Quality & Meta Evaluation: Staff Applied Scientist focused on AI Quality & Meta Evaluation, responsible for designing and building the Data Quality Validation framework for LLM Judges. This role involves developing statistical and ML approaches to ensure the trustworthiness of evaluation signals, auditing LLM outputs, and establishing standards for data quality. | Eval Gate, Post-train | 8 |
| ML Engineer - Automated Evaluation and Adversarial Design: ML Engineer focused on building and scaling automated evaluation systems and designing adversarial/stress-testing methodologies for AI-powered features in productivity and creative applications. The role involves assessing AI quality, particularly for multi-turn agentic experiences, and influencing model development decisions through rigorous evaluation. | Eval Gate, Agent | 8 |
| Senior Applied Scientist - AI Evaluation & Quality Systems: Senior Applied Scientist focused on building and scaling AI evaluation and quality systems. The role involves developing methodologies, tooling, and autonomous QA agents to ensure the trustworthiness and quality of AI/ML systems, with a strong emphasis on human-in-the-loop evaluation and anomaly detection. Requires a blend of research and engineering skills to prototype, validate, and ship solutions. | Eval Gate, Agent | 8 |
| AIML - Sr Machine Learning Engineer, Responsible AI: This role focuses on developing, carrying out, interpreting, and communicating pre- and post-ship evaluations of the safety of Apple Intelligence features, leveraging both human and model-based auto-grading. It also involves researching and developing auto-grading methodology and infrastructure. The role requires creating safety evaluations that uphold Responsible AI values through data sampling, curation, annotation, auto-grading, and analysis. It draws on applied data science, scientific investigation, cross-functional communication, and metrics reporting. | Eval Gate, Post-train | 8 |
| AIML - Sr Data Scientist, Evaluation: This role focuses on developing and researching evaluation methods to improve the quality of user-facing AI products like Siri and Apple Intelligence. It involves working with large datasets, applying advanced analytical methods including prompt engineering and using LLMs as judges, and partnering with engineering teams to translate methodological developments into production technologies. The goal is to guide product development and decisions through rigorous evaluation and data analysis, ultimately impacting products used by hundreds of millions globally. | Eval Gate, Post-train | 7 |
| AIML - Sr Data Scientist, Evaluation: This role focuses on developing and implementing evaluation methods for AI/ML products, particularly for search quality and user-facing features like Siri and Apple Intelligence. It involves working with large datasets, applying advanced analytical methods including prompt engineering and using LLMs as judges, and partnering with engineering teams to translate methodological developments into production technologies. The role requires strong data science, ML, and analytics skills, with a focus on experimentation and evaluation. | Eval Gate, Post-train | 7 |
| Data Scientist, AI/ML Model Quality: This role focuses on ensuring the quality of data used for training and evaluating AI/ML models, particularly in Generative AI systems within the Wallet, Payments, and Commerce domains. The Data Scientist will build and maintain intelligent systems, validation frameworks, and monitoring pipelines to ensure data integrity and model trustworthiness. Responsibilities include curating ground-truth datasets, auditing training data for bias, defining data quality metrics, integrating automated checks, and analyzing telemetry for GenAI workflows to identify failure modes and provide recommendations. | Eval Gate, Data | 7 |
| Systems Engineer - Evaluation Engineering: Systems Engineer focused on building and scaling the infrastructure for an AI Agentic Evaluation Platform. This involves designing distributed execution engines, internal developer platforms, backend APIs, stream processing, and deployment topologies for large-scale agent simulations and LLM-as-a-judge pipelines. The role emphasizes reliability, observability, and guardrails for complex AI systems. | Eval Gate, Agent | 7 |
| Sr. Software Engineer: Agentic Evaluation: This role focuses on building and maintaining the infrastructure, tooling, and pipelines for evaluating Siri, Apple's AI assistant, at scale. The engineer will extend evaluation capabilities to new platforms, support new features, diagnose failures, and contribute to architecture decisions for evaluation systems. Experience with evaluating ML, LLM, or agent-based systems is preferred. | Eval Gate, Agent | 7 |
| Automation and Triage Engineer, Siri: This role focuses on building and maintaining automated test suites and evaluation frameworks for Siri, ensuring its AI quality and performance across various Apple platforms. It involves investigating complex failures in Siri's AI pipeline, distinguishing regressions, and partnering with engineering and ML teams to define and track quality metrics. The role requires strong software engineering skills, experience with agentic systems and LLM evaluation, and familiarity with on-device AI and conversational systems. | Eval Gate, Agent | 7 |
| Software Engineer, Agentic Evaluation: Software Engineer at Apple focused on building and evaluating AI-powered experiences for Siri and other products. The role involves end-to-end software development, from prototyping to production systems, with a strong emphasis on quality measurement and evaluation frameworks. Experience with generative AI for coding is required. | Eval Gate, Agent | 7 |
| Annotation Data Scientist, Evaluation Integrity (Siri): This role focuses on designing and managing human-in-the-loop (HITL) annotation tasks to evaluate agentic systems, specifically for Siri. The primary goal is to create a trusted quality signal by turning human judgment into a rigorous, reproducible metric. Responsibilities include designing annotation tasks, authoring guidelines, managing annotation programs, developing custom tooling, applying data science to analyze human-labeled data, and contributing to overall evaluation health reporting. The role sits at the intersection of data science, human annotation engineering, and evaluation methodology. | Eval Gate, Agent | 7 |
| ML Engineer - Evaluation Analysis, Metric and Data Strategy: ML Engineer focused on defining and analyzing quality metrics for AI-powered features in consumer productivity and creative applications. This role is critical for informing model development, feature launches, and product strategy by translating evaluation data and user behavior into actionable insights. It involves designing metrics frameworks, auditing data representativeness, and developing evaluation methods for complex, agentic AI experiences. | Eval Gate, Agent | 7 |
| Siri, Eval Architect Engineer: The role focuses on defining the architecture for systems that measure Siri's quality across platforms and model updates. It involves building evaluation infrastructure for large-scale automation, simulation, AI-powered auto-evaluators, and agentic fix pipelines. The Eval Systems Architect will own the technical vision and system architecture for Siri's evaluation stack, ensuring coherence, scalability, and trustworthiness, and will influence the technical roadmap for the evaluation platform. | Eval Gate, Agent | 7 |
| Test Triage & Automation Engineer, Siri: This role focuses on designing, driving, and triaging automation pipelines and evaluation frameworks for Siri's AI features. The engineer will analyze large-scale test data, identify trends, and develop strategies to improve the efficiency and effectiveness of quality engineering processes. The goal is to ensure the qualitative experience of Siri's AI features meets high standards and to influence product decisions and model improvements. | Eval Gate, Agent | 7 |
| AIML - Machine Learning Engineer - Computer Vision & Audio, MIND: Machine Learning Engineer focused on the data and evaluation lifecycle for production models in computer vision and audio. Responsibilities include scaling data pipelines, ensuring data quality, performing failure analysis, implementing data augmentation, and designing evaluation metrics for models. The role bridges hardware, software, and modeling for efficient inference. | Eval Gate, Data | 7 |
| AIML - Software Engineer - AI, Evaluation: Software Engineer role focused on building tools and systems for the automatic evaluation of Apple's AI products, specifically using LLM-as-judge and related technologies to improve the quality and efficiency of these evaluations. The role involves designing and developing frameworks, pipelines, and tools for AI model development, deployment, and measurement, directly impacting product launch decisions. | Eval Gate, Agent | 7 |
| Applications of ML Engineering Manager: Manager for Responsible Development & Safety in Apple Services Engineering, focusing on shaping policies, evaluating AI models and applications, and ensuring safe deployment of user-facing features. The role involves leading a team, collaborating with various cross-functional teams, and developing evaluation processes for AI/ML models. | Eval Gate, Post-train | 7 |
| AIML - Data Scientist, Evaluation: This role focuses on designing and implementing evaluation frameworks for AI/ML systems, specifically for Apple's consumer-facing products. The Data Scientist will work with large datasets, develop methodologies for assessing product quality, and partner with engineering teams to improve user experience and guide feature development. The role involves building evaluation datasets, human-in-the-loop systems, and translating insights into actionable recommendations. | Eval Gate | 7 |