Apple has 261 active AI-related job listings. The largest share of these roles focuses on agents (24% of the total), followed by application (22%) and serving infrastructure (21%). Engineering is the primary function for these positions, and the United States is the dominant hiring country. Frequent tech tags include model serving, inference infrastructure, and LLM observability. Over the last 30 days, Apple has posted 111 new AI roles, a 61% increase over the previous 30-day period.
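As a rough check on the growth figure in that summary, the sketch below inverts the percentage to recover the implied size of the previous 30-day period. It assumes the 61% is computed as (current − prior) / prior, which the page does not state; the function name `implied_prior_count` is made up for illustration.

```python
def implied_prior_count(current: int, pct_increase: float) -> float:
    """Invert current = prior * (1 + pct/100) to recover the prior-period count.

    Assumption: the percentage is measured relative to the prior period.
    """
    return current / (1 + pct_increase / 100)


# 111 new roles and a 61% increase are the figures quoted in the summary.
prior = implied_prior_count(111, 61)
print(round(prior))  # 69
```

Under that assumption, the previous 30-day period would have had roughly 69 new AI roles.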
Currently tracking 171 active AI roles, down 37% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $120k–$487k (avg $235k).
Apple currently has 233 active AI-related roles in our index. The most common open titles are: Machine Learning Engineer (4); AIML - Sr Data Scientist, Evaluation (2); Advanced Manufacturing Engineer(iPhone) - Smart Manufacturing (2); Machine Learning Engineer, Apple Services Engineering (2); Machine Learning Software Engineer (2). Most positions are in Engineering and Research.
Apple's active AI hiring is concentrated in: agents (30%), application (21%), serving infrastructure (14%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Apple is hiring AI talent in: United States (182 roles), China (17 roles), India (10 roles), United Kingdom (7 roles).
Job postings at Apple most frequently mention: Machine Learning, Python, Data Science, Large Language Models (LLMs), Statistics.
In the past 30 days, Apple has posted 80 new AI-related roles.
| Title | Stage | AI score |
|---|---|---|
| **Machine Learning Engineer**: Machine Learning Engineer focused on Evaluation & Insights for the Human-Centered AI team at Apple Media Services. The role involves evaluating and optimizing Foundation Models and generative AI systems, architecting evaluation frameworks, designing MLOps pipelines, and translating failure modes into guardrails and training signals. This position bridges human perception and algorithmic performance, working cross-functionally to ensure AI experiences are reliable, safe, and aligned with human expectations. | Eval Gate, Post-train | 9 |
| **Machine Learning Research Engineer, Siri Speech**: This role focuses on evaluating, analyzing, and improving state-of-the-art end-to-end speech models for Siri. The engineer will design and implement novel evaluation frameworks, develop tools to measure model performance, analyze model behavior, and explore innovative approaches to advance speech capabilities. The role also involves building automated processes for large-scale model evaluation and analysis, collaborating with cross-functional teams. | Eval Gate, Post-train | 9 |
| **Machine Learning Engineer**: Machine Learning Engineer focused on Evaluation & Insights for the Human-Centered AI team. This role involves architecting evaluation frameworks, designing MLOps pipelines for model assessment, and translating qualitative failure modes into programmatic guardrails and training signals for Foundation Models and generative AI systems. The role also involves collaborating with various teams to ensure AI experiences are reliable, safe, and aligned with human expectations. | Eval Gate, Post-train | 9 |
| **Evaluation & Insights Machine Learning Engineer**: This role focuses on evaluating and improving AI systems by analyzing AI outputs, developing evaluation frameworks, and translating findings into actionable improvements. It involves assessing model behavior, identifying edge cases, and ensuring AI systems are reliable, safe, and aligned with human expectations. The role also involves building MLOps and automation for evaluation pipelines and collaborating with various teams to refine model performance. | Eval Gate, Post-train | 9 |
| **Machine Learning Engineer, ML/GenAI Evaluation**: Machine Learning Engineer focused on evaluating ML and GenAI models for Wallet, Payments, and Commerce features. This role defines evaluation criteria, metrics frameworks, and quality standards, designs adversarial test strategies, and owns the model quality sign-off process to ensure models meet high standards for accuracy, robustness, fairness, and reliability before shipping to hundreds of millions of users. Responsibilities include building test sets, developing robustness testing methodologies, owning fairness evaluation end-to-end, evaluating generative model outputs, and synthesizing results for product decisions. | Eval Gate | 8 |
| **AIML - ML Engineer, Responsible AI**: ML Engineer focused on Responsible AI, developing models, tools, and metrics for assessing and evaluating the safety, robustness, and uncertainty of generative models (vision and language). This includes interpreting model failures, building human annotation and red teaming pipelines, and prototyping/implementing/evaluating new ML models for red teaming LLMs. | Eval Gate, Post-train | 8 |
| **Applied Machine Learning Engineer - Developer Publications**: Applied Machine Learning Engineer focused on building and maintaining LLM evaluation pipelines for developer tools at Apple. The role emphasizes MLOps/LLMOps, assessing model quality, tracking regressions, and supporting continuous improvement cycles, requiring strong engineering fundamentals and LLM evaluation experience. | Eval Gate, Post-train | 8 |
| **Staff Applied Scientist, AI Quality & Meta Evaluation**: Staff Applied Scientist focused on AI Quality & Meta Evaluation, responsible for designing and building the Data Quality Validation framework for LLM Judges. This role involves developing statistical and ML approaches to ensure the trustworthiness of evaluation signals, auditing LLM outputs, and establishing standards for data quality. | Eval Gate, Post-train | 8 |
| **ML Engineer - Automated Evaluation and Adversarial Design**: ML Engineer focused on building and scaling automated evaluation systems and designing adversarial/stress-testing methodologies for AI-powered features in productivity and creative applications. The role involves assessing AI quality, particularly for multi-turn agentic experiences, and influencing model development decisions through rigorous evaluation. | Eval Gate, Agent | 8 |
| **Senior Applied Scientist - AI Evaluation & Quality Systems**: Senior Applied Scientist focused on building and scaling AI evaluation and quality systems. The role involves developing methodologies, tooling, and autonomous QA agents to ensure the trustworthiness and quality of AI/ML systems, with a strong emphasis on human-in-the-loop evaluation and anomaly detection. Requires a blend of research and engineering skills to prototype, validate, and ship solutions. | Eval Gate, Agent | 8 |
| **AIML - Sr Machine Learning Engineer, Responsible AI**: This role focuses on developing, carrying out, interpreting, and communicating pre- and post-ship evaluations of the safety of Apple Intelligence features, leveraging both human and model-based auto-grading. It also involves researching and developing auto-grading methodology and infrastructure. The role requires creating safety evaluations that uphold Responsible AI values through data sampling, curation, annotation, auto-grading, and analysis. It draws on applied data science, scientific investigation, cross-functional communication, and metrics reporting. | Eval Gate, Post-train | 8 |
| **AI Data Scientist**: This role focuses on evaluating, optimizing, and analyzing the performance of ML and multi-modal LLMs. The Data Scientist will develop metrics, conduct failure analysis, process data for evaluation, and implement optimization techniques. They will collaborate with cross-functional teams to integrate models and communicate results. The role requires experience with model evaluation, RAG, and LLM prompt evaluation, with preferred experience in multi-modal foundation models and GenAI frameworks. | Eval Gate, Post-train | 8 |
| **AIML - Sr Data Scientist, Evaluation**: This role focuses on developing and researching evaluation methods to improve the quality of user-facing AI products like Siri and Apple Intelligence. It involves working with large datasets, applying advanced analytical methods including prompt engineering and using LLMs as judges, and partnering with engineering teams to translate methodological developments into production technologies. The goal is to guide product development and decisions through rigorous evaluation and data analysis, ultimately impacting products used by hundreds of millions globally. | Eval Gate, Post-train | 7 |
| **AIML - Sr Data Scientist, Evaluation**: This role focuses on developing and implementing evaluation methods for AI/ML products, particularly for search quality and user-facing features like Siri and Apple Intelligence. It involves working with large datasets, applying advanced analytical methods including prompt engineering and using LLMs as judges, and partnering with engineering teams to translate methodological developments into production technologies. The role requires strong data science, ML, and analytics skills, with a focus on experimentation and evaluation. | Eval Gate, Post-train | 7 |
| **Data Scientist, AI/ML Model Quality**: This role focuses on ensuring the quality of data used for training and evaluating AI/ML models, particularly in Generative AI systems within the Wallet, Payments, and Commerce domains. The Data Scientist will build and maintain intelligent systems, validation frameworks, and monitoring pipelines to ensure data integrity and model trustworthiness. Responsibilities include curating ground-truth datasets, auditing training data for bias, defining data quality metrics, integrating automated checks, and analyzing telemetry for GenAI workflows to identify failure modes and provide recommendations. | Eval Gate, Data | 7 |
| **Systems Engineer - Evaluation Engineering**: Systems Engineer focused on building and scaling the infrastructure for an AI Agentic Evaluation Platform. This involves designing distributed execution engines, internal developer platforms, backend APIs, stream processing, and deployment topologies for large-scale agent simulations and LLM-as-a-judge pipelines. The role emphasizes reliability, observability, and guardrails for complex AI systems. | Eval Gate, Agent | 7 |
| **Sr. Software Engineer: Agentic Evaluation**: This role focuses on building and maintaining the infrastructure, tooling, and pipelines for evaluating Siri, Apple's AI assistant, at scale. The engineer will extend evaluation capabilities to new platforms, support new features, diagnose failures, and contribute to architecture decisions for evaluation systems. Experience with evaluating ML, LLM, or agent-based systems is preferred. | Eval Gate, Agent | 7 |
| **Senior Software Engineer - Siri Agentic Evaluation Platform**: The role involves building software platforms and tools for evaluating Siri's quality and effectiveness using agentic technology. The primary focus is on creating evaluation platforms that provide feedback signals throughout the software lifecycle, with a secondary focus on the agentic technology itself. | Eval Gate, Agent | 7 |
| **Automation and Triage Engineer, Siri**: This role focuses on building and maintaining automated test suites and evaluation frameworks for Siri, ensuring its AI quality and performance across various Apple platforms. It involves investigating complex failures in Siri's AI pipeline, distinguishing regressions, and partnering with engineering and ML teams to define and track quality metrics. The role requires strong software engineering skills, experience with agentic systems and LLM evaluation, and familiarity with on-device AI and conversational systems. | Eval Gate, Agent | 7 |
| **Software Engineer, Agentic Evaluation**: Software Engineer at Apple focused on building and evaluating AI-powered experiences for Siri and other products. The role involves end-to-end software development, from prototyping to production systems, with a strong emphasis on quality measurement and evaluation frameworks. Experience with generative AI for coding is required. | Eval Gate, Agent | 7 |
| **Annotation Data Scientist, Evaluation Integrity (Siri)**: This role focuses on designing and managing human-in-the-loop (HITL) annotation tasks to evaluate agentic systems, specifically for Siri. The primary goal is to create a trusted quality signal by turning human judgment into a rigorous, reproducible metric. Responsibilities include designing annotation tasks, authoring guidelines, managing annotation programs, developing custom tooling, applying data science to analyze human-labeled data, and contributing to overall evaluation health reporting. The role sits at the intersection of data science, human annotation engineering, and evaluation methodology. | Eval Gate, Agent | 7 |
| **ML Engineer - Evaluation Analysis, Metric and Data Strategy**: ML Engineer focused on defining and analyzing quality metrics for AI-powered features in consumer productivity and creative applications. This role is critical for informing model development, feature launches, and product strategy by translating evaluation data and user behavior into actionable insights. It involves designing metrics frameworks, auditing data representativeness, and developing evaluation methods for complex, agentic AI experiences. | Eval Gate, Agent | 7 |
| **Siri, Eval Architect Engineer**: The role focuses on defining the architecture for systems that measure Siri's quality across platforms and model updates. It involves building evaluation infrastructure for large-scale automation, simulation, AI-powered auto-evaluators, and agentic fix pipelines. The Eval Systems Architect will own the technical vision and system architecture for Siri's evaluation stack, ensuring coherence, scalability, and trustworthiness, and will influence the technical roadmap for the evaluation platform. | Eval Gate, Agent | 7 |
| **Test Triage & Automation Engineer, Siri**: This role focuses on designing, driving, and triaging automation pipelines and evaluation frameworks for Siri's AI features. The engineer will analyze large-scale test data, identify trends, and develop strategies to improve the efficiency and effectiveness of quality engineering processes. The goal is to ensure the qualitative experience of Siri's AI features meets high standards and to influence product decisions and model improvements. | Eval Gate, Agent | 7 |
| **Quality Engineer - Machine Learning**: Quality Engineer for Machine Learning in Apple's Creative Music Apps team, focusing on testing ML models and DSP algorithms for audio features on macOS, iOS & iPadOS. Responsibilities include stress-testing for regressions, designing test strategies, developing automated tests, and collaborating with ML engineers on quality metrics. | Eval Gate, Post-train | 7 |
| **AIML - Machine Learning Engineer - Computer Vision & Audio, MIND**: Machine Learning Engineer focused on the data and evaluation lifecycle for production models in computer vision and audio. Responsibilities include scaling data pipelines, ensuring data quality, performing failure analysis, implementing data augmentation, and designing evaluation metrics for models. The role bridges hardware, software, and modeling for efficient inference. | Eval Gate, Data | 7 |
| **AIML - Software Engineer - AI, Evaluation**: Software Engineer role focused on building tools and systems for the automatic evaluation of Apple's AI products, specifically using LLM-as-judge and related technologies to improve the quality and efficiency of these evaluations. The role involves designing and developing frameworks, pipelines, and tools for AI model development, deployment, and measurement, directly impacting product launch decisions. | Eval Gate, Agent | 7 |
| **Applications of ML Engineering Manager**: Manager for Responsible Development & Safety in Apple Services Engineering, focusing on shaping policies, evaluating AI models and applications, and ensuring safe deployment of user-facing features. The role involves leading a team, collaborating with various cross-functional teams, and developing evaluation processes for AI/ML models. | Eval Gate, Post-train | 7 |
| **AIML - Data Scientist, Evaluation**: This role focuses on designing and implementing evaluation frameworks for AI/ML systems, specifically for Apple's consumer-facing products. The Data Scientist will work with large datasets, develop methodologies for assessing product quality, and partner with engineering teams to improve user experience and guide feature development. The role involves building evaluation datasets, human-in-the-loop systems, and translating insights into actionable recommendations. | Eval Gate | 7 |
| **Data Scientist, Maps Evaluation**: Data Scientist focused on the deep evaluation of Apple Maps search services, features, monetization initiatives, and Apple Business. This role involves defining success metrics, evaluating product performance, understanding user behavior, and driving data-informed decisions through experiment design, A/B testing, funnel analysis, exploratory data analysis, AI/ML modeling, and data mining. | Eval Gate | 5 |
| **Evaluation Reliability SRE**: This role focuses on the reliability and operational excellence of ML evaluation infrastructure, specifically the production backbone for Siri's quality signal. It involves managing resources, orchestration, on-call response, and observability systems to ensure the trustworthiness of evaluation infrastructure. The role requires hands-on experience in site reliability, infrastructure engineering, and operating production systems, with a focus on proactive reliability work and incident response. | Eval Gate | 5 |
| **Software Development Engineer - Test, Graphics, Games & ML**: Software Development Engineer - Test role focused on ensuring the quality of on-device machine learning technologies at Apple. The role involves developing infrastructure, automation, and services for validation and qualification, maintaining CI/CD pipelines, and collaborating with various teams across hardware, software, and product development. Experience with ML frameworks is preferred. | Eval Gate | 5 |
| **AIML - Sr Data Scientist, Evaluation**: This role focuses on developing and implementing evaluation methods for Siri's user-facing products, using data science and machine learning to guide product development and improve search quality. The primary focus is on evaluation and measurement, with collaboration on core ML algorithms. | Eval Gate | 5 |
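
The Stage and AI score columns of the table above can be summarized with a small tally. The sketch below is illustrative only: the grouped rows in `ROW_GROUPS` were transcribed by hand from the table (stage tags, score, number of rows sharing that combination), and the names `ROW_GROUPS` and `tally` are made up for this example rather than part of the index.

```python
from collections import Counter
from statistics import mean

# (stage tags, AI score, number of rows) transcribed by hand from the table.
# Assumption: each Stage cell holds one or two tags from this small vocabulary.
ROW_GROUPS = [
    (("Eval Gate", "Post-train"), 9, 4),
    (("Eval Gate",), 8, 1),
    (("Eval Gate", "Post-train"), 8, 5),
    (("Eval Gate", "Agent"), 8, 2),
    (("Eval Gate", "Post-train"), 7, 4),
    (("Eval Gate", "Data"), 7, 2),
    (("Eval Gate", "Agent"), 7, 10),
    (("Eval Gate",), 7, 1),
    (("Eval Gate",), 5, 4),
]


def tally(groups):
    """Count how many rows carry each stage tag and average the AI score."""
    tag_counts = Counter()
    scores = []
    for tags, score, n_rows in groups:
        for tag in tags:
            tag_counts[tag] += n_rows
        scores.extend([score] * n_rows)
    return tag_counts, mean(scores)


tag_counts, avg_score = tally(ROW_GROUPS)
for tag, count in tag_counts.most_common():
    print(f"{tag}: {count}")
print(f"mean AI score: {avg_score:.2f}")
```

On this transcription, every one of the 33 listed roles carries the Eval Gate tag, with Post-train (13) and Agent (12) as the most common secondary tags and Data on 2 rows; the mean AI score is about 7.24.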