Applied AI Engineer - iCloud Data

Apple · Big Tech · Cupertino, CA +1 · Software and Services

This role focuses on building and scaling AI-native capabilities within Apple's iCloud data platform. The engineer will design, build, and own end-to-end AI systems, including agents, retrieval mechanisms, evaluation, guardrails, and observability. Key responsibilities involve optimizing AI systems for cost and performance, exploring state-of-the-art AI techniques, and educating the broader organization on AI patterns. The role requires experience in taking LLM or agentic systems from prototype to production, proficiency in modern AI frameworks, and a strong understanding of ML/DL principles.

What you'd actually do

  1. Build the AI foundation of our data platform: scalable, trustworthy AI products, agents, and workflows that power self-serve analytics, experimentation, and data engineering across iCloud. Partner with Engineering, Data Science, Product, Platform, and Research to improve how we build, operate, and scale data for billions of users worldwide.
  2. Design, build and own AI systems end-to-end, from retrieval, planning and reasoning, through evaluation, guardrails and observability, to deployment and the on-call rotation that keeps them trustworthy.
  3. Drive cost, performance and inference-quality efficiency across our AI systems, making thoughtful model selection and serving decisions, optimizing latency, throughput and token economics, and introducing techniques (caching, batching, distillation, quantization, speculative decoding) that let us scale AI capabilities sustainably at Apple scale.
  4. Build deep domain expertise across our data and AI stack, product and business, and be an advocate for engineering excellence and responsible AI.
  5. Explore and introduce state-of-the-art AI techniques, models, agentic patterns, evaluation methods, and AI-native developer tools, translating them into capabilities like natural-language data interfaces, AI-accelerated pipeline development, and intelligent alerting that make Data Engineering and Data Science teams materially faster.
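Of the efficiency techniques named in item 3, caching is the simplest to illustrate. Below is a minimal sketch of an exact-match prompt cache that avoids repeat inference spend on identical prompts; `call_model` is a hypothetical stand-in, since the role names no specific serving API, and real systems typically layer on TTLs and semantic keys.

```python
import hashlib

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model-serving call (illustrative only).
    return f"response to: {prompt}"

class ResponseCache:
    """Exact-match prompt cache: identical prompts are served from memory,
    saving token spend and latency on the second and later calls."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_call(self, prompt: str) -> str:
        # Hash the prompt so arbitrarily long inputs make compact keys.
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call_model(prompt)
        self._store[key] = result
        return result
```

The first call for a given prompt misses and invokes the model; an identical second call hits the cache, which is the core of the token-economics lever the role describes.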

Skills

Required

  • 8+ years of software engineering experience building scalable systems, reusable tools and frameworks
  • 3+ years taking LLM or agentic systems from prototype to production
  • Deep fluency in the modern AI stack
  • Ability to architect, build, and operate production-grade AI products composed of LLMs, foundation models, agents, and deterministic components
  • Clear judgment on inference-versus-compute boundaries, task decomposition across specialized models, orchestration of multi-step reasoning and tool use, and graceful degradation under failure
  • Solid foundation in machine learning and deep learning
  • Understanding of how modern models (transformers, LLMs) are trained, fine-tuned and evaluated
  • Ability to reason about embeddings, loss functions, and statistical rigor
  • Ability to diagnose whether a production issue stems from the prompt, retrieval, the model, or the data
  • Proficiency in at least one high-level language (Python, Scala, Java, or Go)
  • Discipline to write readable, observable, and testable code
  • Hands-on fluency with modern LLM and agent frameworks (LangChain, LlamaIndex, Semantic Kernel, Google ADK or equivalent)
  • Hands-on fluency with vector databases (FAISS, Chroma or similar)
  • Hands-on fluency with agentic architectures, multi-agent coordination, tool invocation and stateful reasoning
  • Understanding of when to use planning, reranking, structured reasoning, fine-tuning or deterministic compute
  • Production discipline for AI systems: evaluation harnesses, guardrails and telemetry
  • Optimization for cost, latency, throughput and inference quality
  • Experience with the data infrastructure ecosystem
  • Experience with SQL engines (such as Trino, Presto or Spark)
  • Experience with lakehouse architectures, workflow orchestration, and streaming systems
  • Ability to build AI capabilities that sit natively on top of data infrastructure
  • Strategic product mindset paired with a research sensibility
  • Ability to tackle loosely defined problems with meticulous attention to detail
  • Ability to drive ambiguous projects to completion
  • Clear communication across cross-functional teams to influence product strategy
  • Ability to evangelize AI engineering practices through workshops, technical playbooks, design guidance, and mentorship
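The retrieval requirements above can be made concrete with a toy example. This pure-Python cosine-similarity search stands in for what FAISS or Chroma do at scale with approximate-nearest-neighbor indexes; the vectors here are hand-made for illustration, not learned embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """Return the k document ids most similar to the query vector.
    Vector databases implement the same idea with ANN indexes at scale."""
    scored = sorted(index.items(), key=lambda kv: cosine(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy "embeddings" keyed by hypothetical document id (illustrative only).
index = {
    "doc_pipelines": [0.9, 0.1, 0.0],
    "doc_billing":   [0.1, 0.9, 0.1],
    "doc_alerts":    [0.8, 0.2, 0.1],
}
```

Knowing where this vanilla-retrieval pattern breaks (ambiguous queries, stale indexes, lexical mismatch) is exactly when the role expects a candidate to reach for reranking, planning, or deterministic compute instead.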

Nice to have

  • MS or BS in Computer Science, Artificial Intelligence, Machine Learning, Engineering, Mathematics, Statistics or a related field OR equivalent practical experience building AI systems in production

What the JD emphasized

  • 8+ years of software engineering experience building scalable systems, reusable tools and frameworks, with 3+ years taking LLM or agentic systems from prototype to production, and deep fluency in the modern AI stack.
  • You architect, build and operate production-grade AI products composed of LLMs, foundation models, agents and deterministic components, for both human and machine consumption, with clear judgment on inference-versus-compute boundaries, task decomposition across specialized models, orchestration of multi-step reasoning and tool use, and graceful degradation under failure.
  • Solid foundation in machine learning and deep learning. You understand how modern models (transformers, LLMs) are trained, fine-tuned and evaluated, reason about embeddings, loss functions and statistical rigor, and can diagnose whether a production issue is a prompt, retrieval, model or data problem.
  • Hands-on fluency with modern LLM and agent frameworks (LangChain, LlamaIndex, Semantic Kernel, Google ADK or equivalent), vector databases (FAISS, Chroma or similar), and agentic architectures, multi-agent coordination, tool invocation and stateful reasoning. You've moved beyond vanilla RAG and embeddings, knowing where they help, where they break, and when to reach for planning, reranking, structured reasoning, fine-tuning or deterministic compute instead.
  • Production discipline for AI systems: evaluation harnesses, guardrails and telemetry that change decisions (offline evals, golden sets, LLM-as-judge, behavioral regression, drift monitoring); and optimization for cost, latency, throughput and inference quality (model selection, serving decisions, token-spend control, caching, batching, streaming, distillation, quantization, speculative decoding).
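The evaluation-harness discipline in the last bullet can be sketched minimally: run the system under test over a curated golden set and gate on an accuracy threshold. `answer` is a hypothetical stand-in for a real pipeline; production harnesses add LLM-as-judge scoring, behavioral regression suites, and drift monitoring on top of this skeleton.

```python
# Golden set: hand-curated (input, expected output) pairs.
GOLDEN_SET = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("3 * 3", "9"),
]

def answer(question: str) -> str:
    # Hypothetical system under test; a real harness would call the
    # deployed pipeline here.
    canned = {"2 + 2": "4", "capital of France": "Paris", "3 * 3": "9"}
    return canned.get(question, "unknown")

def run_eval(system, golden, threshold=0.9):
    """Exact-match offline eval: returns (accuracy, passed).
    The passed flag is what gates a release decision."""
    correct = sum(1 for q, expected in golden if system(q) == expected)
    accuracy = correct / len(golden)
    return accuracy, accuracy >= threshold
```

The point of the sketch is that the eval output changes a decision (ship or block), which is what distinguishes a harness from ad-hoc spot checks.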

Other signals

  • AI-native capabilities
  • AI-first data organization
  • novel AI techniques from research to production
  • AI products end-to-end
  • LLMs, agents, retrieval and evaluation
  • trustworthy data-driven products
  • AI foundation of our data platform
  • scalable and trustworthy AI products, agents and workflows
  • self-serve analytics, experimentation, and data engineering
  • AI systems end-to-end
  • retrieval, planning and reasoning
  • evaluation, guardrails and observability
  • deployment and the on-call rotation
  • cost, performance and inference-quality efficiency
  • model selection and serving decisions
  • optimizing latency, throughput and token economics
  • scaling AI capabilities sustainably
  • deep domain expertise across our data and AI stack
  • engineering excellence and responsible AI
  • state-of-the-art AI techniques, models, agentic patterns, evaluation methods
  • natural-language data interfaces
  • AI-accelerated pipeline development
  • intelligent alerting
  • modern AI patterns
  • AI-native developer tools
  • AI-native practices
  • LLM or agentic systems from prototype to production
  • modern AI stack
  • production-grade AI products composed of LLMs, foundation models, agents and deterministic components
  • human and machine consumption
  • inference-versus-compute boundaries
  • task decomposition across specialized models
  • orchestration of multi-step reasoning and tool use
  • graceful degradation under failure
  • machine learning and deep learning
  • modern models (transformers, LLMs) are trained, fine-tuned and evaluated
  • embeddings, loss functions and statistical rigor
  • diagnose whether a production issue is prompt, retrieval, model or data
  • modern LLM and agent frameworks (LangChain, LlamaIndex, Semantic Kernel, Google ADK or equivalent)
  • vector databases (FAISS, Chroma or similar)
  • agentic architectures, multi-agent coordination, tool invocation and stateful reasoning
  • vanilla RAG and embeddings
  • planning, reranking, structured reasoning, fine-tuning or deterministic compute
  • Production discipline for AI systems
  • evaluation harnesses, guardrails and telemetry
  • offline evals, golden sets, LLM-as-judge, behavioral regression, drift monitoring
  • optimization for cost, latency, throughput and inference quality
  • model selection, serving decisions, token-spend control, caching, batching, streaming, distillation, quantization, speculative decoding
  • data infrastructure ecosystem
  • SQL engines (such as Trino, Presto or Spark)
  • lakehouse architectures, workflow orchestration, and streaming systems
  • build AI capabilities that sit natively on top of data infrastructure
  • strategic product mindset paired with a research sensibility
  • tackle loosely defined problems with meticulous attention to detail
  • drive ambiguous projects to completion
  • communicate clearly across cross-functional teams to influence product strategy
  • evangelize AI engineering practices
  • raises the AI fluency of partner organizations