Americas Business Process Re-engineering Data Engineer

Apple · Big Tech · Austin, TX · Software and Services

Data Engineer role focused on building and maintaining scalable data infrastructure for analytics, machine learning, and AI-driven decision-making within Operations. The role involves designing data pipelines, building data models, implementing data observability, and collaborating with data science and ML engineering teams. It also emphasizes leveraging AI-assisted tools for development and researching emerging GenAI data tooling.

What you'd actually do

  1. Engage with business and analytics teams to deeply understand data needs and translate requirements into robust, scalable engineering solutions that directly impact Operations decisions
  2. Design and implement end-to-end data pipelines and architectures, from ingestion and transformation through delivery, across batch and real-time streaming workloads
  3. Build and maintain high-quality data models (dimensional, relational, or knowledge graph-based) using modern transformation frameworks such as dbt, powering analytics and AIML use cases at scale
  4. Architect and operate data workflows using orchestration tools (e.g., Apache Airflow) with built-in monitoring, alerting, and SLA management (see the sketch after this list)
  5. Implement data observability, lineage tracking, and validation frameworks to uphold data integrity and trustworthiness across the platform
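
A minimal sketch of items 4 and 5 above, assuming Airflow 2.x (2.4+) with dbt installed on the worker; the DAG id, the dbt selector, the SLA window, and the validation logic are illustrative stand-ins, not the team's actual pipeline:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator


    def alert_on_failure(context):
        # Hook point for alerting (Slack, PagerDuty, email); here we just log.
        print(f"Task failed: {context['task_instance'].task_id}")


    def validate_orders(**_):
        # Placeholder validation; a real pipeline would query the warehouse
        # or run a framework such as Great Expectations here.
        row_count = 42  # stand-in for a warehouse query result
        if row_count == 0:
            raise ValueError("staging table is empty")


    with DAG(
        dag_id="orders_daily",                        # hypothetical pipeline
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
        default_args={
            "retries": 2,
            "retry_delay": timedelta(minutes=5),
            "sla": timedelta(hours=2),                # Airflow records SLA misses
            "on_failure_callback": alert_on_failure,
        },
    ) as dag:
        transform = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --select staging",  # assumes a dbt project on the worker
        )
        validate = PythonOperator(
            task_id="validate_orders",
            python_callable=validate_orders,
        )

        transform >> validate

Pairing the dbt transformation with a downstream validation task keeps data quality checks inside the same monitored, SLA-bound workflow rather than in a separate system.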

Skills

Required

  • Python
  • SQL
  • dbt
  • Spark
  • Kafka/Flink
  • Snowflake
  • Delta Lake
  • Apache Iceberg
  • Apache Airflow
  • Docker
  • Kubernetes
  • GenAI tooling
  • Agentic AI tooling
  • LLM-assisted code generation
  • Vector databases
  • RAG pipelines
  • Data visualization
  • Self-service analytics platforms

Nice to have

  • MS in Computer Science, Data Engineering, Statistics, Applied Math, Data Science, or Operations Research
  • Tableau
  • Streamlit
  • ThoughtSpot

What the JD emphasized

  • 8+ years of industry experience, or a BS in a related field with 10+ years of hands-on industry experience
  • Domain expertise in supply chain, operations management, logistics, planning & forecasting, production integration, or channel management
  • Demonstrated expertise building and operating large-scale ETL/ELT pipelines using Python, SQL, and modern frameworks (dbt, Spark, Kafka/Flink for streaming); see the streaming sketch after this list
  • Proficiency with cloud data platforms (e.g., Snowflake) and open table formats (Delta Lake, Apache Iceberg)
  • Strong command of advanced SQL for complex data modeling, query optimization, and analytics engineering
  • Experience with workflow orchestration tools (Apache Airflow or equivalent) and building production-grade, monitored pipelines
  • Hands-on experience implementing data quality frameworks, observability tooling, and data lineage tracking in production environments
  • Experience implementing and productionizing GenAI and Agentic AI tooling, including LLM-assisted code generation, MCP servers, and AI-powered data pipeline automation
  • Track record of staying current with industry best practices, rapidly adopting emerging technologies (e.g., vector databases, RAG pipelines, AI-native data tools), and building functional prototypes to validate concepts
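
A minimal sketch of the streaming ingest path named above (Spark reading from Kafka into an open table format), assuming PySpark with the Kafka connector and Delta Lake available on the cluster; the broker address, topic, schema, and paths are illustrative:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import (DoubleType, StringType, StructField,
                                   StructType, TimestampType)

    spark = SparkSession.builder.appName("orders_stream").getOrCreate()

    order_schema = StructType([
        StructField("order_id", StringType()),
        StructField("customer_id", StringType()),
        StructField("amount_usd", DoubleType()),
        StructField("ordered_at", TimestampType()),
    ])

    # Read the raw event stream from Kafka.
    raw = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "orders")                     # placeholder topic
        .load()
    )

    # Kafka values arrive as bytes; parse the JSON payload into typed columns.
    orders = (
        raw.selectExpr("CAST(value AS STRING) AS payload")
        .select(F.from_json("payload", order_schema).alias("o"))
        .select("o.*")
    )

    # Land the parsed events in a Delta table, checkpointing for fault tolerance.
    query = (
        orders.writeStream.format("delta")
        .option("checkpointLocation", "/checkpoints/orders")
        .outputMode("append")
        .start("/tables/bronze/orders")
    )
    query.awaitTermination()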

Other signals

  • Designing and building modern, scalable data infrastructure that powers analytics, machine learning, and AI-driven decision-making
  • Operationalize models and ensure data infrastructure supports production AIML workflows
  • Leverage AI-assisted development tools (e.g., GitHub Copilot, Claude) and LLM-powered agents to accelerate pipeline authoring, code review, documentation, and transformation logic generation from natural language specifications (see the sketch at the end of this section)
  • Research and evaluate emerging data engineering technologies including streaming architectures, GenAI-powered data tooling, and next-generation warehousing to expand the team’s capabilities and accelerate innovation
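
A minimal sketch of LLM-assisted transformation authoring as described above, assuming the anthropic Python SDK; the model name and the spec are placeholders, and any generated SQL would still go through human review and testing before reaching production:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    spec = (
        "Given a table raw.orders(order_id, customer_id, amount_usd, ordered_at), "
        "write a SQL model that aggregates daily revenue per customer."
    )

    message = client.messages.create(
        model="claude-sonnet-placeholder",  # placeholder; pin a real model version
        max_tokens=1024,
        messages=[{"role": "user", "content": f"Generate dbt-style SQL. Spec: {spec}"}],
    )

    # Candidate SQL for review, not for direct deployment.
    print(message.content[0].text)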