Senior Data Engineer

Apple · Big Tech · Cupertino, CA +1 · Software and Services

Senior Data Engineer on Apple's App Store team, designing and delivering privacy-centric data products, including pipelines and analytical outputs. The role involves architecting distributed pipelines, building self-service data platforms, and implementing GenAI-driven observability. It requires close collaboration with Data Science, Product, and Privacy teams, and the ideal candidate has strong software engineering and data expertise.

What you'd actually do

  1. Architect and scale distributed data pipelines using Spark, Flink, Cassandra, and Kafka to process high-throughput App Store data.
  2. Lead the technical design of privacy-first data models and analytics-ready datasets using Python, Scala, or Java.
  3. Engineer robust data observability, monitoring, and automated recovery systems for production environments.
  4. Partner with Data Science, Product, and Privacy teams to translate complex business and regulatory requirements into scalable engineering specifications.
  5. Champion modern engineering practices, including CI/CD, rigorous testing, and the adoption of AI-augmented development workflows to accelerate delivery.

Skills

Required

  • Scala or Java
  • Python
  • SQL
  • Spark/Flink
  • Kafka
  • Airflow
  • Iceberg
  • Trino
  • Cassandra
  • Kubernetes
  • Data modeling
  • Data warehouse design
  • Distributed data systems
  • CI/CD
  • Testing

Nice to have

  • online advertising measurement
  • attribution modeling
  • incrementality testing
  • conversion measurement
  • campaign optimization
  • privacy-enhancing technologies
  • differential privacy
  • federated learning
  • secure multi-party computation
  • on-device intelligence
  • LLM-driven or agentic workflows
  • anomaly detection
  • pipeline self-healing
  • data quality enforcement

What the JD emphasized

  • GenAI-driven observability
  • AI-augmented development workflows
  • LLM-driven or agentic workflows for automated data operations
