Sr. Machine Learning Infrastructure Engineer, Creator Studio

Apple · Big Tech · Culver City +1 · Software and Services

This role focuses on building and maintaining scalable ML data infrastructure for creative editing tools, enabling high-quality ML model development. It involves sourcing multimodal datasets, building data pipelines, and supporting model development teams.

What you'd actually do

  1. Identify, source and ingest multimodal datasets for model development, working backwards from a deep understanding of application features
  2. Build scalable and reusable infrastructure components for data pipelines, such as labeling via human annotations or LLMs
  3. Build and maintain core tooling and frameworks to support model development, including scalable storage and retrieval systems, A/B comparison visualizations, and test harnesses that verify parity between training-time and inference-time behavior
  4. Partner with model development teams to manage a shared codebase, build common data processing libraries and profile/optimize ML workloads
  5. Analyze real-world user interaction data to uncover gaps in training data distributions and derive model success metrics

Skills

Required

  • BS/MS in Computer Science or related field with 3+ years of relevant industry experience
  • Proficiency in a high-level programming language (preferably Python)
  • Proficiency in a database query language (e.g., SQL)
  • Strong understanding of software engineering best practices, especially around data modeling, schema design and building maintainable data access layers
  • Experience developing and optimizing large-scale ML workloads running on distributed processing frameworks
  • Strong understanding of the lifecycle of an ML-based product
  • Ability to communicate effectively and collaborate with partner teams, particularly ML research scientists and engineers

Nice to have

  • Working knowledge of modern database and distributed data processing technologies/frameworks (e.g., Spark, Dask, Ray, Presto, Parquet, Flink, Druid, Airflow, PostgreSQL)
  • Experience with Python package management and build/deployment tooling (e.g., uv, poetry, hatch)
  • Experience optimizing models and algorithms to run efficiently on resource-constrained platforms
  • Knowledge of cloud platforms and container orchestration technologies (e.g., Kubernetes, Docker, AWS, GCP)
  • Familiarity with the iOS ecosystem, including the Swift programming language
  • Familiarity with model architectures and various training techniques, particularly in the Computer Vision domain
  • Familiarity with modern camera ISP and digital image processing algorithms and models
  • Knowledge of, and a keen interest in learning, the art and science of photography

Other signals

  • ML data infrastructure
  • ML model development
  • creative editing tools
  • multimodal datasets
  • ML workloads