On-device ML Infrastructure Engineer (Core ML Runtime), Graphics, Games & ML

Apple · Big Tech · Cupertino, CA · Machine Learning and AI

This role focuses on building and maintaining the Core ML Runtime for on-device execution of ML models on Apple products. The engineer will work on the ML graph compiler, runtime, and kernels, optimizing model execution for performance, energy efficiency, and thermal management. The role involves developing production-critical system software for implementing ML models on Apple Silicon, with a focus on common compiler optimizations and runtime systems.

What you'd actually do

  1. Architect and maintain the on-device graph compiler, runtime, and kernels for delivering ML operators.
  2. Develop production-critical system software for implementing ML models on Apple Silicon.
  3. Proactively identify and resolve functionality gaps.
  4. Optimize model execution for various system objectives like performance, energy efficiency, and thermal management.

Skills

Required

  • C++
  • Swift
  • Python
  • Compiler stack experience (MLIR/LLVM/TVM/etc.)
  • Operating Systems
  • Embedded programming
  • Parallel programming
  • ML fundamentals
  • Transformer architectures

Nice to have

  • On-device ML stack experience (TFLite, ONNX, ExecuTorch, etc.)
  • ML authoring framework experience (PyTorch, TensorFlow, JAX, etc.)
  • Accelerators
  • GPU programming

What the JD emphasized

  • Master's or equivalent experience in Computer Science, Engineering, or a related subject area.
  • Highly proficient in C++ or Swift.
  • Experience with any compiler stack (MLIR/LLVM/TVM/...).
  • Familiarity with operating systems, embedded programming, and parallel programming.
  • Sound understanding of ML fundamentals, including common architectures such as Transformers.

Other signals

  • enabling billions of Apple devices to run powerful AI models locally, privately, and efficiently
  • building the essential infrastructure that enables machine learning at scale on Apple devices
  • onboarding innovative architectures to embedded systems
  • developing optimization toolkits for model compression and acceleration
  • building ML compilers and runtimes for efficient execution
  • creating comprehensive benchmarking and debugging toolchains
  • forms the backbone of Apple’s machine learning workflows
  • work on the next generation of intelligent experiences on Apple platforms
  • building an end-to-end developer experience for machine learning development
  • build the world’s most advanced ML graph compilation and runtime system
  • optimizing and delivering ML models efficiently on Apple products and services