AIML - Sr Machine Learning Engineer, Data and ML Innovation

Apple · Big Tech · Sunnyvale, CA · Machine Learning and AI

Senior Machine Learning Engineer at Apple focused on innovating on and applying state-of-the-art research in foundation models, particularly for audio data. The role spans the full ML pipeline, from pre-training on large-scale unlabeled audio corpora to post-training evaluation and fine-tuning. Responsibilities include designing multi-modal data generation frameworks, building model evaluation pipelines, analyzing multi-modal data, and contributing to products built on multi-modal perception data, especially audio and sensor fusion. The role also emphasizes representation learning, pre-training and fine-tuning for speech tasks, data selection techniques, and modeling data distributions. Collaboration with researchers and engineers is key, with opportunities to publish groundbreaking research.

What you'd actually do

  1. Enhancing current products and future hardware platforms with multi-modal perception data, particularly through audio and sensor fusion techniques.
  2. Designing self-supervised and semi-supervised representation learning pipelines, along with audio-specific pre-training and fine-tuning strategies for tasks like speech recognition and speaker identification.
  3. Applying data selection techniques such as novelty detection and active learning across modalities—audio, vision, language, and 3D—to improve data efficiency and reduce distributional gaps.
  4. Modeling data distributions using modern ML/statistical methods to uncover patterns, reduce redundancy, and handle out-of-distribution challenges.
  5. Rapidly learning new methods and domains as needed, and guiding product teams in selecting effective ML solutions.
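To give item 2 concrete flavor, here is a minimal, hypothetical PyTorch sketch of self-supervised contrastive pre-training (a SimCLR-style NT-Xent loss) on unlabeled audio. The toy encoder, the noise/gain "augmentations", and all names are illustrative stand-ins under stated assumptions, not anything specified by the posting.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioEncoder(nn.Module):
    """Toy 1-D conv encoder mapping raw waveforms to unit-norm embeddings."""
    def __init__(self, embed_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # global average pool over time
        )
        self.proj = nn.Linear(32, embed_dim)

    def forward(self, wav):                          # wav: (batch, samples)
        h = self.conv(wav.unsqueeze(1)).squeeze(-1)  # (batch, 32)
        return F.normalize(self.proj(h), dim=-1)     # (batch, embed_dim)

def nt_xent_loss(z1, z2, temperature=0.1):
    """SimCLR-style contrastive loss: two views of the same clip attract,
    all other clips in the batch repel."""
    B = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                   # (2B, D)
    sim = z @ z.t() / temperature                    # cosine similarities
    mask = torch.eye(2 * B, dtype=torch.bool)
    sim = sim.masked_fill(mask, float("-inf"))       # exclude self-pairs
    targets = torch.cat([torch.arange(B) + B, torch.arange(B)])
    return F.cross_entropy(sim, targets)

# One "pre-training" step on random stand-in waveforms (no labels needed).
torch.manual_seed(0)
encoder = AudioEncoder()
wav = torch.randn(8, 16000)                          # 8 one-second clips @ 16 kHz
view1 = wav + 0.01 * torch.randn_like(wav)           # additive-noise view
view2 = wav * torch.empty(8, 1).uniform_(0.9, 1.1)   # gain-jitter view
loss = nt_xent_loss(encoder(view1), encoder(view2))
loss.backward()                                      # gradients reach the encoder
```

In a real pipeline the augmentations would be audio-specific (time masking, pitch shift, reverberation) and the encoder would be fine-tuned afterwards on labeled speech tasks such as recognition or speaker identification.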

Skills

Required

  • Deep technical skills in one or more machine learning areas, such as computer vision, audio, combinatorial optimization, causality analysis, natural language processing, and deep learning.
  • Strong software development skills with proficiency in Python.
  • Hands-on experience with at least one deep learning toolkit, such as PyTorch, TensorFlow, or JAX.
  • 5+ years of experience developing and evaluating ML applications.

Nice to have

  • Deep understanding of multi-modal foundation models.
  • Familiarity with emerging trends in generative AI and multi-modal LLMs.
  • Ability to formulate machine learning problems and to design, implement, experiment with, and communicate solutions effectively.
  • A hands-on mentality, owning engineering projects from inception to shipped product, and the ability to work both independently and as part of a cross-functional team.
  • A demonstrated publication record at relevant conferences (e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR).
  • A track record of applying ML to cross-disciplinary problems.

What the JD emphasized

  • state-of-the-art research
  • foundation models
  • audio data
  • pre-training
  • post-training evaluation
  • fine-tuning
  • multi-modal data generation and curation framework
  • model evaluation pipelines
  • analyzing multi-modal data
  • multi-modal perception data
  • audio and sensor fusion techniques
  • representation learning pipelines
  • audio-specific pre-training and fine-tuning strategies
  • speech recognition
  • speaker identification
  • data selection techniques
  • novelty detection
  • active learning
  • modeling data distributions
  • ML/statistical methods
  • out-of-distribution challenges
  • multi-modal foundation models
  • generative AI
  • multi-modal LLMs
  • shipping products
  • publication records

Other signals

  • agent and reasoning capabilities
  • PyTorch
  • TensorFlow
  • JAX
  • cross-disciplinary problems