AI Data Scientist

Apple · Big Tech · Shanghai, China · Machine Learning and AI

This role focuses on evaluating, optimizing, and analyzing the performance of ML models and multi-modal LLMs. The Data Scientist will develop metrics, conduct failure analysis, process data for evaluation, and implement optimization techniques. They will collaborate with cross-functional teams to integrate models and communicate results. The role requires experience with model evaluation, RAG, and LLM prompt evaluation, with preferred experience in multi-modal foundation models and GenAI frameworks.

What you'd actually do

  1. Analyze, validate, and benchmark computer vision, multi-modal, and large language models (LLMs) to ensure they meet accuracy, robustness, and usability standards, using A/B testing, cross-validation, and other model evaluation methods.
  2. Design and implement performance and evaluation metrics to measure model efficiency, accuracy, and scalability in real-world production environments.
  3. Conduct root cause analysis on model failures across computer vision and multi-modal language model pipelines, identifying improvement areas and collaborating with relevant teams to implement solutions.
  4. Clean, transform, and curate large-scale structured and unstructured datasets, facilitating efficient model evaluation, benchmarking, and testing across diverse data modalities.
  5. Implement innovative model optimization techniques (e.g. model distillation, quantization, pruning) to improve model scalability, performance, and real-world deployment.

Skills

Required

  • 3+ years of experience in data science, machine learning, data analysis, or data modeling
  • Strong focus on model evaluation, accuracy, and performance metrics
  • Familiarity with vector similarity search, retrieval-augmented generation (RAG) architectures, and LLM prompt evaluation techniques
  • Advanced programming skills in data manipulation, data processing, and building scalable data pipelines (SQL & Python preferred)
  • Experience crafting, conducting, analyzing, and interpreting experiments and investigations

Nice to have

  • Experience working with multi-modal foundation models (e.g. GPT-4, Gemini 2.5, Claude 3/4, LLaVA, Flamingo) in practical applications such as model training, evaluation, and optimization.
  • Hands-on experience with LLMs and GenAI frameworks (e.g. LangChain, LlamaIndex) for developing and optimizing AI-driven applications.
  • Familiarity with embedding, retrieval algorithms, agents, and data modeling for vector development graphs.
  • Comfort with ambiguity, with the ability to structure complex analysis and drive insights through data exploration and strategy research.
  • Proven experience managing complex projects and collaborating across cross-functional teams.
  • Detail-oriented, with the ability to keep track of and understand the workings of sophisticated algorithms.
  • Strong experience articulating and translating business questions into data solutions.
  • Curious, self-motivated and able to drive improvements to model evaluation pipelines and annotation programs.
  • Eagerness and ability to learn new skills and solve dynamic problems in an encouraging and expansive environment.
  • Outstanding communication skills – both written and verbal – with experience presenting to leadership.
  • Distributed computing

What the JD emphasized

  • model evaluation
  • accuracy
  • performance metrics
  • model optimization
  • multi-modal foundation models

Other signals

  • model evaluation
  • performance metrics
  • failure analysis
  • model optimization
  • multi-modal LLMs