Lead Data Scientist, Content Intelligence

Disney · Media · Glendale, CA +2

This Lead Data Scientist role focuses on developing and deploying AI/ML techniques to understand media content. Responsibilities include conceptualizing and co-developing AI/ML solutions, prompt engineering, evaluating VLM outputs, developing evaluation metrics for temporal tasks, bridging experimentation and production, establishing best practices for generative and non-generative AI/ML, and experimenting with techniques such as PEFT, quantization, and vector embeddings for media retrieval. The role requires proficiency in Python, deep learning frameworks, LLM prompting and fine-tuning, and cloud platforms.

What you'd actually do

  1. Conceptualize and co-develop innovative AI/ML solutions to address complex business challenges in the media space, contributing to a culture of rapid innovation.
  2. Identify systematic frameworks for prompt engineering and evaluation, ensuring VLM outputs meet strict ground-truth requirements.
  3. Develop rigorous evaluation metrics (beyond standard accuracy) to validate model performance for temporal tasks.
  4. Bridge experimentation and production by partnering with product and engineering teams, ensuring experimental models are aligned with business needs, optimized for inference, and ready to integrate seamlessly into our cloud platforms.
  5. Establish and refine best practices for the application of generative and non-generative AI/ML techniques, ensuring optimal performance for both experimental and production use cases.
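
As an illustration of item 3, one common family of metrics that goes beyond plain accuracy for temporal tasks scores predicted time segments by their overlap with ground-truth segments. The sketch below is only an assumption about what such a metric could look like; the listing does not name a specific metric, and the names temporal_iou and recall_at_iou and the 0.5 threshold are hypothetical choices for illustration.

```python
from typing import List, Tuple

# A segment is (start_seconds, end_seconds); this representation is an assumption.
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, truth: Segment) -> float:
    """Intersection-over-union of two time intervals."""
    intersection = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_iou(preds: List[Segment], truths: List[Segment],
                  threshold: float = 0.5) -> float:
    """Fraction of ground-truth segments matched by at least one
    prediction whose temporal IoU reaches the threshold."""
    if not truths:
        return 0.0
    hits = sum(
        1 for t in truths
        if any(temporal_iou(p, t) >= threshold for p in preds)
    )
    return hits / len(truths)


if __name__ == "__main__":
    predicted = [(0.0, 10.0)]
    annotated = [(0.0, 8.0), (20.0, 30.0)]
    print(recall_at_iou(predicted, annotated))
```

Unlike frame-level accuracy, an overlap-based score penalizes a model that finds the right event but places its boundaries poorly, which is one reason it is often preferred for segment localization.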

Skills

Required

  • Python
  • Pandas
  • NumPy
  • scikit-learn
  • PyTorch
  • LLM prompting
  • LLM fine-tuning
  • full lifecycle of a machine learning project
  • model deployment
  • monitoring
  • cloud platforms
  • AWS S3
  • AWS Bedrock
  • AWS EC2
  • AWS SageMaker
  • Databricks

Nice to have

  • MS or PhD in Computer Science, Data Science, or a related scientific/engineering field
  • Previous work experience in media or entertainment technology
  • content distribution
  • media supply chain
  • VLM / LLM model distillation
  • advanced statistical techniques
  • experimental design
  • generative AI for media understanding
  • video analysis
  • image analysis
  • text analysis

What the JD emphasized

  • hands-on role
  • builder and systems thinker
  • functional, high-performance prototype
  • prompt engineering
  • evaluation
  • VLM outputs
  • ground truth requirements
  • rigorous evaluation metrics
  • model performance
  • temporal tasks
  • experimentation and production
  • optimized for inference
  • seamlessly integrated
  • generative and non-generative AI/ML techniques
  • optimal performance
  • experimental and production use cases
  • PEFT (LoRA/QLoRA)
  • Quantization
  • Vector-embedding-based media retrieval
  • full lifecycle of a machine learning project
  • model deployment and monitoring

Other signals

  • AI/ML solutions for media understanding
  • Prompt engineering and evaluation for VLM outputs
  • Bridging experimentation and production for AI models
  • Generative and non-generative AI/ML techniques
  • PEFT, quantization, and vector embeddings for media retrieval
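
For readers unfamiliar with the last item, embedding-based retrieval usually means representing each media asset as a vector and ranking assets by similarity to a query vector. The following is a minimal sketch under stated assumptions: the vectors are toy values rather than outputs of a real embedding model, the function names are hypothetical, and a production system would likely use a dedicated vector index instead of a linear scan.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], library: List[Sequence[float]],
          k: int = 3) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k library vectors most
    similar to the query, best match first."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(library)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-D "embeddings" standing in for real media embeddings.
    library = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for index, score in top_k([1.0, 0.1], library, k=2):
        print(index, round(score, 3))
```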