AI Research Scientist

Meta · Big Tech · Redmond, WA

Research Scientist at Meta focused on advancing multi-modal understanding by developing models and systems that reason across text, images, video, and audio, with the goal of impacting products used by billions of people.

What you'd actually do

  1. Conduct research on multi-modal learning, including vision-language models, audio-visual understanding, and cross-modal reasoning
  2. Develop novel architectures and training methodologies for models that integrate and reason across multiple modalities
  3. Design and execute experiments to evaluate multi-modal model capabilities and identify areas for improvement
  4. Publish research findings at top-tier conferences and contribute to Meta's research community
  5. Collaborate with cross-functional teams to translate research innovations into product applications

Skills

Required

  • PhD in Computer Science, Machine Learning, Artificial Intelligence, or a related field
  • Experience with multi-modal learning, vision-language models, or cross-modal representation learning demonstrated through publications or projects
  • Experience programming in Python
  • Experience with deep learning frameworks such as PyTorch
  • Experience with large-scale model training and distributed computing
  • Experience building end-to-end multi-modal systems from research to production

Nice to have

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Experience with video understanding or audio-visual learning
  • Experience with large language models, vision transformers, or foundation models

What the JD emphasized

  • Publications at venues such as NeurIPS, ICML, ICLR, CVPR, ACL, or EMNLP focused on multi-modal learning

Other signals

  • multi-modal understanding
  • reason across multiple modalities
  • cutting-edge research
  • impact billions of users