Research Scientist, Vision Foundation Model

ByteDance · Big Tech · San Jose, CA · Computer vision

Research Scientist focused on foundational models for visual generation and multimodal generative models. The role involves research and development to enhance strategic advantages for ByteDance products, with a focus on computer vision challenges. Experience with large-scale training and deep learning frameworks is preferred.

What you'd actually do

  1. Conduct research and development in visual foundation generative models
  2. Develop foundation models to enhance the strategic advantages for ByteDance products

Skills

Required

  • Master's or PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline
  • Experience in research and practical applications in one or more areas of computer vision and machine learning
  • Strong coding skills in Python
  • Ability to work and collaborate well with team members

Nice to have

  • Hands-on coding experience in deep learning frameworks (e.g., PyTorch)
  • Large-scale training experience
  • Highly competent in algorithms and programming
  • Experience in solving real-world machine learning technical problems
  • Experience in large-scale image and video training is preferred, particularly when it involves extensive work with foundation models

What the JD emphasized

  • Publications in accredited venues such as CVPR, ECCV, ICCV, NeurIPS, ICLR, ICML, SIGGRAPH, or Multimedia

Other signals

  • foundation models
  • multimodal
  • visual generation
  • large-scale training