Research Scientist Intern, Photorealistic Telepresence (PhD)

Meta · Big Tech · Sausalito, CA +2

Research Scientist Intern at Meta focused on photorealistic telepresence and autonomous social agents in AR/VR. The role involves generative AI for image/video synthesis, digital human motion, social signal encoding, face/body reconstruction, and multimodal LLMs (speech-to-speech, audio-visual). Requires a PhD in a related field, ML experience, experience with deep learning frameworks and Python, and a track record of publications/patents.

What you'd actually do

  1. Solve research problems in enabling photorealistic telepresence and autonomous social agents.
  2. Collaborate with and support other researchers across various disciplines.
  3. Communicate the research agenda, progress, and results.

Skills

Required

  • PhD in Computer Science, Computer Vision, Computer Graphics, Robotics, Machine Learning, or related field
  • Experience with solving “inverse problems” in imaging emphasizing modeling and algorithm development
  • 2+ years of experience with Machine Learning for solving computer vision and computer graphics problems
  • Experience with deep learning frameworks such as PyTorch and TensorBoard
  • Experience with scientific programming languages such as Python
  • Proven track record of achieving significant results as demonstrated by patents and first-authored publications at leading workshops or conferences such as ICCV, CVPR, NeurIPS, SIGGRAPH, ICASSP, or similar
  • Intent to return to a degree program after the completion of the internship
  • Experience working and communicating cross-functionally in a team environment
  • Demonstrated software engineering experience via an internship, work experience, coding competitions, or widely used contributions in open source repositories (e.g. GitHub)
  • Experience with systems building in Python or C++
  • Experience with large-scale generative models such as LLMs and video diffusion models
  • Experience with Machine Learning for 3D data (such as meshes, point clouds, Gaussian splatting, and voxels)
  • Experience with Machine Learning for audio and visual synthesis

Nice to have

  • Work authorization in the country of employment

What the JD emphasized

  • PhD in Computer Science, Computer Vision, Computer Graphics, Robotics, Machine Learning, or related field
  • Experience with solving “inverse problems” in imaging emphasizing modeling and algorithm development
  • 2+ years of experience with Machine Learning for solving computer vision and computer graphics problems
  • Experience with deep learning frameworks such as PyTorch and TensorBoard
  • Experience with scientific programming languages such as Python
  • Proven track record of achieving significant results as demonstrated by patents and first-authored publications at leading conferences

Other signals

  • Generative AI models for image and video synthesis
  • Motion and behavior synthesis for digital humans
  • VR/AR encoding of social signals
  • Face and body reconstruction and tracking
  • Multimodal LLMs, such as speech-to-speech LLMs and audio-visual LLMs