Meta Reality Labs Research (RL Research) brings together a team of researchers and engineers with the shared goal of developing AI, Robotics, and AR/VR technology across the spectrum. The Surreal Spatial AI group is seeking Research Scientists to build machine perception, 3D generative systems, and control technology allowing robotic agents and AI systems to perceive, understand, reason about, and interact with the world around them. In this role, you will develop algorithms for predictive world modeling, active perception, and robotic interaction. You will investigate architectures that position multi-modal Generative World Models at the core of machine perception and data-driven robotic control for real-world autonomous applications. Leveraging modalities available from egocentric devices and physical robotic platforms, your work will span the full stack, translating foundational world models into action in the physical world.
Responsibilities
- Drive fundamental and applied research at the intersection of multi-modal generative AI, predictive world modeling, embodied reasoning, and robotic manipulation
- Investigate or invent architectures that deliver a spectrum of embodied behaviors, from simulated environments to real robots, and from tactile-driven motor control to high-level, long-horizon intelligence
- Design research methodologies and lead empirical evaluations, authoring well-tested code for physical hardware and simulators
- Build prototype systems that drive multi-step, long-horizon robotic perception, reasoning, and action
- Contribute to and lead high-impact publications and open-sourcing efforts
- Identify long-term research goals while executing intermediate milestones
- Collaborate with a wide-ranging set of scientists and engineers across teams
Qualifications
- Currently has, or is in the process of obtaining, a PhD in Computer Vision, Robotics, AI, Computer Science, a related field, or equivalent practical experience. Degree must be completed prior to joining Meta
- Research experience involving 3D Computer Vision, Deep Learning, or Robotics, specifically related to multimodal or 3D generative modeling, predictive world models, scene understanding, or learning-based robotic control
- Experience with real-world system building and data collection, including design, coding, and evaluation with modern ML methods
- Experience communicating research to public and peer audiences
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and Python
- Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
- Direct experience working with real-world robots and simulators
- Experience with 3D computer vision algorithms and with training/evaluating 3D generative models, world models, or foundational AI models for embodied tasks
- Experience with Vision-Language-Action models (VLAs), Reinforcement Learning (RL), long-horizon planning, kinematics, or Sim2Real transfer
- Demonstrated research and software engineering experience via an internship, work experience, coding competitions, or widely used contributions to open source repositories (e.g., GitHub)
- Track record of results as demonstrated by grants, fellowships, and patents, as well as publications at peer-reviewed workshops, journals, or conferences such as CVPR, CoRL, ICRA, RSS, NeurIPS, ECCV, ICCV, IROS, or similar