Meta Reality Labs Research (RL Research) brings together a team of researchers and engineers with the shared goal of developing AI, Robotics, and AR/VR technology across the spectrum. The Surreal Spatial AI group is seeking Research Scientists to build machine perception, 3D generative systems, and control technology allowing robotic agents and AI systems to perceive, understand, reason about, and interact with the world around them. In this role, you will develop algorithms for predictive world modeling, active perception, and robotic interaction. You will investigate architectures that position multi-modal Generative World Models at the core of machine perception and data-driven robotic control for real-world autonomous applications. Leveraging modalities available from egocentric devices and physical robotic platforms, your work will span the full stack, translating foundational world models into action in the physical world.
Responsibilities
- Drive fundamental and applied research at the intersection of multi-modal generative AI, predictive world modeling, embodied reasoning, and robotic manipulation
- Investigate or invent architectures that deliver a spectrum of embodied behaviors, from simulated environments to real robots, and from tactile-driven motor control to high-level, long-horizon intelligence
- Design research methodologies and lead empirical evaluations, authoring well-tested code for physical hardware and simulators
- Build prototype systems that drive multi-step, long-horizon robotic perception, reasoning, and action
- Contribute to and lead high-impact publications and open-sourcing efforts
- Identify long-term research goals while executing intermediate milestones
- Collaborate with a wide-ranging set of scientists and engineers across teams
Qualifications
- Currently has, or is in the process of obtaining, a PhD in Computer Vision, Robotics, AI, Computer Science, a related field, or equivalent practical experience. Degree must be completed prior to joining Meta
- Research experience involving 3D Computer Vision, Deep Learning, or Robotics, specifically related to multimodal or 3D generative modeling, predictive world models, scene understanding, or learning-based robotic control
- Experience with real-world system building and data collection, including design, coding, and evaluation with modern ML methods
- Experience communicating research to public and peer audiences
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and Python
- Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
- Direct experience working with real-world robots and simulators
- Experience with 3D computer vision algorithms and with training/evaluating 3D generative models, world models, or foundational AI models for embodied tasks
- Experience with Vision-Language-Action models (VLAs), Reinforcement Learning (RL), long-horizon planning, kinematics, or Sim2Real transfer
- Demonstrated research and software engineering experience via an internship, work experience, coding competitions, or widely used contributions to open source repositories (e.g., GitHub)
- Track record of results as demonstrated by grants, fellowships, and patents, as well as publications at peer-reviewed workshops, journals, or conferences such as CVPR, CoRL, ICRA, RSS, NeurIPS, ECCV, ICCV, IROS, or similar