Helix AI Engineer, Reinforcement Learning

at Figure AI · Robotics · HQ · AI - Helix Team

Figure AI is seeking a Helix AI Engineer specializing in Reinforcement Learning to develop learning systems for humanoid robots. This role involves designing, implementing, and training RL algorithms for embodied agents in both simulated and real-world environments, focusing on improving policy performance, robustness, and long-horizon decision-making. The engineer will work on reward modeling, credit assignment, exploration, and integrating RL into the full autonomy stack, while also building scalable training systems and evaluation frameworks.

What you'd actually do

  1. Design and implement reinforcement learning algorithms for embodied agents operating in real-world and simulated environments
  2. Train policies that learn from interaction, feedback, and large-scale experience across diverse tasks
  3. Develop reward modeling, credit assignment, and exploration strategies for complex, long-horizon behaviors
  4. Improve policy robustness to real-world challenges such as noise, partial observability, and environment variability
  5. Build scalable training systems for RL, including distributed rollouts, simulation infrastructure, and experiment management

Skills

Required

  • Experience developing and applying reinforcement learning algorithms in complex environments
  • Strong understanding of RL fundamentals (e.g., policy optimization, value methods, model-based RL)
  • Experience training policies in simulation and/or real-world systems
  • Proficiency in Python and deep learning frameworks such as PyTorch
  • Experience with large-scale experimentation and distributed training systems
  • Strong experimental rigor and ability to diagnose and improve learning systems
  • Solid software engineering skills and ability to build scalable, reliable systems
  • Ability to operate independently and drive ambiguous, high-impact technical problems

Nice to have

  • Experience applying RL to robotics, control systems, or embodied AI
  • Experience with large-scale RL infrastructure (distributed rollouts, simulation at scale)
  • Background in offline RL, imitation learning, or hybrid learning approaches
  • Experience with reward modeling or human-in-the-loop learning
  • Experience at leading AI labs such as OpenAI, Google DeepMind, Anthropic, or xAI
  • Familiarity with robotics systems, simulation environments, or real-world deployment constraints
  • Publication record in reinforcement learning, machine learning, or robotics

What the JD emphasized

  • reinforcement learning algorithms
  • embodied agents
  • real-world
  • simulated environments
  • policy performance
  • robustness
  • long-horizon decision-making
  • reward modeling
  • credit assignment
  • exploration strategies
  • complex, long-horizon behaviors
  • policy robustness
  • noise
  • partial observability
  • environment variability
  • online and offline RL
  • large-scale logged robot data
  • pretraining
  • video
  • generative
  • agent
  • robot learning teams
  • full autonomy stack
  • scalable training systems
  • distributed rollouts
  • simulation infrastructure
  • experiment management
  • evaluation frameworks
  • stability
  • generalization
  • complex environments
  • RL fundamentals
  • policy optimization
  • value methods
  • model-based RL
  • training policies
  • simulation
  • real-world systems
  • Python
  • deep learning frameworks
  • PyTorch
  • large-scale experimentation
  • distributed training systems
  • experimental rigor
  • diagnose and improve learning systems
  • software engineering skills
  • scalable, reliable systems
  • operate independently
  • drive ambiguous, high-impact technical problems
  • RL to robotics
  • control systems
  • embodied AI
  • large-scale RL infrastructure
  • simulation at scale
  • offline RL
  • imitation learning
  • hybrid learning approaches
  • human-in-the-loop learning
  • leading AI labs
  • OpenAI
  • Google DeepMind
  • Anthropic
  • xAI
  • robotics systems
  • simulation environments
  • real-world deployment constraints
  • publication record
  • reinforcement learning
  • machine learning
  • robotics

Other signals

  • Reinforcement Learning
  • Embodied AI
  • Robotics
  • Policy Optimization
  • Simulation
  • Real-world deployment

Figure is an AI robotics company developing autonomous general-purpose humanoid robots. Our goal is to build embodied AI systems that can perceive, reason, and act in the real world. Figure is headquartered in San Jose, CA, and this role requires 5 days/week in-office collaboration.

Our Helix team is responsible for developing the core AI systems that power humanoid autonomy. We are looking for a Helix AI Engineer, Reinforcement Learning, to develop learning systems that enable robots to acquire skills through interaction, feedback, and experience.

This role focuses on applying and advancing reinforcement learning across simulation and real-world environments—improving policy performance, robustness, and long-horizon decision-making in embodied systems.

Responsibilities

  • Design and implement reinforcement learning algorithms for embodied agents operating in real-world and simulated environments
  • Train policies that learn from interaction, feedback, and large-scale experience across diverse tasks
  • Develop reward modeling, credit assignment, and exploration strategies for complex, long-horizon behaviors
  • Improve policy robustness to real-world challenges such as noise, partial observability, and environment variability
  • Work across online and offline RL settings, including learning from large-scale logged robot data
  • Collaborate closely with pretraining, video, generative, agent, and robot learning teams to integrate RL into the full autonomy stack
  • Build scalable training systems for RL, including distributed rollouts, simulation infrastructure, and experiment management
  • Design evaluation frameworks to measure policy performance, stability, and generalization

Requirements

  • Experience developing and applying reinforcement learning algorithms in complex environments
  • Strong understanding of RL fundamentals (e.g., policy optimization, value methods, model-based RL)
  • Experience training policies in simulation and/or real-world systems
  • Proficiency in Python and deep learning frameworks such as PyTorch
  • Experience with large-scale experimentation and distributed training systems
  • Strong experimental rigor and ability to diagnose and improve learning systems
  • Solid software engineering skills and ability to build scalable, reliable systems
  • Ability to operate independently and drive ambiguous, high-impact technical problems

Bonus Qualifications

  • Experience applying RL to robotics, control systems, or embodied AI
  • Experience with large-scale RL infrastructure (distributed rollouts, simulation at scale)
  • Background in offline RL, imitation learning, or hybrid learning approaches
  • Experience with reward modeling or human-in-the-loop learning
  • Experience at leading AI labs such as OpenAI, Google DeepMind, Anthropic, or xAI
  • Familiarity with robotics systems, simulation environments, or real-world deployment constraints
  • Publication record in reinforcement learning, machine learning, or robotics

The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.