Helix AI Engineer, Reinforcement Learning

Figure AI · Robotics · HQ · AI - Helix Team

Figure AI is seeking a Helix AI Engineer specializing in Reinforcement Learning to develop learning systems for humanoid robots. This role involves designing, implementing, and training RL algorithms for embodied agents in both simulated and real-world environments, focusing on improving policy performance, robustness, and long-horizon decision-making. The engineer will work on reward modeling, credit assignment, exploration, and integrating RL into the full autonomy stack, while also building scalable training systems and evaluation frameworks.

What you'd actually do

Design and implement reinforcement learning algorithms for embodied agents operating in real-world and simulated environments
Train policies that learn from interaction, feedback, and large-scale experience across diverse tasks
Develop reward modeling, credit assignment, and exploration strategies for complex, long-horizon behaviors
Improve policy robustness to real-world challenges such as noise, partial observability, and environment variability
Build scalable training systems for RL, including distributed rollouts, simulation infrastructure, and experiment management

Skills

Required

Experience developing and applying reinforcement learning algorithms in complex environments
Strong understanding of RL fundamentals (e.g., policy optimization, value methods, model-based RL)
Experience training policies in simulation and/or real-world systems
Proficiency in Python and deep learning frameworks such as PyTorch
Experience with large-scale experimentation and distributed training systems
Strong experimental rigor and ability to diagnose and improve learning systems
Solid software engineering skills and ability to build scalable, reliable systems
Ability to operate independently and drive ambiguous, high-impact technical problems to resolution

Nice to have

Experience applying RL to robotics, control systems, or embodied AI
Experience with large-scale RL infrastructure (distributed rollouts, simulation at scale)
Background in offline RL, imitation learning, or hybrid learning approaches
Experience with reward modeling or human-in-the-loop learning
Experience at leading AI labs such as OpenAI, Google DeepMind, Anthropic, or xAI
Familiarity with robotics systems, simulation environments, or real-world deployment constraints
Publication record in reinforcement learning, machine learning, or robotics

What the JD emphasized

reinforcement learning algorithms
embodied agents
real-world
simulated environments
policy performance
robustness
long-horizon decision-making
reward modeling
credit assignment
exploration strategies
complex, long-horizon behaviors
policy robustness
noise
partial observability
environment variability
online and offline RL
large-scale logged robot data
pretraining
video
generative
agent
robot learning teams
full autonomy stack
scalable training systems
distributed rollouts
simulation infrastructure
experiment management
evaluation frameworks
policy performance
stability
generalization
reinforcement learning algorithms
complex environments
RL fundamentals
policy optimization
value methods
model-based RL
training policies
simulation
real-world systems
Python
deep learning frameworks
PyTorch
large-scale experimentation
distributed training systems
experimental rigor
diagnose and improve learning systems
software engineering skills
scalable, reliable systems
operate independently
drive ambiguous, high-impact technical problems
RL to robotics
control systems
embodied AI
large-scale RL infrastructure
distributed rollouts
simulation at scale
offline RL
imitation learning
hybrid learning approaches
reward modeling
human-in-the-loop learning
leading AI labs
OpenAI
Google DeepMind
Anthropic
xAI
robotics systems
simulation environments
real-world deployment constraints
publication record
reinforcement learning
machine learning
robotics

Other signals

Reinforcement Learning
Embodied AI
Robotics
Policy Optimization
Simulation
Real-world deployment

Figure is an AI robotics company developing autonomous general-purpose humanoid robots. Our goal is to build embodied AI systems that can perceive, reason, and act in the real world. Figure is headquartered in San Jose, CA, and this role requires 5 days/week in-office collaboration.

Our Helix team is responsible for developing the core AI systems that power humanoid autonomy. We are looking for a Helix AI Engineer, Reinforcement Learning to develop learning systems that enable robots to acquire skills through interaction, feedback, and experience.

This role focuses on applying and advancing reinforcement learning across simulation and real-world environments—improving policy performance, robustness, and long-horizon decision-making in embodied systems.

Responsibilities

Design and implement reinforcement learning algorithms for embodied agents operating in real-world and simulated environments
Train policies that learn from interaction, feedback, and large-scale experience across diverse tasks
Develop reward modeling, credit assignment, and exploration strategies for complex, long-horizon behaviors
Improve policy robustness to real-world challenges such as noise, partial observability, and environment variability
Work across online and offline RL settings, including learning from large-scale logged robot data
Collaborate closely with pretraining, video, generative, agent, and robot learning teams to integrate RL into the full autonomy stack
Build scalable training systems for RL, including distributed rollouts, simulation infrastructure, and experiment management
Design evaluation frameworks to measure policy performance, stability, and generalization

Requirements

Experience developing and applying reinforcement learning algorithms in complex environments
Strong understanding of RL fundamentals (e.g., policy optimization, value methods, model-based RL)
Experience training policies in simulation and/or real-world systems
Proficiency in Python and deep learning frameworks such as PyTorch
Experience with large-scale experimentation and distributed training systems
Strong experimental rigor and ability to diagnose and improve learning systems
Solid software engineering skills and ability to build scalable, reliable systems
Ability to operate independently and drive ambiguous, high-impact technical problems to resolution

Bonus Qualifications

Experience applying RL to robotics, control systems, or embodied AI
Experience with large-scale RL infrastructure (distributed rollouts, simulation at scale)
Background in offline RL, imitation learning, or hybrid learning approaches
Experience with reward modeling or human-in-the-loop learning
Experience at leading AI labs such as OpenAI, Google DeepMind, Anthropic, or xAI
Familiarity with robotics systems, simulation environments, or real-world deployment constraints
Publication record in reinforcement learning, machine learning, or robotics

The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
