Applied Scientist, Safe RL, Robotics, Saf Lab

Amazon · Big Tech · Pasadena, CA · Applied Science

This role focuses on developing and deploying safe reinforcement learning (RL) policies for dynamic legged locomotion on physical robots. It involves creating RL architectures that interface with physics models, internalize safety constraints during training, and transfer policies from simulation to real-world hardware. The work sits at the intersection of safety-critical control and learning, aiming to enable robots to operate safely around humans.

What you'd actually do

  1. Design, train, and deploy reinforcement learning (RL) policies for dynamic legged locomotion, including walking, running, stair climbing, and fall recovery, on physical robots
  2. Develop sim-to-real transfer pipelines that produce policies robust to the reality gap, including domain randomization, system identification, and adaptive strategies
  3. Integrate control-based methods with RL at three points: as inputs to the policy (dynamic retargeting and control-guided rewards), during training (internalizing safety constraints), and downstream, where the policy output feeds into safety layers and whole-body control
  4. Develop and maintain large-scale training infrastructure for locomotion policy learning, including physics simulation environments, domain randomization, and GPU parallelization
  5. Evaluate policy performance rigorously through simulation benchmarks, hardware experiments, and failure-mode analysis

Skills

Required

  • PhD in Computer Science, Robotics, Mechanical Engineering, Electrical Engineering, or a related field with a focus on reinforcement learning, robot learning, or control
  • Experience applying RL to physical robotic systems (beyond simulation-only work), including demonstrated expertise in sim-to-real transfer on dynamically stable robots
  • Strong understanding of legged robot dynamics, contact mechanics, and whole-body control fundamentals
  • Proficiency in Python and deep learning frameworks (e.g., PyTorch, JAX) with experience building custom RL training pipelines
  • Experience with physics simulators for robotics (e.g., Isaac Gym/Sim, MuJoCo, PyBullet)
  • Track record of patents or publications at top-tier peer-reviewed conferences or journals

Nice to have

  • Experience in professional software development
  • Knowledge of safety-critical control, including control barrier functions and safety filters
  • Familiarity with safety-constrained RL (e.g., constrained MDPs, Lagrangian methods, shielding, CBF-based policy filtering)
  • Experience with model-based control (MPC, whole-body QP controllers, operational space control) and how to interface these methods with RL
  • Knowledge of stability theory (Lyapunov methods, orbital stability) as it applies to periodic gaits
  • Experience with hierarchical RL, skill composition, distillation, and multi-task policy architectures for locomotion
  • Familiarity with real-time deployment constraints (latency budgets, onboard compute limitations, control-loop frequencies)
  • Experience building or contributing to large-scale RL training infrastructure (distributed training, GPU clusters)
  • Strong communication skills and ability to work across disciplinary boundaries (ML, controls, mechanical engineering)

What the JD emphasized

  • physical hardware
  • safety constraints
  • physical robots
  • safety-critical control
  • physical robotic systems
  • legged robot dynamics
  • safety-constrained RL
  • real-time deployment constraints

Other signals

  • develop RL architectures
  • sim-to-real transfer
  • physical robots
  • safety constraints