Helix AI Engineer, Training Infrastructure

Figure AI · Robotics · HQ · AI - Helix Team

Figure AI is seeking an experienced Training Infrastructure Engineer to manage and improve its training clusters, implement distributed training algorithms, and build tools for AI researchers working on humanoid robots. The role involves architecting scalable deep learning frameworks for massive robot datasets and shortening model development cycles.

What you'd actually do

  1. Design, deploy, and maintain Figure's training clusters
  2. Architect and maintain scalable deep learning frameworks for training on massive robot datasets
  3. Work with AI researchers to train new model architectures at large scale
  4. Implement distributed training and parallelization strategies to shorten model development cycles
  5. Implement tooling for data processing, model experimentation, and continuous integration

Skills

Required

  • Strong software engineering fundamentals
  • Python
  • PyTorch
  • Managing HPC clusters for deep neural network training
  • Building reliable backend systems

Nice to have

  • Managing cloud infrastructure (AWS, Azure, GCP)
  • Job scheduling / orchestration tools (SLURM, Kubernetes, LSF, etc.)
  • Configuration management tools (Ansible, Terraform, Puppet, Chef, etc.)

What the JD emphasized

  • massive robot datasets
  • AI researchers
  • model architectures
  • training infrastructure

Other signals

  • training infrastructure
  • distributed training
  • massive robot datasets
  • AI researchers
  • model architectures
  • data processing
  • model experimentation