Head of Global Compute Supply & Platform Strategy

Luma AI · AI Frontier · SF Bay Area, CA · Research

This role leads global compute supply and platform strategy for a robotics foundation model company. It owns the end-to-end global compute footprint: capacity strategy, capital allocation, and systems architecture. The goal is a scaling roadmap that gives research and robotics teams the compute they need to ship frontier world models. The role leads the infrastructure, distributed systems, and datacenter operations teams, maximizes fleet utilization, manages large capital budgets, and serves as the primary interface with compute vendors.

What you'd actually do

  1. Architect Multi-Year Compute Strategy: Lead capacity planning, global vendor and cloud partnerships, on-prem vs. cloud mix, and accelerator supply chain roadmaps (H/B-series GPUs, custom silicon evaluation).
  2. Direct the Platform Org: Provide strategic leadership to our infrastructure, distributed systems, and datacenter operations teams—scaling the organization to support next-generation compute demands.
  3. Maximize Fleet Utilization: Oversee the architectural efficiency of our cluster configurations to deliver >50% Model FLOPs Utilization (MFU) on flagship training runs.
  4. Command a Megawatt Budget: Negotiate, secure, and operate our largest-scale capital deployments for compute infrastructure, partnering directly with Finance to optimize unit economics and risk management.
  5. Unify Global Capacity: Champion the platform strategy that enables world-model training, heavy simulation rollouts, and real-time on-robot inference to seamlessly share a single, elastic fleet.
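
For context on the >50% MFU target in item 3, MFU is commonly estimated as achieved model FLOPs per second divided by the fleet's theoretical peak. The sketch below is illustrative only: it assumes the widely used approximation of about 6 FLOPs per parameter per training token for dense transformers (attention FLOPs ignored), and every name and number in it is hypothetical rather than taken from the posting or from Luma's systems.

```python
def model_flops_utilization(
    params: float,
    tokens_per_second: float,
    num_accelerators: int,
    peak_flops_per_accelerator: float,
) -> float:
    """Estimate MFU as achieved model FLOPs/s over theoretical peak FLOPs/s.

    Assumption: a training step costs roughly 6 * params FLOPs per token
    (forward plus backward pass of a dense transformer, attention ignored).
    """
    if num_accelerators <= 0 or peak_flops_per_accelerator <= 0:
        raise ValueError("accelerator count and peak FLOPs must be positive")
    achieved_flops_per_second = 6.0 * params * tokens_per_second
    peak_flops_per_second = num_accelerators * peak_flops_per_accelerator
    return achieved_flops_per_second / peak_flops_per_second


def meets_target(mfu: float, target: float = 0.50) -> bool:
    """Check an MFU estimate against a target such as the >50% in item 3."""
    return mfu > target


# Hypothetical run: a 10B-parameter model training at 10M tokens/s on
# 1,000 accelerators, each with an assumed peak of 1e15 FLOPs/s.
example_mfu = model_flops_utilization(
    params=1e10,
    tokens_per_second=1e7,
    num_accelerators=1000,
    peak_flops_per_accelerator=1e15,
)
print(f"MFU = {example_mfu:.2f}, meets target: {meets_target(example_mfu)}")
```

The peak figure should be the vendor-quoted dense throughput for the numeric precision actually used in training; choosing a different precision or a sparsity-inflated peak changes the ratio substantially.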

Skills

Required

  • Compute Strategy
  • Capacity Planning
  • Vendor Partnerships
  • Cloud Partnerships
  • On-Prem vs. Cloud Mix
  • Accelerator Supply Chain
  • Custom Silicon Evaluation
  • Infrastructure Leadership
  • Distributed Systems
  • Datacenter Operations
  • Fleet Utilization Optimization
  • Cluster Configurations
  • Capital Deployment
  • Budget Management
  • Unit Economics
  • Risk Management
  • World-Model Training
  • Simulation Rollouts
  • On-Robot Inference
  • Elastic Fleet Management
  • High-Performance Cluster Topology
  • High-Speed Interconnects (InfiniBand/RoCE)
  • Large-Scale Data Systems
  • Distributed Training Architectures
  • 10k+ Accelerator Environments
  • High-Performance Production Settings

Nice to have

  • Scale Credentials (>100B-parameter or >100k-GPU-day scale)
  • Robotics/Autonomy Context
  • Edge-to-Cloud Inference
  • Real-Time Autonomous Systems

What the JD emphasized

  • 10+ years of engineering leadership experience in large-scale distributed systems, infrastructure, or technical supply chain, with a proven track record of leading compute platform strategy at a frontier AI lab, hyperscaler, or major autonomy program.
  • Deep technical & commercial fluency in high-performance cluster topology, high-speed interconnects (InfiniBand/RoCE), large-scale data systems, and the economics of distributed training architectures.
  • Direct operational oversight of 10k+ accelerator environments in high-performance production settings.

Other signals

  • owns Luma’s global compute footprint end-to-end
  • design our scaling roadmap from the silicon up
  • turning capital into capability
  • Architect Multi-Year Compute Strategy
  • Direct the Platform Org
  • Maximize Fleet Utilization
  • Command a Megawatt Budget
  • Unify Global Capacity
  • Act as Principal Executive Interface