Helix AI Intern, Speech [Winter/Summer 2026]

Figure AI · Robotics · AI - Helix Team

Figure AI is seeking an intern to work on the real-time speech pipeline for its humanoid robots, focusing on low-latency audio streaming, speech enhancement, and real-time speech understanding. The role involves supporting the development and testing of audio systems, integrating real-time communication frameworks, and assisting with AI-based speech components.

What you'd actually do

  1. Support the development and testing of real-time audio and speech streaming pipelines
  2. Contribute to the integration of low-latency, full-duplex audio systems using WebRTC or similar frameworks
  3. Assist in evaluating or deploying AI-based components that improve speech quality, intelligibility, or responsiveness
  4. Collaborate with AI, audio, and robotics engineers to enhance the reliability and performance of speech systems
  5. Help build tools for monitoring, debugging, and visualizing live audio and speech pipeline performance

Skills

Required

  • Python or C++
  • real-time communication frameworks (WebRTC, gRPC, or WebSockets)
  • digital audio fundamentals (sampling, latency, buffering, SNR, AEC)
  • machine learning concepts
  • deploying or using pre-trained models

Nice to have

  • audio ML frameworks (PyTorch, torchaudio, ONNX Runtime)
  • speech enhancement or ASR/TTS systems
  • asynchronous or multithreaded programming (asyncio, coroutines, or similar)
  • cloud or edge-based audio processing systems
  • humanoid robots
  • real-time human–robot communication

What the JD emphasized

  • real-time speech pipeline
  • low-latency
  • real-time

Other signals

  • real-time speech pipeline
  • low-latency audio streaming
  • speech enhancement
  • real-time speech understanding
  • humanoid robots