Product Manager, Voice

Sierra · AI Frontier · San Francisco, CA · Product

Product Manager for Voice at Sierra, focused on real-time, human-quality AI conversations. The role involves defining the voice interaction model, building reliable real-time systems, owning the voice stack experience (ASR, TTS, LLMs, telephony), making voice measurable, and translating real-world usage into product direction. It requires experience with real-time systems, voice, or AI products, and experience shipping voice or real-time products.

What you'd actually do

  1. Define the voice interaction model - Shape how agents handle real-time conversations: turn-taking, interruptions, latency, tone, and recovery from errors. Define what “human-quality” voice interaction actually means in practice.
  2. Build reliable real-time systems - Work closely with engineering on streaming architectures, latency budgets, and failure handling. Voice is unforgiving—ensure agents respond quickly and consistently in production environments.
  3. Own the voice stack experience - Partner across ASR, TTS, LLMs, and telephony integrations to deliver a cohesive product. Help decide model choices, orchestration strategies, and how different components work together.
  4. Make voice measurable and improvable - Define how we evaluate voice agents: latency, interruption handling, resolution rate, and conversation quality. Build feedback loops that improve performance over time.
  5. Translate real-world usage into product direction - Work closely with customers deploying voice agents in production. Understand edge cases (noisy environments, accents, call flows) and turn them into product improvements.

Skills

Required

  • 3+ years of product management experience
  • Meaningful exposure to real-time systems, voice, or AI products
  • Experience shipping voice or real-time products
  • Strong technical depth
  • Ability to engage deeply with engineers on system design (e.g., speech pipelines, streaming infra, telephony systems, reliability tradeoffs)
  • Experience working with AI systems
  • Familiarity with LLMs, speech-to-text, or text-to-speech systems and their limitations in production environments
  • Track record of 0→1 product development
  • Comfortable operating in ambiguous spaces and iterating quickly to reach product-market fit

What the JD emphasized

  • real-time
  • low latency
  • high reliability
  • natural turn-taking
  • messy, real-world interactions
  • zero-to-one
  • shipping voice or real-time products
  • streaming systems
  • synchronous interactions

Other signals

  • Define the voice interaction model
  • Build reliable real-time systems
  • Own the voice stack experience
  • Make voice measurable and improvable
  • Translate real-world usage into product direction