Staff Software Engineer, Voices

Synthesia · Multimodal · EUROPE · Engineering

A Staff Software Engineer role focused on the core systems of Synthesia's AI video platform, specifically script preview and voice generation. The role involves building and operating backend services that orchestrate multiple TTS providers and in-house models while keeping output reliable, consistent, and high quality. Responsibilities include designing persistence systems, improving voice data handling, and working on voice discovery and recommendation features, including evaluation and observability systems. The role requires strong backend engineering skills, a product mindset, and collaboration with R&D and frontend teams.

What you'd actually do

  1. You will work on the core systems powering Synthesia’s script preview and voice generation experience, ensuring users can reliably generate high-quality voiceovers across a wide range of languages, providers, and use cases.
  2. You will build and operate backend services that orchestrate multiple text-to-speech (TTS) providers, alongside Synthesia’s in-house models, delivering a seamless and consistent experience to end users despite underlying system complexity.
  3. You will be responsible for designing and evolving systems that handle provider reliability, request routing, and output consistency, ensuring users can generate and regenerate voice content with predictable, high-quality results.
  4. You will contribute to user-facing product problems from a backend perspective, working closely with frontend engineers to ensure APIs and workflows integrate cleanly into the product experience.
  5. You will own projects that span multiple systems and domains, such as building robustness layers (retries, throttling, failover) to handle unreliable third-party providers; designing persistence systems to ensure consistent voice outputs across generations; and improving how voice data is stored, retrieved, and reused.

Skills

Required

  • several years of experience building and operating backend systems in production
  • strong backend engineering skills (Python/FastAPI)
  • designing reliable, observable services
  • working with third-party APIs or distributed systems
  • product mindset
  • solving user-facing problems
  • working close to the client
  • understanding how APIs are consumed
  • understanding how backend decisions impact the end-user experience
  • willingness to step outside your comfort zone
  • jumping into frontend code when needed
  • working in an iterative, experiment-driven environment
  • shipping quickly and improving based on data and feedback

Nice to have

  • experience with observability tools (e.g. Datadog)
  • experience with workflow systems (e.g. Temporal)
  • experience with evaluation/recommendation systems
  • experience touching the frontend in any language and/or framework

What the JD emphasized

  • core systems
  • script preview and voice generation experience
  • orchestrate multiple text-to-speech (TTS) providers
  • in-house models
  • provider reliability
  • output consistency
  • voice discovery and recommendations
  • evaluation systems
  • observability
  • backend systems in production
  • reliable, observable services
  • user-facing problems
  • end-user experience
  • iterative, experiment-driven environment

Other signals

  • AI video platform
  • text-to-speech (TTS) providers
  • in-house models
  • voice discovery and recommendations
  • evaluation systems