Senior Research Engineer - Data

Synthesia · Multimodal · EUROPE · Research and Development

Synthesia is seeking a Senior Research Engineer focused on data to manage the data lifecycle for its AI researchers. The role involves sourcing, processing, and delivering the datasets that power the company's generative AI models, with a focus on video, image, and audio data. The position sits at the intersection of applied research, data engineering, and ML infrastructure, and emphasizes data quality and curation as the means of improving model performance.

What you'd actually do

  1. Collaborate closely with the model training teams
  2. Extract new features and annotations that elevate the datasets
  3. Enhance model performance through high-quality, accurate datasets
  4. Influence the team’s longer-term strategy

Skills

Required

  • data-centric, applied Machine Learning
  • improving model performance through data quality, curation, labeling, and evaluation
  • Generative AI data layer experience (images, video, audio)
  • Python
  • clean, maintainable, and well-tested code
  • designing, building, and operating workflow orchestration systems
  • large-scale data processing pipelines

Nice to have

  • ML infrastructure

What the JD emphasized

  • Hands-on experience improving model performance through data quality, curation, labeling, and evaluation rather than model architecture alone
  • Experience working on the data layer of Generative AI products, particularly involving images, video, or audio
  • Hands-on experience designing, building, and operating workflow orchestration systems and large-scale data processing pipelines

Other signals

  • data-centric ML
  • generative AI data
  • workflow orchestration
  • large-scale data processing