Senior Research Engineer - Data

at Synthesia · Multimodal · EUROPE · Research and Development

Synthesia is seeking a Senior Research Engineer focused on data to manage the lifecycle of data for AI researchers. This role involves sourcing, processing, and delivering datasets that power their generative AI models, with a focus on video, image, and audio data. The position is at the intersection of applied research, data engineering, and ML infrastructure, emphasizing data quality and curation to improve model performance.

What you'd actually do

  1. collaborate closely with our model training teams
  2. extract new features and annotations that elevate our datasets
  3. enhance model performance through high-quality, accurate datasets
  4. influence the team’s longer-term strategy

Skills

Required

  • data-centric, applied Machine Learning
  • improving model performance through data quality, curation, labeling, and evaluation
  • Generative AI data layer experience (images, video, audio)
  • Python
  • clean, maintainable, and well-tested code
  • designing, building, and operating workflow orchestration systems
  • large-scale data processing pipelines

Nice to have

  • ML infrastructure

What the JD emphasized

  • hands-on experience improving model performance through data quality, curation, labeling, and evaluation rather than model architecture alone
  • Experience working on the data layer of Generative AI products, particularly involving images, video, or audio
  • Hands-on experience designing, building, and operating workflow orchestration systems and large-scale data processing pipelines

Other signals

  • data-centric ML
  • generative AI data
  • workflow orchestration
  • large-scale data processing

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London, with offices and teams across Europe and the US.

As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the center of successful organizations.

Following our recent Series E funding round, where we raised $200 million, our valuation stands at $4 billion. Our total funding exceeds $530 million from premier investors including Accel, NVentures (Nvidia's VC arm), Kleiner Perkins, GV, and Evantic Capital, alongside the founders and operators of Stripe, Datadog, Miro, and Webflow.

About the role

The Data team manages the complete lifecycle of data for researchers - from sourcing and large-scale processing to delivering datasets that power our models. Data sits at the heart of our Research efforts and enables all other teams. As part of the Data team, you’ll work with over a million hours of video and audio data.

This role exists at the intersection of applied research, data engineering, and ML infrastructure rather than being a traditional research position.

You’ll build the world’s best human-centric data lake by collaborating closely with our model training teams. By understanding their requirements, you’ll extract new features and annotations that elevate our datasets. You should be passionate about enhancing model performance through high-quality, accurate datasets. Our infrastructure and pipelines are in great shape, and this role provides room to not only enhance them but also influence the team’s longer-term strategy.

What we're looking for:

  • A strong background in data-centric, applied Machine Learning, with hands-on experience improving model performance through data quality, curation, labeling, and evaluation rather than model architecture alone
  • Experience working on the data layer of Generative AI products, particularly involving images, video, or audio
  • Excellent Python skills, with a strong focus on writing clean, maintainable, and well-tested code
  • Hands-on experience designing, building, and operating workflow orchestration systems and large-scale data processing pipelines

Why join us?

We’re living in the golden age of AI. The next decade will yield the next iconic companies, and we dare to say we have what it takes to become one. Here’s why:

Our culture

At Synthesia we’re passionate about building, not talking, planning or politicising. We strive to hire the smartest, kindest and most unrelenting people and let them do their best work without distractions. Our work principles serve as our charter for how we make decisions, give feedback and structure our work to empower everyone to go as fast as possible. You can find out more about these principles here.

Serving 50,000+ customers (and 50% of the Fortune 500)

We’re trusted by leading brands such as Heineken, Zoom, Xerox, McDonald’s and more. Read stories from happy customers and what 1,200+ people say on G2.

Proprietary AI technology

Since 2017, we’ve been pioneering advancements in Generative AI. Our AI technology is built in-house by a team of world-class AI researchers and engineers. Learn more about our AI Research Lab and the team behind it.

AI Safety, Ethics and Security

AI safety, ethics, and security are fundamental to our mission. While the full scope of Artificial Intelligence's impact on our society is still unfolding, our position is clear: People first. Always. Learn more about our commitments to AI Ethics, Safety & Security.

The good stuff...

  • Competitive compensation (salary + stock options + bonus)
  • Hybrid work setting with an office in London, Amsterdam, Zurich, Munich, or remote in Europe.
  • 25 days of annual leave + public holidays
  • Great company culture with the option to join regular planning and socials at our hubs
  • Other benefits depending on your location
