Senior Research Engineer - Video Foundation Models (Pre-Training)

Synthesia · Multimodal · EUROPE · Research and Development

Research engineering role focused on pre-training foundation models for AI video generation, spanning large-scale generative modeling, distributed systems, and production engineering. The work involves advancing training recipes, scaling distributed training, improving evaluation frameworks, and optimizing inference for production deployment.

What you'd actually do

  1. Developing and scaling latent video diffusion models tailored for human-centric video generation
  2. Designing conditioning mechanisms to improve control (pose, emotion, script, camera) without sacrificing fidelity
  3. Advancing distributed training strategies (DDP, FSDP, DeepSpeed, sequence parallelism) under real compute constraints
  4. Improving training stability at multi-node scale
  5. Designing rigorous evaluation frameworks combining automated metrics and structured human evaluation

Skills

Required

  • Python
  • PyTorch
  • diffusion models
  • large-scale multi-GPU / multi-node training
  • distributed training (DDP, FSDP, DeepSpeed or similar)
  • controlled experiments

Nice to have

  • video diffusion models
  • avatar or human-centric generation
  • world / interactive models
  • GANs or VAEs
  • optimizing inference systems for production

What the JD emphasized

  • training deep learning models at scale
  • diffusion models
  • large-scale multi-GPU / multi-node training
  • distributed training
  • low latency, high resolution, and cost efficiency

Other signals

  • foundation models
  • video generation
  • large-scale training
  • distributed systems
  • production engineering