Member of Technical Staff - Image / Video Generation

Black Forest Labs · Multimodal · Freiburg · Research

Research role focused on training and fine-tuning large-scale diffusion models for image and video generation, involving rigorous experimentation, ablation studies, and an understanding of speed-quality tradeoffs in production settings.

What you'd actually do

  1. Train large-scale diffusion transformer models on image and video data, working at the scale where intuitions break and empirical evidence matters
  2. Rigorously ablate design choices by running experiments that isolate variables, control for confounds, and produce insights you can actually trust, then communicate those results to shape our research direction
  3. Reason about the speed-quality tradeoffs of neural network architectures in production settings where both constraints matter simultaneously
  4. Fine-tune diffusion models for specialized applications such as image and video upscalers, inpainting/outpainting models, and other tasks where general-purpose models aren't enough

Skills

Required

  • PyTorch
  • transformer architectures
  • modern deep learning ecosystem
  • distributed training techniques (FSDP, low precision training, model parallelism)

Nice to have

  • writing forward and backward Triton kernels
  • profiling, debugging, and optimizing single and multi-GPU operations
  • performance characteristics of different architectural choices at scale
  • published research on generative models

What the JD emphasized

  • large-scale diffusion models
  • rigorously ablates design choices
  • fine-tuning diffusion models for specialized applications
  • effectively evaluate image and video generative models
  • distributed training techniques

Other signals

  • training large-scale diffusion models
  • exploring new approaches
  • rigorously ablates design choices
  • fine-tunes diffusion models for specialized applications