About Black Forest Labs

We’re the team behind Latent Diffusion, Stable Diffusion, and FLUX: foundational technologies that changed how the world creates images and video. We’re creating generative models used by millions of creators, developers, and businesses worldwide. Our FLUX models are among the most advanced in the world, and we’re just getting started.

Headquartered in Freiburg, Germany, with a growing presence in San Francisco, we’re scaling fast while staying true to what makes us different: research excellence, open science, and building technology that expands human creativity.

Why This Role

You'll train large-scale diffusion models for image and video generation, exploring new approaches while maintaining the rigor that helps us distinguish meaningful progress from incremental tweaks. This isn't about following established recipes—it's about running the experiments that clarify which architectural choices matter and which are less impactful.

What You’ll Work On

Train large-scale diffusion transformer models for image and video data, working at the scale where intuitions break and empirical evidence matters
Rigorously ablate design choices by running experiments that isolate variables, control for confounds, and produce insights you can actually trust, then communicate those results to shape our research direction
Reason about the speed-quality tradeoffs of neural network architectures in production settings where both constraints matter simultaneously
Fine-tune diffusion models for specialized applications like image and video upscalers, inpainting/outpainting models, and other tasks where general-purpose models aren't enough

What We’re Looking For

You've trained large-scale diffusion models and developed strong intuitions about what matters. You know that at research scale, every design choice has tradeoffs, and the only way to know which ones are worth making is through careful ablation. You're comfortable debugging distributed training issues and presenting research findings to the team.

You likely have:

Hands-on experience training large-scale diffusion models for image and video data, with practical knowledge of common failure modes and what matters most in training
Experience fine-tuning diffusion models for specialized applications—upscalers, inpainting, outpainting, or other tasks where understanding the domain matters as much as understanding the architecture
Deep understanding of how to effectively evaluate image and video generative models—knowing which metrics correlate with quality and which are just convenient proxies
Strong proficiency in PyTorch, transformer architectures, and the full ecosystem of modern deep learning
Solid understanding of distributed training techniques—FSDP, low precision training, model parallelism—because our models don't fit on one GPU and training decisions impact research outcomes

We'd be especially excited if you:

Have experience writing forward and backward Triton kernels and verifying their correctness in the presence of floating-point error
Bring proficiency with profiling, debugging, and optimizing single and multi-GPU operations using tools like Nsight or stack trace viewers
Know the performance characteristics of different architectural choices at scale
Have published research that contributed to how people think about generative models

How We Work Together

We’re a distributed team with real offices that people actually use. Depending on your role, you’ll either join us in Freiburg or SF at least 2 days a week (or one full week every other week), or work remotely with a monthly in-person week to stay connected. We’ll cover reasonable travel costs to make this possible. We think in-person time matters, and we’ve structured things to make it accessible to all. We’ll discuss what this will look like for the role during our interview process.

Everything we do is grounded in four values:

Obsessed. We are a frontier research lab. The science has to be right, the understanding deep, the product beautiful.
Low Ego. The work speaks. The best idea wins, no matter who said it. Credit is shared. Nobody is above any task.
Bold. We take the ambitious bet. We ship; we do not wait for conditions to be perfect.
Kind. People over politics. We treat each other with genuine warmth. Agency without empathy creates chaos.

If this sounds like work you’d enjoy, we’d love to hear from you.

Member of Technical Staff - Image / Video Generation
