About Black Forest Labs

We're the team behind Latent Diffusion, Stable Diffusion, and FLUX — foundational technologies that changed how the world creates images and video. Our models power the tools used by millions of creators, developers, and businesses worldwide, and FLUX is among the most advanced generative systems in the world.

Headquartered in Freiburg, Germany, with a growing presence in San Francisco, we're scaling fast while staying true to what makes us different: research excellence, open science, and building technology that expands human creativity.

Why This Role

Post-training is where a foundation model becomes a product. In this role, you'll own the post-training pipeline for our multimodal models end to end — from data strategy and reward modeling to preference optimization, distillation, and safety tuning — across image, editing, and video. You'll drive measurable gains in model quality, build the infrastructure that lets the whole research team iterate fast, and push the state of the art in what it means to align a generative model to human intent.

This is a Staff / Senior IC role. We're looking for someone who has shipped post-training for a frontier model before and wants to do it again.

What You'll Work On

Own the full post-training pipeline end to end — from data curation and reward modeling through fine-tuning, preference optimization, distillation, safety tuning, evaluation, and deployment
Advance techniques across the post-training stack: SFT, RLHF, RLAIF, DPO, preference learning, and reward modeling to align models with human intent and aesthetic judgment
Work across modalities: text-to-image, image editing, multi-reference, and video post-training
Build personalization and customization capabilities that let users adapt our models to their own creative style
Design and maintain high-throughput fine-tuning and evaluation infrastructure to support rapid iteration across the research team
Identify quality and alignment gaps through rigorous evaluation, then close them through targeted research and engineering

What We're Looking For

You've owned post-training for a frontier generative model through release (SFT, preference optimization via DPO or RLHF, distillation, safety tuning), with measurable quality wins on human preference evaluations or standard benchmarks
Deep experience across the post-training stack, not just one slice: reward modeling, preference learning, RLHF/RLAIF, and personalization
Comfortable working across modalities: text-to-image, image editing, multi-reference, and ideally video
Strong PyTorch fluency; you write research code that others can build on
Experience with distillation (LADD, DMD, consistency models, or similar) or with building high-throughput eval pipelines is a strong plus
Bias toward shipping: measurable model-quality improvements that reach users, not just papers

How We Work Together

We’re a distributed team with real offices that people actually use. Depending on your role, you’ll either join us in Freiburg or SF at least 2 days a week (or one full week every other week), or work remotely with a monthly in-person week to stay connected. We’ll cover reasonable travel costs to make this possible. We think in-person time matters, and we’ve structured things to make it accessible to all. We’ll discuss what this will look like for the role during our interview process.

Everything we do is grounded in four values:

Obsessed. We are a frontier research lab. The science has to be right, the understanding deep, the product beautiful.
Low Ego. The work speaks. The best idea wins, no matter who said it. Credit is shared. Nobody is above any task.
Bold. We take the ambitious bet. We ship; we do not wait for conditions to be perfect.
Kind. People over politics. We treat each other with genuine warmth. Agency without empathy creates chaos.

If this sounds like work you’d enjoy, we’d love to hear from you.

Member of Technical Staff - Post Training
