Making image generators prefer better outputs the right way

Aligning image generators with human preferences requires different math than aligning language models. The authors show that standard DPO uses the wrong utility function for image generation: it is too aggressive. Linear-DPO swaps in a gentler linear utility and improves results on Stable Diffusion 1.5, SDXL, and SD3-Medium, so the alignment method works across both major generative model architectures.
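
To make the idea of swapping the utility function concrete, here is a minimal sketch that contrasts the log-sigmoid utility of standard DPO with a plain linear one. The summary above does not give the exact Linear-DPO objective, so `linear_utility`, `preference_margin`, and the `beta` scaling are illustrative assumptions rather than the authors' formulation. The log-probabilities are placeholder scalars; for image models they would typically be stood in for by per-sample denoising errors.

```python
import math


def preference_margin(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=1.0):
    """Scaled gap between how much the model (relative to a frozen reference)
    favors the preferred sample over the rejected one.
    Hypothetical helper; `beta` is an assumed temperature-like scale."""
    return beta * ((logp_win - ref_logp_win) - (logp_lose - ref_logp_lose))


def log_sigmoid_utility(margin):
    """Utility used by standard DPO: log(sigmoid(margin)), computed stably."""
    if margin >= 0:
        return -math.log1p(math.exp(-margin))
    return margin - math.log1p(math.exp(margin))


def linear_utility(margin):
    """Assumed linear utility: the reward grows at a constant rate with the margin."""
    return margin


def preference_loss(margin, utility):
    """Loss to minimize is the negative utility of the preference margin."""
    return -utility(margin)


if __name__ == "__main__":
    # Compare the two losses over a few margins to see how their shapes differ.
    for m in (-2.0, -0.5, 0.0, 0.5, 2.0):
        dpo = preference_loss(m, log_sigmoid_utility)
        lin = preference_loss(m, linear_utility)
        print(f"margin={m:+.1f}  log-sigmoid loss={dpo:.4f}  linear loss={lin:.4f}")
```

The only difference between the two objectives in this sketch is the function passed to `preference_loss`, which is what makes a utility swap a small change in code even if its effect on training is large.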