Why diffusion models get stuck sampling from their own data

Hyunmo Kang, Noam Itzhak Levi, Corinna Elena Wegner, Daniel J. Korchinski, Matthieu Wyart

Diffusion-model samplers often get stuck on fragments of the learned data manifold. This work introduces U-turn chains, which iterate forward-backward diffusion steps with Metropolis-Hastings corrections, and uncovers a sharp transition: small U-turn magnitudes fail to explore the distribution globally, whereas larger magnitudes restore mixing. The effect appears across synthetic languages, natural images, and text. High-level features, such as semantic concepts in LLM representations, are the last to relax, and they do so only when the noise is large enough. The findings suggest that the geometry a diffusion model learns places fundamental constraints on how well it can sample.
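
As a rough illustration of the forward-backward (U-turn) step, the sketch below runs such a chain on a toy one-dimensional mixture of two Gaussians. Everything in it is an assumption made for illustration and is not taken from the paper: the mode positions, the widths, the variance-exploding noising `x_t = x_0 + sigma * eps`, and the names `sample_posterior` and `u_turn_chain`. The backward step draws from the exact posterior p(x_0 | x_t), which is available in closed form for this toy target. Under that idealization a Metropolis-Hastings correction would accept every move, so it is omitted; with a learned, imperfect denoiser a correction step would be needed.

```python
import math
import random

# Toy target (assumed): equal mixture of N(-4, 0.5^2) and N(+4, 0.5^2).
MODES = (-4.0, 4.0)
S = 0.5  # standard deviation of each mode


def sample_posterior(x_t, sigma, rng):
    """Draw x_0 ~ p(x_0 | x_t) exactly, given x_t = x_0 + sigma * eps."""
    var_t = S**2 + sigma**2  # variance of each mode after noising
    # Log-responsibility of each mode for the noisy point (equal priors).
    logw = [-(x_t - mu) ** 2 / (2.0 * var_t) for mu in MODES]
    top = max(logw)
    w = [math.exp(lw - top) for lw in logw]
    mu = MODES[0] if rng.random() < w[0] / (w[0] + w[1]) else MODES[1]
    # Gaussian posterior of x_0 within the chosen mode.
    post_mean = mu + (S**2 / var_t) * (x_t - mu)
    post_std = math.sqrt(S**2 * sigma**2 / var_t)
    return rng.gauss(post_mean, post_std)


def u_turn_chain(sigma, n_steps, seed=0):
    """Run a U-turn chain started in the left mode.

    Returns the fraction of steps spent in the right mode (x > 0).
    """
    rng = random.Random(seed)
    x = rng.gauss(MODES[0], S)
    in_right_mode = 0
    for _ in range(n_steps):
        x_t = x + sigma * rng.gauss(0.0, 1.0)   # forward: add noise
        x = sample_posterior(x_t, sigma, rng)   # backward: denoise
        in_right_mode += x > 0
    return in_right_mode / n_steps
```

Comparing `u_turn_chain(0.3, 5000)` with `u_turn_chain(8.0, 5000)` shows the qualitative point: with a small noise level the chain should remain in the mode where it started, while with a noise level comparable to the mode separation the fraction of time in the other mode should approach one half. This toy captures only the barrier-crossing aspect of the problem, not the learned-score or hierarchical-feature effects studied in the paper.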