Where should style injection happen in image generation?
Amey Sunil Kulkarni
May 26, 2026
Style transfer with diffusion models appears to force a choice between strong style and faithful content. This work shows that the choice is artificial. Varying how much style is injected across network depth and denoising timesteps (stronger at early timesteps and shallow layers, weaker at late timesteps and deep layers) improves both at once. Tests of 35 configurations across 28,000 images show that cosine and square-root schedules consistently outperform uniform injection, and the gains stack when the schedules are combined with ControlNet. The approach is training-free and adds no new parameters.
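To make the idea of a depth- and timestep-dependent schedule concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the summary does not give the exact formulas, so the cosine and square-root decay shapes, the choice to multiply the depth and timestep weights, and the names `schedule_weight` and `injection_strength` are all hypothetical.

```python
import math


def schedule_weight(progress: float, kind: str = "cosine") -> float:
    """Return an injection weight in [0, 1] for a normalized position.

    progress = 0.0 means earliest timestep or shallowest layer,
    progress = 1.0 means latest timestep or deepest layer.
    The decay shapes below are assumed, not taken from the paper.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    if kind == "uniform":
        return 1.0  # baseline: same strength everywhere
    if kind == "cosine":
        return 0.5 * (1.0 + math.cos(math.pi * progress))  # smooth 1 -> 0
    if kind == "sqrt":
        return 1.0 - math.sqrt(progress)  # drops fastest near the start
    raise ValueError(f"unknown schedule kind: {kind!r}")


def injection_strength(step: int, num_steps: int, layer: int, num_layers: int,
                       kind: str = "cosine", base: float = 1.0) -> float:
    """Combine timestep and depth weights into one style-injection scale.

    Multiplying the two weights is an assumption made for this sketch.
    """
    t = step / max(num_steps - 1, 1)    # normalized timestep position
    d = layer / max(num_layers - 1, 1)  # normalized layer depth
    return base * schedule_weight(t, kind) * schedule_weight(d, kind)


if __name__ == "__main__":
    for kind in ("uniform", "cosine", "sqrt"):
        row = [round(injection_strength(s, 5, 0, 4, kind), 3) for s in range(5)]
        print(kind, row)
```

In a real pipeline, a scale like this would multiply whatever style signal is injected at a given layer and step; how that injection is performed is outside what this summary specifies.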