How to make diffusion models plan farther ahead without exploding compute costs
Byoungwoo Park, Utkarsh A. Mishra, Jaemoo Choi, Juho Lee, Yongxin Chen
May 30, 2026
Diffusion models excel at generating short sequences, but extending them to long-horizon tasks breaks coherence: neighboring plans stay locally consistent with each other yet combine into globally implausible trajectories. CoFi splits generation into two stages. It first builds a coarse structural scaffold that captures the task-level arrangement, then refines the details while preserving that scaffold. Across robotic manipulation, panoramic images, and long videos, it improves both global structure and sample quality while cutting denoiser calls by 2–8×.
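To make the two-stage idea concrete, here is a minimal toy sketch in Python. It is not the CoFi algorithm: `toy_denoiser`, `coarse_stage`, `fine_stage`, and all parameters are hypothetical names, and the "denoiser" is a stand-in that simply pulls a noisy 1D trajectory toward a target. It only illustrates the general pattern of generating a short scaffold first, then refining a longer sequence while re-imposing that scaffold at every step.

```python
import numpy as np

def toy_denoiser(x, target, step_size=0.5):
    """Hypothetical stand-in for a learned denoiser: moves x toward target."""
    return x + step_size * (target - x)

def coarse_stage(start, goal, num_keyframes, num_steps, rng):
    """Stage 1 (toy): denoise a short sequence of keyframes, the scaffold."""
    target = np.linspace(start, goal, num_keyframes)  # assumed coarse structure
    x = rng.normal(size=target.shape)                 # start from pure noise
    for _ in range(num_steps):
        x = toy_denoiser(x, target)
    x[0], x[-1] = start, goal                         # pin the endpoints
    return x

def fine_stage(keyframes, points_per_segment, num_steps, rng):
    """Stage 2 (toy): denoise the full sequence, keeping the scaffold fixed."""
    coarse_idx = np.arange(len(keyframes)) * points_per_segment
    fine_idx = np.arange(coarse_idx[-1] + 1)
    target = np.interp(fine_idx, coarse_idx, keyframes)
    x = rng.normal(size=target.shape)
    for _ in range(num_steps):
        x = toy_denoiser(x, target)
        x[coarse_idx] = keyframes                     # re-impose the scaffold
    return x, coarse_idx

rng = np.random.default_rng(0)
keyframes = coarse_stage(start=0.0, goal=10.0, num_keyframes=5, num_steps=20, rng=rng)
plan, coarse_idx = fine_stage(keyframes, points_per_segment=8, num_steps=20, rng=rng)
```

In this sketch the fine trajectory agrees exactly with the keyframes at the scaffold positions, so the global arrangement chosen in the first stage survives refinement. How the real method conditions on the scaffold, and where its savings in denoiser calls come from, is described in the paper rather than here.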