cs.CV

Editing videos in seconds without retraining the model

Guanlong Jiao, Chenyangguang Zhang, Jia Jun Cheng Xian, Zewei Zhang, Renjie Liao

May 20, 2026

Video editing typically requires many costly iterations to produce good results. StreamGVE inverts the approach: instead of iteratively refining the original video, it generates from noise, as modern image generators do, while anchoring to the source footage. It uses dual-branch sampling and attention mechanisms to blend source conditions into the generation process, delivering high-quality edits in a few sampling steps without retraining. The method works across different pre-trained models and handles diverse editing tasks.
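
The summary above does not spell out how the attention blending works. As a rough illustration only, one common way to anchor a generation branch to source features is to let the edit branch's queries attend over keys and values from both branches at once. The sketch below shows that idea in NumPy; the function name `blended_attention`, the tensor shapes, and the concatenation strategy are all assumptions made for illustration and are not taken from StreamGVE.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def blended_attention(q_edit, k_edit, v_edit, k_src, v_src):
    """Hypothetical helper: edit-branch queries attend jointly over
    edit-branch and source-branch keys/values (an assumed mechanism,
    not the paper's confirmed design)."""
    k = np.concatenate([k_edit, k_src], axis=0)  # (2 * tokens, dim)
    v = np.concatenate([v_edit, v_src], axis=0)  # (2 * tokens, dim)
    scores = q_edit @ k.T / np.sqrt(q_edit.shape[-1])
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ v, weights


# Toy features: 4 tokens of dimension 8 per branch (illustrative sizes).
rng = np.random.default_rng(0)
tokens, dim = 4, 8
q_edit, k_edit, v_edit = (rng.standard_normal((tokens, dim)) for _ in range(3))
k_src, v_src = (rng.standard_normal((tokens, dim)) for _ in range(2))

out, weights = blended_attention(q_edit, k_edit, v_edit, k_src, v_src)
# Fraction of each edit token's attention that lands on source tokens.
source_share = weights[:, tokens:].sum(axis=1)
```

Here `source_share` gives a per-token measure of how strongly the edited output draws on the source branch, which is the kind of quantity a training-free method could inspect or rescale without touching model weights.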
Published as "StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation" (arXiv:2605.21466).