Editing videos in seconds without retraining the model
Guanlong Jiao, Chenyangguang Zhang, Jia Jun Cheng Xian, Zewei Zhang, Renjie Liao
May 20, 2026
Video editing typically requires many costly iterations to produce good results. StreamGVE flips the approach: instead of iteratively refining the original video, it generates from noise, as modern image generators do, while anchoring to the source footage. It uses dual-branch sampling and attention mechanisms to blend source conditions into the generation, delivering high-quality edits in few steps without retraining. The method works across different pre-trained models and handles diverse editing tasks.
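The summary does not spell out how the attention blending works, so the following is only a toy sketch of one common way to let a generation branch draw on a source branch: the generation queries attend over keys and values from both branches concatenated together. The function names (`softmax`, `blended_attention`), the concatenation scheme, and the tiny hand-written vectors are all assumptions made for illustration, not StreamGVE's confirmed design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blended_attention(q_gen, k_gen, v_gen, k_src, v_src):
    """Hypothetical blending: generation queries attend over the keys and
    values of BOTH branches (generation first, then source).

    Every argument is a list of token vectors (lists of floats).
    Returns (outputs, weights), one row per generation query.
    """
    keys = k_gen + k_src      # list concatenation: generation tokens, then source
    values = v_gen + v_src
    scale = math.sqrt(len(q_gen[0]))  # standard 1/sqrt(d) attention scaling
    dim = len(values[0])

    outputs, weights = [], []
    for q in q_gen:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors, computed per output dimension.
        outputs.append(
            [sum(wi * v[d] for wi, v in zip(w, values)) for d in range(dim)]
        )
    return outputs, weights


# One generation query that aligns strongly with the single source key.
out, w = blended_attention(
    q_gen=[[10.0, 0.0]],
    k_gen=[[0.0, 1.0]], v_gen=[[0.0, 0.0]],
    k_src=[[1.0, 0.0]], v_src=[[1.0, 1.0]],
)
```

In this toy case the query matches the source key far better than the generation key, so nearly all of the attention weight lands on the source token and the output sits close to the source value. That is the sense in which a source condition can anchor what the generation branch produces.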