Detecting what changed in 3D scenes without retraining
Wei Zhang, Songhua Li, Yihang Wu, Qiang Li, Qi Wang
May 16, 2026
3D change detection from multi-temporal images requires precise alignment of reconstructions across different epochs, a task complicated by scale ambiguity, depth noise, and the circular problem that the very scene changes being detected also corrupt registration. VGGT-CD addresses this by decoupling cross-temporal registration from dynamic-change interference in a two-stage pipeline. The coarse stage runs joint inference over keyframes to establish a shared metric space and a Sim(3) prior; the fine-stage purification step isolates the static background and refines the alignment via closed-form centroid optimization. Tested on 11 scenes from the World Across Time dataset, the method reduces Absolute Trajectory Error by 44% outdoors and 59% indoors while completing registration 6× faster, with no task-specific training required.
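To make the fine-stage idea concrete, here is a minimal sketch of one way a closed-form centroid refinement could work. It is not the authors' implementation: the summary does not give the exact formulation, so the sketch assumes that the scale and rotation from the coarse Sim(3) prior are held fixed, that point correspondences between epochs are available, and that a static/changed mask has already been computed. Under those assumptions, the least-squares translation has a closed form: the target centroid minus the scaled, rotated source centroid, computed over static points only. All function and parameter names are hypothetical.

```python
from typing import List, Sequence

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def centroid(points: Sequence[Vec3]) -> List[float]:
    """Mean of a non-empty list of 3D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]


def apply_sim3(scale: float, rot: Mat3, trans: Vec3, p: Vec3) -> List[float]:
    """Apply the similarity transform q = scale * rot @ p + trans."""
    return [
        scale * sum(rot[r][c] * p[c] for c in range(3)) + trans[r]
        for r in range(3)
    ]


def refine_translation(
    scale: float,
    rot: Mat3,
    src: Sequence[Vec3],
    dst: Sequence[Vec3],
    static_mask: Sequence[bool],
) -> List[float]:
    """Closed-form translation for fixed scale and rotation.

    Assumption (not from the paper): src[i] and dst[i] are corresponding
    points from the two epochs, and static_mask[i] is True where the scene
    did not change. Minimizing sum ||scale * rot @ p + t - q||^2 over the
    static pairs gives t = mean(q) - scale * rot @ mean(p).
    """
    src_static = [p for p, keep in zip(src, static_mask) if keep]
    dst_static = [q for q, keep in zip(dst, static_mask) if keep]
    if not src_static:
        raise ValueError("no static points left after masking")
    moved = apply_sim3(scale, rot, [0.0, 0.0, 0.0], centroid(src_static))
    target = centroid(dst_static)
    return [target[i] - moved[i] for i in range(3)]
```

Excluding changed points before taking centroids is what keeps a moved or removed object from dragging the translation estimate, which is the interference the two-stage design is meant to avoid. A full Sim(3) refinement would also re-estimate rotation and scale; this sketch deliberately leaves those at their coarse-stage values.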