← Back to Computer Vision
cs.CV

Can multimodal prompts make change detection work across any landscape?

Chenhao Sun

May 28, 2026

Satellite change detection—spotting urban growth, disaster damage, or environmental shifts—fails when trained on one region and tested on another. OmniCD combines image and text prompts (descriptions, semantic maps, metadata) in a single system that handles everything from simple before-after comparisons to understanding *what* changed semantically. The team also released RSITCD, a 300K+ multimodal dataset, and demonstrated stronger generalization and robustness than prior methods across multiple benchmarks.
Published as OmniCD: A Foundational Framework for Remote Sensing Image Change Detection Guided by Multimodal Semantics arXiv:2605.30168
Read the original paper →