cs.CV

Reconstructing 3D scenes from unmatched thermal and RGB photos

Jean Cordonnier, Chenghao Xu, Olga Fink, Malcolm Mielle

June 3, 2026

Combining thermal and RGB imagery for 3D scene reconstruction typically requires precisely aligned camera pairs, which is impractical in real deployments. This work trains a transformer to estimate camera poses independently for each modality, then aligns the two sets of poses without calibration using feature matching and the Procrustes algorithm. The resulting 3D Gaussian splatting model learns from unpaired images, achieving competitive thermal synthesis while preserving RGB quality. The authors also find that existing methods produce modality-specific reconstructions that are not coherent between the thermal and RGB modalities, and they contribute a benchmark to measure this cross-modal consistency.
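To make the alignment step more concrete, here is a minimal sketch of a generic similarity Procrustes fit (the Umeyama closed-form solution) in NumPy. It is not the paper's implementation: the summary does not say which Procrustes variant is used or how matches are turned into 3D correspondences. The sketch assumes feature matching has already produced corresponding 3D points in the two reconstructions, and the function name `procrustes_align` is hypothetical.

```python
import numpy as np


def procrustes_align(src, dst):
    """Fit scale s, rotation R, translation t so that s * R @ src_i + t ~ dst_i.

    src, dst: (N, 3) arrays of corresponding 3D points (assumed to be given,
    e.g. derived from cross-modal feature matches).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)

    # Center both point sets on their centroids.
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the centered target and source points.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt

    # Optimal isotropic scale, then the translation that maps the centroids.
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


if __name__ == "__main__":
    # Synthetic check: recover a known similarity transform.
    rng = np.random.default_rng(0)
    pts_thermal = rng.normal(size=(20, 3))  # stand-in for one reconstruction
    angle = 0.7
    R_true = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle), np.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ])
    pts_rgb = 2.5 * pts_thermal @ R_true.T + np.array([1.0, -2.0, 0.5])
    s, R, t = procrustes_align(pts_thermal, pts_rgb)
    print(s, np.allclose(R, R_true), t)
```

A scale term is included because reconstructions estimated independently per modality would generally differ by an unknown scale as well as a rigid motion. Real matches between thermal and RGB images are noisy, so a fit like this would usually be wrapped in an outlier-rejection loop such as RANSAC.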
Published as "Unpaired RGB-Thermal Gaussian-Splatting Using Visual Geometric Transformers" (arXiv:2606.05491)