Reconstructing 3D scenes from unmatched thermal and RGB photos
Jean Cordonnier, Chenghao Xu, Olga Fink, Malcolm Mielle
June 3, 2026
Combining thermal and RGB imagery for 3D scene reconstruction typically requires precisely aligned camera pairs, which is impractical in real deployments. This work trains a transformer to estimate camera poses independently for each modality, then aligns the two modalities without calibration using feature matching and the Procrustes algorithm. The resulting 3D Gaussian splatting model learns from unpaired images, achieving competitive thermal synthesis while preserving RGB quality. The authors also find that existing methods produce modality-specific reconstructions that are not coherent across the thermal and RGB modalities, and they contribute a benchmark to measure this cross-modal consistency.
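To make the alignment step concrete, here is a minimal sketch of a Procrustes-style similarity alignment (scale, rotation, translation) between two sets of matched 3D points, written with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the summary does not say which points are matched or whether scale is estimated, so the function name `procrustes_similarity` and the choice of the closed-form SVD solution with a reflection correction are assumptions made here.

```python
import numpy as np


def procrustes_similarity(src, dst):
    """Least-squares similarity transform mapping src onto dst.

    Hypothetical helper for illustration. src and dst are (N, 3) arrays of
    corresponding points, e.g. points recovered separately from the thermal
    and RGB reconstructions and paired by feature matching (an assumption).
    Returns (s, R, t) such that dst is approximately s * R @ src_i + t.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)

    # Center both point sets on their centroids.
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the centered sets, then its SVD.
    cov = dst_c.T @ src_c / n
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt

    # Optimal isotropic scale and translation.
    var_src = (src_c ** 2).sum() / n
    s = (S * np.diag(D)).sum() / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t
```

Given matched points `thermal_pts` and `rgb_pts` (hypothetical names), calling `s, R, t = procrustes_similarity(thermal_pts, rgb_pts)` and then applying `s * thermal_pts @ R.T + t` would bring the thermal reconstruction into the RGB frame. In practice, feature matches contain outliers, so a robust wrapper such as RANSAC around this solver would likely be needed; that is a design suggestion, not something stated in the summary.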