cs.CV

Can robots understand and remember 3D scenes over time?

Luzhou Ge, Xiangyu Zhu, Jinyan Liu, Xuesong Li

May 28, 2026

Robots need to understand scenes that change over time, where objects move, appear, and disappear, while mapping semantic meaning to 3D space. DGSG-Mind combines 3D Gaussian splatting with scene graphs to track object instances robustly and build a spatial-semantic memory the robot can reason about. It outperforms competing methods on zero-shot 3D visual grounding and runs on real robots, handling dynamic scene updates without retraining.
Published as "DGSG-Mind: Dynamic 3D Gaussian Scene Graphs for Long-Term Scene Understanding and Grounding", arXiv:2605.29879.