Training radar to understand motion without expensive sensor labels
Jingyun Fu, Zhiyu Xiang, Na Zhao
May 18, 2026
4D radar scene flow, the task of estimating how objects move in a scene, is hard to label, so existing methods either estimate motion poorly or depend on LiDAR sensors for supervision. This work uses only camera tracking and odometry as weak supervision: the authors extract instance masks from off-the-shelf 2D trackers, project them into 3D radar space for semantic guidance, and use the vehicle's own motion to handle static regions. On the real-world View-of-Delft (VoD) dataset, the approach outperforms both existing cross-modal supervised methods and fully supervised baselines while being cheaper to train. Code is released.
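To make the two supervision signals concrete, here is a minimal NumPy sketch. It is not the authors' released code. It assumes a pinhole camera with known intrinsics `K`, a known radar-to-camera transform, and an ego-motion transform from odometry; the function names `label_points_with_masks` and `ego_motion_flow` are hypothetical. The first function gives each radar point the instance ID of the pixel it projects onto, and the second computes the flow that a static point would show purely because the vehicle moved.

```python
import numpy as np

def label_points_with_masks(points_radar, instance_mask, K, T_cam_from_radar):
    """Give each radar point the instance ID of the pixel it projects onto.

    points_radar:     (N, 3) points in the radar frame.
    instance_mask:    (H, W) integer image from a 2D tracker, 0 = background.
    K:                (3, 3) pinhole intrinsics (assumed known).
    T_cam_from_radar: (4, 4) rigid transform, radar frame -> camera frame.
    Returns an (N,) int array; -1 marks points behind the camera or off-image.
    """
    n = points_radar.shape[0]
    homo = np.hstack([points_radar, np.ones((n, 1))])
    pts_cam = (T_cam_from_radar @ homo.T).T[:, :3]

    labels = np.full(n, -1, dtype=np.int64)
    in_front = pts_cam[:, 2] > 1e-6  # only points in front of the camera project

    uvw = (K @ pts_cam[in_front].T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(np.int64)  # pixel column
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(np.int64)  # pixel row

    h, w = instance_mask.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = instance_mask[v[inside], u[inside]]
    return labels

def ego_motion_flow(points_t, T_t1_from_t):
    """Flow of static points caused only by ego-motion.

    points_t:    (N, 3) points in the sensor frame at time t.
    T_t1_from_t: (4, 4) transform taking frame-t coordinates to frame t+1
                 (assumed to come from odometry).
    Returns (N, 3) displacement vectors p' - p with p' = T p.
    """
    n = points_t.shape[0]
    homo = np.hstack([points_t, np.ones((n, 1))])
    moved = (T_t1_from_t @ homo.T).T[:, :3]
    return moved - points_t
```

In a pipeline along these lines, points labeled as background could take `ego_motion_flow` as a pseudo-target, while points sharing an instance ID could be encouraged to move consistently. How the paper actually combines the two signals is not specified in this summary.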