cs.CV

Training radar to understand motion without expensive sensor labels

Jingyun Fu, Zhiyu Xiang, Na Zhao

May 18, 2026

4D radar scene flow, the task of estimating the 3D motion of points in a scene from radar, is hard to label, so current methods either produce inaccurate estimates or require LiDAR sensors for supervision. This work uses only camera tracking and odometry as weak supervision: it extracts instance masks from off-the-shelf 2D trackers, projects them into 3D radar space for semantic guidance, and uses vehicle odometry to handle static regions. On the real-world View-of-Delft (VoD) dataset, the approach outperforms both existing cross-modal supervised methods and fully supervised baselines while being cheaper to train. The code is released.
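The two supervision signals described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the pinhole intrinsics `K`, the convention that instance id 0 means background, and the convention that `T_prev_to_next` maps a static point's coordinates from one frame to the next are all assumptions made for illustration. The sketch also associates masks with radar points by projecting the points into the image, which is one common way to transfer 2D labels to 3D and may differ from the paper's procedure.

```python
import numpy as np

def label_points_from_mask(points_cam, K, instance_mask):
    """Give each 3D point (camera frame) the instance id of the pixel it hits.

    Points behind the camera or outside the image keep id 0 (assumed background).
    """
    h, w = instance_mask.shape
    labels = np.zeros(len(points_cam), dtype=np.int64)
    in_front = points_cam[:, 2] > 1e-6
    uvw = points_cam[in_front] @ K.T              # pinhole projection
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(np.int64)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(np.int64)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = instance_mask[v[inside], u[inside]]
    return labels

def ego_motion_flow(points, T_prev_to_next):
    """Apparent flow of static points caused only by the sensor's own motion."""
    R, t = T_prev_to_next[:3, :3], T_prev_to_next[:3, 3]
    return points @ R.T + t - points

# Toy data (hypothetical): a 100x100 mask with one tracked instance, id 7.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
mask = np.zeros((100, 100), dtype=np.int64)
mask[40:60, 40:60] = 7

points = np.array([[0.0, 0.0, 10.0],    # projects into the instance
                   [3.0, 0.0, 10.0],    # projects onto background
                   [0.0, 0.0, -5.0]])   # behind the camera

labels = label_points_from_mask(points, K, mask)

# The vehicle drives 1 m forward along +z, so static points shift by -1 m in z.
T = np.eye(4)
T[2, 3] = -1.0
flow = ego_motion_flow(points, T)

static = labels == 0                     # points not claimed by any instance
print(labels.tolist(), flow[static].tolist())
```

In a setup like this, the odometry-derived flow would serve as a target only for points labeled static, while points that fall inside a tracked instance would be grouped by instance id for the semantic guidance.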
Published as "Weakly Supervised Cross-Modal Learning for 4D Radar Scene Flow Estimation", arXiv:2605.18507