Tracking both hands in video when they're hidden or out of frame
Huajian Zeng, Chaohua Yao, Yuantai Zhang, Jiaqi Yang, Rolandos Alexandros Potamias, Xingxing Zuo
May 18, 2026
Recovering precise 3D hand motion from first-person video is essential for robot learning, but hands frequently leave the frame or are occluded by objects. StableHand uses a flow-matching generative model that predicts a per-frame reliability score for each hand observation and weights the observations accordingly during reconstruction: it trusts high-confidence frames and infers missing ones from learned hand-motion patterns. On two egocentric benchmarks with heavy occlusion and missing frames, it reduces position error by 20–25% relative to prior methods.
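To make the idea of reliability-weighted reconstruction concrete, here is a minimal sketch. It is not StableHand's method: the learned flow-matching prior is replaced by a plain second-difference smoothness penalty, and the reliability scores are supplied by hand rather than predicted. The names `fuse_trajectory`, `reliability`, and `smooth` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fuse_trajectory(obs, reliability, smooth=10.0):
    """Toy reliability-weighted trajectory fusion (illustrative assumption).

    Minimizes  sum_t w_t * ||x_t - obs_t||^2  +  smooth * ||D x||^2,
    where D is the second-difference operator. The smoothness term is a
    simple stand-in for a learned motion prior, not the paper's model.

    obs:         (T, 3) observed 3D positions; NaN where nothing was seen
    reliability: (T,) weights in [0, 1]; use 0 for missing or occluded frames
    """
    obs = np.asarray(obs, dtype=float)
    w = np.asarray(reliability, dtype=float)
    T = obs.shape[0]
    if T < 3:
        raise ValueError("need at least 3 frames")

    # Second-difference operator of shape (T-2, T): x[t] - 2*x[t+1] + x[t+2]
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]

    # Normal equations of the weighted least-squares problem.
    # The system is well posed when at least two frames have nonzero weight.
    A = np.diag(w) + smooth * (D.T @ D)
    b = w[:, None] * np.nan_to_num(obs)  # NaNs become 0, then get weight 0
    return np.linalg.solve(A, b)

# Usage: a hand moving along a straight line, hidden for three frames.
truth = np.linspace([0.0, 0.0, 0.5], [0.7, 0.35, 0.5], 8)
obs = truth.copy()
obs[3:6] = np.nan                      # frames where the hand is not visible
reliability = np.ones(8)
reliability[3:6] = 0.0                 # zero trust in the missing frames
recovered = fuse_trajectory(obs, reliability)
```

In this toy setup, frames with zero reliability contribute nothing to the data term, so their positions are filled in entirely by the prior. A learned motion prior would play the same role while capturing far richer hand dynamics than a smoothness penalty can.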