cs.CV

Tracking both hands in video when they're hidden or out of frame

Huajian Zeng, Chaohua Yao, Yuantai Zhang, Jiaqi Yang, Rolandos Alexandros Potamias, Xingxing Zuo

May 18, 2026

Recovering precise 3D hand motion from first-person video is essential for robot learning, but hands frequently leave the frame or are occluded by objects. StableHand uses a flow-matching generative model that predicts a per-frame reliability score for each hand observation and weights the observations by that score during reconstruction: it trusts high-confidence frames and infers missing ones from learned hand-motion patterns. On two egocentric benchmarks with heavy occlusion and missing frames, it cuts position error by 20–25% versus prior methods.
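To make the reliability-weighting idea concrete, here is a minimal sketch in Python. It is not the StableHand model: a simple linear blend stands in for the flow-matching reconstruction, and the function name `fuse_with_prior`, its parameters, and the fixed `prior` trajectory are all hypothetical, chosen only for illustration. It assumes each frame comes with a 3D position (NaN when the hand is not visible), a reliability score in [0, 1], and a prior estimate from some motion model.

```python
import numpy as np


def fuse_with_prior(observed, prior, reliability):
    """Blend per-frame observations with a motion prior (illustrative only).

    observed:    (T, 3) positions; rows containing NaN mark missing frames
    prior:       (T, 3) positions from a hypothetical motion prior
    reliability: (T,) scores, clipped to [0, 1]
    """
    observed = np.asarray(observed, dtype=float)
    prior = np.asarray(prior, dtype=float)
    weights = np.clip(np.asarray(reliability, dtype=float), 0.0, 1.0)

    # A frame with no usable observation gets zero weight,
    # so the result falls back entirely on the prior.
    missing = np.isnan(observed).any(axis=1)
    weights = np.where(missing, 0.0, weights)

    # Replace NaN rows before blending, because NaN * 0 is still NaN.
    observed_filled = np.where(missing[:, None], prior, observed)

    w = weights[:, None]
    return w * observed_filled + (1.0 - w) * prior


observed = [[0.0, 0.0, 0.0], [np.nan, np.nan, np.nan], [2.0, 2.0, 2.0]]
prior = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
reliability = [1.0, 0.9, 0.5]

fused = fuse_with_prior(observed, prior, reliability)
print(fused)
```

In this toy input the first frame is fully trusted and kept as observed, the second frame is missing and takes the prior value regardless of its score, and the third frame lands halfway between observation and prior.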
Published as "StableHand: Quality-Aware Flow Matching for World-Space Dual-Hand Motion Estimation from Egocentric Video" (arXiv:2605.18553)