cs.CV

A dataset of 3.4K videos reveals emotions hidden in tiny gestures

Chengyan Wang, Haoyu Chen, Hui Wei, Yueyi Yang, Yunquan Chen, Guoying Zhao

May 16, 2026

Micro-gestures, subtle involuntary movements driven by emotion, are a largely untapped signal for affective computing, but the field lacks large-scale datasets for training models on them. This work introduces iMiGUE-3K, the first large-scale in-the-wild video dataset for micro-gesture analysis, collected from press interviews of 332 professional tennis players over seven years. The dataset contains 3.4K long video clips annotated with 32 micro-gesture classes. The authors propose MG-FMs, a discriminative foundation model trained with self-supervised learning, and establish five evaluation benchmarks: unsupervised, semi-supervised, and supervised micro-gesture recognition, plus retrieval and emotion recognition. Experiments show that adding micro-gesture analysis improves emotion understanding beyond what facial expressions and speech provide alone.
Published as "iMiGUE-3K: A Large-Scale Benchmark for Micro-Gesture Analysis with Self-Supervised Learning" (arXiv:2605.17179)