Tracking clothing wrinkles and body movements from video alone

Existing human pose models capture skeletal motion but miss clothing deformation, while generic scene-flow methods fail on articulated bodies. H-Flow predicts dense, pixel-level motion from monocular video, covering both skeletal pose and surface deformation. It is trained with physics-inspired losses (geometric, structural, and biomechanical constraints) rather than expensive ground-truth labels. The team also released DynAct4D, a synthetic benchmark with dense flow annotations. H-Flow outperforms both scene-flow and parametric baselines and generalizes to unconstrained video without retraining.
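
To make the idea of a biomechanical constraint concrete, here is a minimal sketch of one plausible label-free penalty: bones should keep their length when the predicted motion is applied. This is an illustration only, not H-Flow's actual loss. The function name `bone_length_loss`, the per-joint 3D flow representation, and the toy three-joint skeleton are all assumptions made for this example.

```python
import math

def bone_length_loss(joints, flow, bones):
    """Mean absolute change in bone length after applying per-joint flow.

    Hypothetical illustration, not the loss used by H-Flow.
    joints: list of (x, y, z) joint positions at frame t
    flow:   list of (dx, dy, dz) predicted displacements to frame t+1
    bones:  list of (parent_index, child_index) pairs
    """
    # Move every joint by its predicted displacement.
    moved = [
        tuple(p + d for p, d in zip(joint, delta))
        for joint, delta in zip(joints, flow)
    ]
    # Compare each bone's length before and after the motion.
    changes = [
        abs(math.dist(moved[a], moved[b]) - math.dist(joints[a], joints[b]))
        for a, b in bones
    ]
    return sum(changes) / len(changes)

# Toy three-joint chain lying along the x-axis (assumed data).
joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bones = [(0, 1), (1, 2)]

# A rigid translation moves every joint equally, so no bone changes length.
rigid_flow = [(0.1, 0.2, 0.0)] * 3

# Moving only the last joint stretches the second bone from 1.0 to 1.5.
stretch_flow = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]

rigid_penalty = bone_length_loss(joints, rigid_flow, bones)
stretch_penalty = bone_length_loss(joints, stretch_flow, bones)
```

The penalty needs no ground-truth flow: it is near zero for the rigid translation and positive for the stretching motion, which is the kind of signal a physics-inspired loss can supply in place of labels. A real system would compute such terms on dense predictions inside a differentiable framework rather than on three hand-written joints.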