Can robots learn to fold clothes from mixed demonstrations?
Taiyi Su, Jian Zhu, Tianjian Wang, Youzhang He, Zitai Huang, Jianjun Zhang, Chong Ma, Hanyang Wang, Tianjiao Zhang, Munan Yin, Weihao Ding, Yi Xu
May 29, 2026
Household robots struggle with deformable objects like clothing because existing systems train a separate policy for each item type. DeMaVLA combines a vision-language backbone with an efficient action expert (using pruned transformers and flow matching) and trains on 5,000 hours of real dual-arm demonstrations plus corrective trajectories from failed attempts. The result: a single policy that folds different clothing items across varying materials and scenes, validated in both simulation and real-robot experiments.
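For readers unfamiliar with flow matching, the following is a minimal, generic sketch of the idea in plain Python. It is not DeMaVLA's action expert. It assumes the common straight-line (rectified) formulation, in which a network is trained to predict the velocity that carries a noise sample to a demonstrated action, and actions are then sampled by integrating that velocity from t=0 to t=1. All names here (`interpolate`, `target_velocity`, `sample_action`, `toy_velocity`, `DEMO_ACTION`) are hypothetical, and the toy velocity field stands in for a trained network.

```python
import random

def interpolate(noise, action, t):
    """Point on the straight path from noise (t=0) to the action (t=1)."""
    return [(1.0 - t) * n + t * a for n, a in zip(noise, action)]

def target_velocity(noise, action):
    """Regression target for a velocity network along that straight path."""
    return [a - n for n, a in zip(noise, action)]

def sample_action(velocity_fn, dim, steps=10, seed=0):
    """Euler-integrate a velocity field from Gaussian noise (t=0) to t=1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1 inside the loop
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Made-up 3-dimensional action, used only for illustration.
DEMO_ACTION = [0.2, -0.5, 1.0]

def toy_velocity(x, t):
    """Stand-in for a trained network: the exact field that moves any
    starting point onto DEMO_ACTION by t=1."""
    return [(a - xi) / (1.0 - t) for a, xi in zip(DEMO_ACTION, x)]

if __name__ == "__main__":
    print(sample_action(toy_velocity, dim=3))
```

In a real system the velocity function would be a learned model conditioned on images and language, and the sampled vector would typically be a chunk of future robot actions rather than a single fixed target.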