cs.RO

Do robots learn better by breaking tasks into sub-skills?

Anya Singh, Cabrel Happi, Jai Relan, Varun Nair, Vidyut Baradwaj

May 29, 2026

Vision-language-action (VLA) policies struggle to learn new tasks without expensive fine-tuning. The researchers trained two VLA architectures on assembly data, using either raw trajectories (flat training) or primitive-segmented episodes, in which each episode is broken into sub-skills. They then tested few-shot transfer on held-out tasks with 0–10 demonstrations. Primitive-trained models reached 78% of fine-tuned performance with 3 demonstrations; flat-trained models needed 10 to reach the same level. Ablating the primitive-decodable subspace of the hidden states reduced transfer by 32 points, evidence that the primitive representation is causally involved in transfer rather than incidental.
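To make the ablation idea concrete, here is a minimal sketch of one common way to remove a linearly decodable subspace from hidden states: take the weight rows of a linear probe, build an orthonormal basis for their span, and project that span out. This is an assumption for illustration, not the paper's procedure; the summary does not say how the subspace was identified or ablated. The names `ablate_subspace`, `hidden`, and `probe_weights` are hypothetical, and the data is random.

```python
import numpy as np

def ablate_subspace(hidden, probe_weights):
    """Remove from `hidden` the components lying in the span of the probe rows.

    hidden: (n, d) array of hidden states (hypothetical stand-in data).
    probe_weights: (k, d) array, rows of an assumed linear probe.
    """
    # Orthonormal basis for the span of the probe directions (columns of q).
    q, _ = np.linalg.qr(probe_weights.T)  # q has shape (d, k)
    # Subtract each state's projection onto that span.
    return hidden - (hidden @ q) @ q.T

rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 16))         # 5 states, 16 dimensions
probe_weights = rng.normal(size=(3, 16))  # 3 probe directions

ablated = ablate_subspace(hidden, probe_weights)

# After ablation, the same linear probe reads out (numerically) zero.
probe_is_blind = np.allclose(ablated @ probe_weights.T, 0.0)
print(probe_is_blind)
```

In a setup like this, the effect of the ablation would be measured by running the policy with `ablated` states in place of the originals and comparing transfer performance; a control that removes a random subspace of the same dimension helps separate the effect of the specific subspace from the effect of removing any directions at all.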
Published as "Primitive Subspaces Mediate Few-Shot Transfer in VLAs", arXiv:2605.30695.