cs.RO

Why robots struggle when multiple good actions look identical

Lorenzo Mazza, Massimiliano Datres, Ariel Rodriguez, Sebastian Bodenstedt, Gitta Kutyniok, Stefanie Speidel

May 21, 2026

Teaching robots by imitation breaks down when the same visual state admits many valid actions. This work characterizes how two common policy designs fail under such multimodality: latent-variable models either collapse to a single mode or lose the ability to distinguish between modes, while generative models struggle because a smooth mapping cannot cover many well-separated solutions. Experiments on robotic tasks confirm that these failure modes are fundamental rather than accidental.
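The point about smooth mappings can be made concrete with a toy sketch. It is not the paper's model or experiment; the generator `tanh(sharpness * z)`, the two target actions -1 and +1, and the name `gap_fraction` are all assumptions chosen for illustration. A continuous map from a connected Gaussian latent to two separated actions has to pass through the region between them, so some samples always land on actions that are not valid. Making the map steeper (a larger Lipschitz constant, here `sharpness`) shrinks that fraction but does not remove it.

```python
import math
import random


def gap_fraction(sharpness, n_samples=100_000, gap=0.5, seed=0):
    """Estimate how often a smooth generator emits an in-between action.

    Hypothetical toy setup (not taken from the paper):
      - the two valid actions are -1 and +1,
      - the generator is action = tanh(sharpness * z) with z ~ N(0, 1),
      - an action counts as "in the gap" when |action| < gap.
    The slope of tanh(sharpness * z) is at most `sharpness`, so this
    parameter plays the role of the Lipschitz constant of the map.
    """
    rng = random.Random(seed)
    in_gap = 0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)            # sample from the connected latent
        action = math.tanh(sharpness * z)  # smooth, continuous decoder
        if abs(action) < gap:
            in_gap += 1
    return in_gap / n_samples


if __name__ == "__main__":
    for k in (1.0, 5.0, 25.0):
        print(f"sharpness={k:>5}: gap fraction = {gap_fraction(k):.4f}")
```

In this sketch the gap fraction falls as `sharpness` grows but stays above zero for any finite value, which is the intuition behind the claim that a smooth map can only approximate well-separated modes by becoming very steep.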
Published as "Understanding Multimodal Failure in Action-Chunking Behavioral Cloning", arXiv:2605.22493