cs.AI

Catching bad robot moves before they happen

Zhen Sun, Yongjian Guo, Haoran Sun, Luqiao Wang, Wei Lu, Jiachi Ji, Shengzhe Ji, Junwu Xiong, Zhijun Meng

May 21, 2026

Robot systems built on vision-language-action (VLA) models often generate poor actions that cause task failures or waste computation on world-model rollouts. Pre-VLA screens each proposed action before execution by predicting a safety confidence and an advantage score with a lightweight multimodal classifier, trained with techniques for handling imbalanced data. On LIBERO benchmarks, it improved success rates by 7.8 percentage points, reduced execution steps, and ran in 184 ms per action, catching errors early rather than failing during physical execution.
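The gating idea described above, checking an action's predicted safety confidence and advantage before it runs, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Candidate` type, the `select_action` function, the threshold values, and the rule of picking the highest-advantage survivor are all assumptions made for illustration, and the scores are supplied by hand rather than produced by a trained classifier.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A proposed action plus two verifier scores (hypothetical structure)."""
    name: str
    safety_confidence: float  # assumed to lie in [0, 1]; higher means safer
    advantage: float          # assumed scalar; higher means more useful


def select_action(
    candidates: Sequence[Candidate],
    min_safety: float = 0.5,      # illustrative threshold, not from the paper
    min_advantage: float = 0.0,   # illustrative threshold, not from the paper
) -> Optional[Candidate]:
    """Return the highest-advantage candidate that passes both checks.

    Returns None when every candidate is rejected, which a caller could
    treat as a signal to resample actions instead of executing one.
    """
    accepted = [
        c for c in candidates
        if c.safety_confidence >= min_safety and c.advantage >= min_advantage
    ]
    if not accepted:
        return None
    return max(accepted, key=lambda c: c.advantage)


if __name__ == "__main__":
    proposals = [
        Candidate("reach_left", safety_confidence=0.9, advantage=0.2),
        Candidate("push_hard", safety_confidence=0.3, advantage=0.9),
        Candidate("reach_right", safety_confidence=0.8, advantage=0.5),
    ]
    chosen = select_action(proposals)
    print(chosen.name if chosen else "no safe action; resample")
```

In this sketch `push_hard` has the highest advantage but is filtered out by the safety check, so the gate falls back to the best remaining candidate. Rejection happens before any rollout or physical execution would be spent on the action.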
Published as "Pre-VLA: Preemptive Runtime Verification for Reliable Vision-Language-Action and World-Model Rollouts", arXiv:2605.22446