Do robot arms complete tasks safely, or just recklessly?
Jialiang Fan, Weizhe Xu, Oleg Sokolsky, Insup Lee, Fanxin Kong
May 30, 2026
SafeVLA-Bench evaluates whether robot manipulation policies execute tasks safely, not just whether they reach the goal. The authors add formal safety checks, written as Signal Temporal Logic (STL) specifications, to existing benchmarks and measure both unsafe successes and violation severity. Tests of nine policies on LIBERO and RoboCasa-365 show that high task completion masks serious problems: excessive contact, knocked-over objects, and self-collision. The code and evaluation framework have been released.
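To make the two metrics concrete, here is a minimal sketch of how a Signal Temporal Logic check could score a rollout. It is not the SafeVLA-Bench implementation. The `Episode` type, the `contact_force` field, the 10.0 force limit, and the reading of "unsafe success" as "the task succeeded but a safety spec was violated" are all assumptions made for illustration. The spec used is the simple STL formula "always (contact force <= limit)", whose robustness over a finite trace is the smallest margin to the limit.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


def always_below(signal: Sequence[float], limit: float) -> float:
    """Robustness of the STL formula G(signal <= limit) on a finite trace.

    A positive value means the spec holds with that margin; a negative
    value means it is violated, and its magnitude is the worst overshoot.
    """
    return min(limit - x for x in signal)


@dataclass
class Episode:
    # Hypothetical rollout record; field names are illustrative only.
    success: bool               # did the policy reach the task goal?
    contact_force: List[float]  # per-step contact force readings


def summarize(episodes: Sequence[Episode], force_limit: float) -> Dict[str, float]:
    """Report success rate, unsafe-success rate, and mean violation severity."""
    if not episodes:
        raise ValueError("need at least one episode")
    # Robustness values of successful episodes only.
    rob_success = [
        always_below(e.contact_force, force_limit) for e in episodes if e.success
    ]
    # Successful episodes whose robustness is negative violated the spec.
    unsafe = [r for r in rob_success if r < 0]
    severity = sum(-r for r in unsafe) / len(unsafe) if unsafe else 0.0
    return {
        "success_rate": len(rob_success) / len(episodes),
        "unsafe_success_rate": len(unsafe) / len(episodes),
        "mean_violation_severity": severity,
    }


if __name__ == "__main__":
    demo = [
        Episode(True, [1.0, 2.0, 3.0]),   # safe success
        Episode(True, [1.0, 12.0, 4.0]),  # success, but exceeds the limit
        Episode(False, [20.0]),           # failure, not counted as a success
        Episode(True, [5.0, 5.0]),        # safe success
    ]
    print(summarize(demo, force_limit=10.0))
```

Using the signed robustness value rather than a pass/fail flag is what lets one number capture both whether a rollout was unsafe and how badly, which is the distinction a plain success rate hides.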