Can language models safely handle delicate robot touch tasks?

Vision-language-action (VLA) models excel at understanding what a robot should do, but they output commands too slowly for precise contact tasks such as inserting connectors. PaCo-VLA wraps the VLA in a high-frequency "passivity shield" that treats the model's outputs as suggestions (compliance targets, not direct motor commands) and enforces passivity, a limit on the energy the controller can inject, to prevent damage. In real connector-insertion trials, the framework achieved zero safety violations while outperforming unshielded baselines, which suggests that language models can handle delicate manipulation when wrapped in physics-aware guards.
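
To make the shielding idea concrete, here is a minimal sketch of an energy-budget filter for a single axis. It is not PaCo-VLA's implementation: the class name `PassivityShield`, its parameters, and every numeric value are assumptions chosen for illustration. The slow model supplies a stiffness and a target position, and the fast loop scales back the resulting spring force whenever it would inject more energy than the remaining budget allows.

```python
from dataclasses import dataclass


@dataclass
class PassivityShield:
    """Hypothetical 1-D energy-budget filter (illustrative, not the paper's code)."""

    energy_budget: float   # joules the controller may still inject
    max_budget: float      # upper cap on the stored budget
    damping: float = 20.0  # N*s/m, assumed value

    def filter(self, stiffness, target, position, velocity, dt):
        """Turn a compliance target into a force that respects the energy budget."""
        spring_force = stiffness * (target - position)
        damping_force = -self.damping * velocity

        # Energy the spring term would inject during this step (positive = injecting).
        requested = spring_force * velocity * dt
        # Energy dissipated by damping is credited back to the budget.
        available = self.energy_budget + self.damping * velocity**2 * dt

        if requested > available:
            # Scale the spring force so it injects exactly what is available.
            spring_force *= available / requested
            requested = available

        self.energy_budget = min(available - requested, self.max_budget)
        return spring_force + damping_force


if __name__ == "__main__":
    shield = PassivityShield(energy_budget=0.01, max_budget=0.01)
    # The slow model suggests a stiff push toward a target 1 cm away.
    force = shield.filter(stiffness=1000.0, target=0.01,
                          position=0.0, velocity=0.1, dt=0.001)
    print(force, shield.energy_budget)
```

In this sketch the model never commands force directly. It only proposes `stiffness` and `target`, and the filter decides at every control step how much of that proposal is applied. Once the budget is exhausted and nothing is being dissipated, the spring term is scaled to zero, so the controller cannot keep pushing into a jammed connector.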