Can robots learn to fix bad advice without rewriting the rulebook?
Luzhe Sun, Jingtian Ji, Haoran Chen, Jiawei Zhou, Matthew R. Walter
June 4, 2026
Robots often inherit flawed policies from pretrained models or human operators. GLOVES adapts these actions on the fly: it learns a flow model that transports non-expert actions toward an expert distribution, then uses reverse-flow scoring to decide which actions to correct and which to trust. The result is improved task success that preserves the agent's original intent, with training that needs only minimal expert data. Code and demos are released.
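To make the correct-or-trust idea concrete, here is a minimal toy sketch in Python. It is not the GLOVES implementation. The learned flow is replaced by a hand-written one-dimensional velocity field, and the scoring rule (how far the reverse flow displaces an action) is only one plausible reading of "reverse-flow scoring". The names `EXPERT_ACTION`, `RATE`, `velocity`, `integrate`, `reverse_flow_score`, `adapt`, and the threshold value are all hypothetical.

```python
# Toy illustration only: every constant and function here is an assumption,
# not something taken from the paper or its released code.

EXPERT_ACTION = 1.0  # hypothetical mode of the expert action distribution
RATE = 2.0           # hypothetical contraction rate of the toy velocity field


def velocity(action):
    """Stand-in for a learned flow: pulls an action toward EXPERT_ACTION."""
    return RATE * (EXPERT_ACTION - action)


def integrate(action, direction, steps=50):
    """Euler-integrate the flow over unit time, forward (+1) or reverse (-1)."""
    dt = 1.0 / steps
    for _ in range(steps):
        action = action + direction * dt * velocity(action)
    return action


def reverse_flow_score(action):
    """Assumed score: displacement of the action under the reverse flow.

    Actions already close to the expert mode barely move, so they score low.
    """
    return abs(integrate(action, -1) - action)


def adapt(action, threshold=0.5):
    """Correct an action with the forward flow only if its score is high.

    Returns (action_to_execute, was_corrected).
    """
    if reverse_flow_score(action) > threshold:
        return integrate(action, +1), True
    return action, False


if __name__ == "__main__":
    for a in (1.01, -1.0):
        print(a, adapt(a))
```

In this toy, an action that is already near the expert mode passes through unchanged, while a distant action is moved most of the way toward it. A real system would replace `velocity` with a trained network over the robot's action space and would choose the threshold empirically.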