cs.RO

Can robots learn to fix bad advice without rewriting the rulebook?

Luzhe Sun, Jingtian Ji, Haoran Chen, Jiawei Zhou, Matthew R. Walter

June 4, 2026

Robots often inherit flawed policies from pretrained models or human operators. GLOVES adapts the actions these policies produce on the fly, without updating the policy itself. It learns a flow model that transports non-expert actions toward an expert distribution, then uses reverse-flow scoring to decide which actions to correct and which to trust. The authors report improved task success while preserving the agent's original intent, with training that requires only minimal expert data. Code and demos have been released.
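To make the correct-or-trust idea concrete, here is a toy sketch in Python. It is not the authors' implementation: the velocity field is a hand-written stand-in for a learned flow model, and the gating score (how far the flow would move an action) is a simple substitute for the paper's reverse-flow scoring, whose details this summary does not give. The names `velocity`, `transport`, `adapt`, `gain`, `horizon`, and `threshold` are all hypothetical.

```python
import numpy as np


def velocity(action, expert_mean, gain=1.0):
    """Hypothetical velocity field pointing from the action toward the expert mean.

    A real system would use a learned flow model here; this is an assumption.
    """
    return gain * (expert_mean - action)


def transport(action, expert_mean, steps=20, horizon=3.0):
    """Euler-integrate the toy flow for `horizon` units of flow time."""
    x = np.asarray(action, dtype=float).copy()
    dt = horizon / steps
    for _ in range(steps):
        x = x + dt * velocity(x, expert_mean)
    return x


def adapt(action, expert_mean, threshold=0.5):
    """Return (action_to_execute, was_corrected).

    The score is the distance the flow would move the action. This is a
    stand-in gate, not the paper's reverse-flow score.
    """
    action = np.asarray(action, dtype=float)
    moved = transport(action, expert_mean)
    score = np.linalg.norm(moved - action)
    if score > threshold:
        return moved, True   # far from expert behavior: correct it
    return action, False     # already close: trust the original action


expert_mean = np.array([1.0, 0.0])
near_expert = np.array([0.95, 0.05])
far_from_expert = np.array([-1.0, 0.5])

kept, kept_flag = adapt(near_expert, expert_mean)
fixed, fixed_flag = adapt(far_from_expert, expert_mean)
```

In this toy setup the action near the expert mean passes through unchanged, while the distant one is replaced by its transported version. The base policy that produced the actions is never modified, which mirrors the "without policy updates" framing.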
Published as "Flow-based Policy Adaptation without Policy Updates", arXiv:2606.06461