cs.LG

What if robots learned from every moment, not just successes?

Michael Matthews, Matthew Jackson, Michael Beukman, Thomas Foster, Alistair Letcher, Scott Fujimoto, Cédric Colas, Jakob Foerster

May 22, 2026

Goal-conditioned agents typically waste most of their experience, because each transition updates the agent only toward the goal that was commanded when it was collected. This work enables "all-goals learning", in which every transition is used to improve performance on every possible goal, by having a single neural network jointly output values and actions for all goals in parallel. On Craftax environments, the proposed method, LEO, substantially outperforms competing methods while running 250× faster than naive relabelling; it also matches or beats existing methods on continuous control. The code has been released.
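To make the idea of an all-goals update concrete, here is a minimal tabular sketch. It is not the paper's LEO implementation: it replaces the neural network with a lookup table and assumes a toy 1-D chain where goals are states, the reward is 1 when the next state equals the goal, and reaching the goal ends the episode. All names (`all_goals_update`, `make_q`, `greedy_action`, and so on) are hypothetical and chosen only for illustration.

```python
# Hypothetical tabular sketch of an "all-goals" update; not the paper's LEO code.
# Assumptions: a 1-D chain of N_STATES states, goals are states, reward is 1
# when the next state equals the goal, and reaching the goal ends the episode.

N_STATES = 5
ACTIONS = (-1, +1)   # move left / move right
GAMMA = 0.9          # discount factor
ALPHA = 0.5          # learning rate


def make_q():
    # q[goal][state][action_index]: one table holding values for every goal.
    return [[[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
            for _ in range(N_STATES)]


def step(state, action_index):
    # Deterministic chain dynamics, clipped at both ends.
    return min(max(state + ACTIONS[action_index], 0), N_STATES - 1)


def all_goals_update(q, state, action_index, next_state):
    # A single transition updates the estimate for every goal, not only the
    # goal the agent was commanded to reach when the transition was collected.
    for goal in range(N_STATES):
        reached = next_state == goal
        reward = 1.0 if reached else 0.0
        bootstrap = 0.0 if reached else GAMMA * max(q[goal][next_state])
        target = reward + bootstrap
        q[goal][state][action_index] += ALPHA * (target - q[goal][state][action_index])


def train(sweeps=50):
    # Sweep over every (state, action) pair; each transition feeds all goals.
    q = make_q()
    for _ in range(sweeps):
        for state in range(N_STATES):
            for action_index in range(len(ACTIONS)):
                all_goals_update(q, state, action_index, step(state, action_index))
    return q


def greedy_action(q, state, goal):
    # Pick the action with the highest value for the requested goal.
    values = q[goal][state]
    return ACTIONS[values.index(max(values))]
```

After `q = train()`, calling `greedy_action(q, 0, 4)` selects the move toward state 4, even though no goal was ever commanded during data collection. The loop over goals stands in for what the summary describes as a single network producing outputs for all goals in parallel; in a neural version that loop would presumably become one batched forward pass.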
Published as "Goal-Conditioned Agents that Learn Everything All at Once", arXiv:2605.23551