What if robots learned from every moment, not just successes?
Michael Matthews, Matthew Jackson, Michael Beukman, Thomas Foster, Alistair Letcher, Scott Fujimoto, Cédric Colas, Jakob Foerster
May 22, 2026
Goal-conditioned agents typically waste most of their observations by updating only toward the commanded goal. This work enables "all-goals learning", in which every transition is used to improve performance on every possible objective, by having a single neural network jointly output values and actions for all goals in parallel. On Craftax environments, the resulting method, LEO, dramatically outperforms competitors while running 250× faster than naive relabelling; it also matches or beats existing methods on continuous control. Code is released.
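To make the all-goals idea concrete, here is a minimal tabular sketch in Python. It is not the paper's LEO method, which uses a single neural network. It only illustrates how one transition can update the value estimate for every goal at once. The chain world, the sparse reach-the-goal reward, and all names (`all_goals_update`, `train`, `greedy`) are assumptions made for illustration.

```python
# Illustrative tabular sketch of all-goals learning (NOT the paper's LEO method).
# Assumptions: a 1-D chain world where every state is also a possible goal,
# and the reward is 1 when a transition lands on the goal, else 0.

N_STATES = 5    # states 0..4
N_ACTIONS = 2   # 0 = step left, 1 = step right
GAMMA = 0.9     # discount factor
ALPHA = 0.5     # learning rate


def step(state, action):
    """Deterministic chain dynamics, clipped at both ends."""
    delta = 1 if action == 1 else -1
    return min(max(state + delta, 0), N_STATES - 1)


def make_q():
    """Q-table indexed as q[goal][state][action], initialised to zero."""
    return [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(N_STATES)]


def all_goals_update(q, state, action, next_state):
    """Use ONE transition to update the Q-value for EVERY goal."""
    for goal in range(N_STATES):
        reached = next_state == goal
        reward = 1.0 if reached else 0.0
        # Reaching the goal ends the episode for that goal, so no bootstrap.
        bootstrap = 0.0 if reached else GAMMA * max(q[goal][next_state])
        td_error = reward + bootstrap - q[goal][state][action]
        q[goal][state][action] += ALPHA * td_error


def train(sweeps):
    """Sweep over every (state, action) pair, updating all goals each time."""
    q = make_q()
    for _ in range(sweeps):
        for state in range(N_STATES):
            for action in range(N_ACTIONS):
                all_goals_update(q, state, action, step(state, action))
    return q


def greedy(q, goal, state):
    """Greedy action for a given goal and state."""
    return max(range(N_ACTIONS), key=lambda a: q[goal][state][a])


if __name__ == "__main__":
    q = train(50)
    print(greedy(q, goal=4, state=0))  # action toward the right end
    print(greedy(q, goal=0, state=4))  # action toward the left end
```

In this sketch the loop over goals is explicit. The summary describes a network that produces values and actions for all goals in parallel, which is where the reported speed-up over naive relabelling would come from.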