cs.AI

Do language agents actually learn from past tasks?

Yiheng Shu, Bernal Jiménez Gutiérrez, Saisri Padmaja Jonnalagedda, Yuguang Yao, Huan Sun, Yu Su

June 1, 2026

Language agents solve tasks one at a time but rarely learn from them. This work introduces AgentCL, a benchmark with controlled task streams designed so that earlier solutions genuinely transfer to later ones, plus MemProbe, a memory system that stores and filters agent insights. Experiments on coding and research tasks show that existing memory designs barely improve performance on naive streams, while the controlled streams expose their weaknesses. The results highlight that agents need fundamentally better ways to learn new things without forgetting old ones.
Published as "AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents", arXiv:2606.02461