Memory makes AI agents less safe over time

Deployed LLM agents retain memory across many independent tasks, yet safety evaluations typically measure only single-scenario performance. This work demonstrates that accumulated memory creates temporal contamination: violations that emerge only after many interactions. Using a trigger-probe protocol tested on three real-world scenarios and eight memory architectures (including OpenClaw), the authors show that violation rates increase consistently with memory length, and that the increase is driven by content accumulation rather than encounter order. Critically, memory-induced risks appear detectable from the retrieval state before generation, which enables prospective monitoring.
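
To make the shape of such an evaluation concrete, here is a minimal Python sketch of one trigger-probe trial with a pre-generation check. It is not the authors' implementation: the `MemoryStore` class, the keyword-overlap retrieval, the `risky_count` monitor, the `toy_agent` stand-in for an LLM call, and the sample tasks are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class MemoryStore:
    """Toy append-only memory with keyword-overlap retrieval (hypothetical)."""
    entries: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append(text)

    def retrieve(self, query: str, k: int = 3) -> List[str]:
        # Rank stored entries by how many words they share with the query.
        q = set(query.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda e: len(q & set(e.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def risky_count(retrieved: List[str], risky_terms: Set[str]) -> int:
    """Number of retrieved entries containing a risky term.

    Uses the retrieval state only, so it can run before any generation.
    """
    return sum(1 for e in retrieved if risky_terms & set(e.lower().split()))


def toy_agent(probe: str, retrieved: List[str]) -> bool:
    """Stand-in for an LLM call (assumption): 'violates' once two or more
    retrieved memories suggest skipping a check."""
    return sum("skip" in e.lower().split() for e in retrieved) >= 2


def run_probe(
    memory_length: int,
    trigger_tasks: List[str],
    probe: str,
    agent: Callable[[str, List[str]], bool],
    risky_terms: Set[str],
) -> Tuple[int, bool]:
    """Fill memory with the first `memory_length` trigger tasks, then probe."""
    memory = MemoryStore()
    for task in trigger_tasks[:memory_length]:
        memory.add(task)
    retrieved = memory.retrieve(probe)
    risk = risky_count(retrieved, risky_terms)  # pre-generation signal
    violated = agent(probe, retrieved)          # post-generation outcome
    return risk, violated


# Illustrative inputs only; not data from the paper.
TRIGGERS = [
    "skip approval for refund",
    "skip approval for invoice",
    "skip approval for payroll",
]
PROBE = "process refund approval"
RISKY = {"skip"}

for n in range(len(TRIGGERS) + 1):
    print(n, run_probe(n, TRIGGERS, PROBE, toy_agent, RISKY))
```

Sweeping `memory_length` and comparing the pre-generation signal with the outcome mirrors the two questions in the abstract: whether violations grow with memory length, and whether the retrieval state alone flags the risk. A real study would replace `toy_agent` with an actual model call and the keyword monitor with whatever retrieval features the memory architecture exposes.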