cs.AI

How attackers secretly poison AI agents' memories through chat

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

May 28, 2026

LLM agents now store persistent memories to handle long-running tasks, and that memory has become an attack target. The researchers built MemPoison, an attack that plants hidden triggers in an agent's memory through ordinary conversation; when a trigger is later activated, the agent behaves maliciously. Previous attacks assume the attacker can write to memory directly. MemPoison instead works against real memory systems that filter and rewrite information before storing it, using techniques such as semantic binding and entity disguise to slip past those defenses. In tests across multiple agent types, the attack reached 95% success rates, and existing defenses proved fundamentally inadequate.
Published as "Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction", arXiv:2605.29960