cs.CL

Can AI remember people in group chats?

Olukunle Owolabi

May 18, 2026

Memory systems designed for one-on-one conversations break down in group chats, where facts must be tied to shared history, group norms must be distinguished from individual exceptions, and membership changes must be tracked accurately. SocialMemBench provides the first systematic evaluation of this setting: 1,031 QA pairs across five types of social groups (including close friends, family, and communities) and multiple group sizes. It identifies five distinct failure modes, ranging from single-stream conflation to entity merging at scale. An evaluation of four open-source memory frameworks (Mem0, LangMem, Graphiti, Cognee) shows that they cluster at 0.12–0.18 accuracy, far below both retrieval baselines and human reasoning performance. Even Gemini 2.5 Flash with the full conversation in context scores only 0.72 on small networks.
Published as "SocialMemBench: Are AI Memory Systems Ready for Social Group Settings?", arXiv:2605.17789