← Back to Machine Learning
cs.LG

Preventing AI agents from leaking secrets through shared memory

Sadia Asif, Mohammad Mohammadi Amiri, Momin Abbas, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy

May 21, 2026

When AI agents coordinate through shared transformer key-value caches, they inadvertently expose sensitive information—not through text, but through the hidden representations themselves. LCGuard learns to transform these cache artifacts before sharing, using adversarial training where one model tries to reconstruct leaked secrets while another blocks it. The approach cuts information leakage substantially while maintaining task performance across multiple model families and benchmarks.
Published as LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems arXiv:2605.22786
Read the original paper →