← Back to Machine Learning cs.LG
Preventing AI agents from leaking secrets through shared memory
Sadia Asif, Mohammad Mohammadi Amiri, Momin Abbas, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy
May 21, 2026
When AI agents coordinate through shared transformer key-value caches, they inadvertently expose sensitive information—not through text, but through the hidden representations themselves. LCGuard learns to transform these cache artifacts before sharing, using adversarial training where one model tries to reconstruct leaked secrets while another blocks it. The approach cuts information leakage substantially while maintaining task performance across multiple model families and benchmarks.
Read the original paper →