Why AI knowledge edits fail on slightly different images or phrasings
Haoyuan Wang, Xiaohao Liu, Jiajie Su, Jianmao Xiao, Chaochao Chen
May 22, 2026
Multimodal language models can be edited to correct facts, but the changes often fail to generalize: an edit may not hold when the same concept appears in a different image or is phrased differently. This work treats robust editing as a group problem: identify semantically equivalent inputs and keep predictions consistent across them. Two techniques do the work. The first generates adversarial variants in the latent space to find brittle regions; the second enforces low-rank alignment at the edit layer to stabilize them. The reported results show substantial improvements in generalization without degrading existing knowledge.
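To make the two steps concrete, here is a minimal NumPy sketch on a toy linear "edit layer". It is an illustration under strong assumptions, not the paper's implementation: the layer is a single matrix `W`, the group of equivalent inputs is a handful of nearby vectors, and the function names (`adversarial_variants`, `low_rank_alignment`, `group_error`) are hypothetical. Adversarial variants come from one normalized gradient-ascent step in latent space, and alignment is a rank-truncated least-squares update.

```python
import numpy as np

def group_error(W, keys, target):
    """Total distance between the layer's outputs for a group and the edit target."""
    return float(np.linalg.norm(keys @ W.T - target))

def adversarial_variants(W, keys, target, eps=0.1):
    """Shift each latent key by one normalized gradient-ascent step on ||W k - target||^2."""
    residual = keys @ W.T - target                 # (n, d_out)
    grad = 2.0 * residual @ W                      # (n, d_in), gradient w.r.t. each key
    norms = np.linalg.norm(grad, axis=1, keepdims=True)
    return keys + eps * grad / np.maximum(norms, 1e-12)

def low_rank_alignment(W, keys, target, rank=2):
    """Rank-limited update to W that pulls every key in the group toward the target."""
    residual = target - keys @ W.T                 # (n, d_out)
    # Least-squares update satisfying delta @ keys.T = residual.T when keys has full row rank.
    delta = residual.T @ np.linalg.pinv(keys.T)    # (d_out, d_in)
    # Keep only the top `rank` singular directions of the update.
    U, S, Vt = np.linalg.svd(delta, full_matrices=False)
    return W + (U[:, :rank] * S[:rank]) @ Vt[:rank]

# Toy setup (all sizes and values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
d_in, d_out, n = 16, 8, 4
W = rng.normal(size=(d_out, d_in))
# A group of "semantically equivalent" inputs: small variations around one latent point.
keys = rng.normal(size=(1, d_in)) + 0.05 * rng.normal(size=(n, d_in))
target = rng.normal(size=d_out)

adv = adversarial_variants(W, keys, target)
group = np.vstack([keys, adv])                     # originals plus brittle variants
W_edit = low_rank_alignment(W, group, target, rank=2)

before = group_error(W, group, target)
after = group_error(W_edit, group, target)
print(f"group error before: {before:.3f}, after: {after:.3f}")
```

Including the adversarial variants in the group means the update is fitted against the hardest nearby inputs as well as the observed ones, which is the intuition behind treating the edit as a group problem. A real multimodal model would need the perturbation and the update applied to the chosen layer's actual hidden states.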