cs.AI

Why AI knowledge edits fail on slightly different images or phrasings

Haoyuan Wang, Xiaohao Liu, Jiajie Su, Jianmao Xiao, Chaochao Chen

May 22, 2026

Multimodal language models can be edited to correct facts, but the edits often fail to generalize: they break when the same concept appears in a different image or is phrased differently. This work treats robust editing as a group problem, in which semantically equivalent inputs are identified and predictions are kept consistent across them. It relies on two techniques: generating adversarial variants in the latent space to find brittle regions, then enforcing low-rank alignment at the edit layer to stabilize them. The reported results show substantial improvements in generalization without degrading existing knowledge.
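The two steps can be illustrated with a toy sketch. This is not the paper's implementation: it assumes the edit layer is a single linear map `W`, uses the top right singular vectors of `W` as a stand-in for adversarial latent directions, and uses a projection as a stand-in for low-rank alignment. The names `adversarial_directions` and `align_low_rank` and all dimensions are hypothetical.

```python
import numpy as np


def adversarial_directions(W, k):
    """Return k unit directions in latent space that change W @ h the most.

    Assumption: for a purely linear layer, the worst-case perturbation of a
    fixed norm is a top right singular vector, so no iterative attack is run.
    """
    _, _, vt = np.linalg.svd(W, full_matrices=False)
    return vt[:k].T  # shape (d_in, k), orthonormal columns


def align_low_rank(W, U):
    """Apply a rank-<=k update so the layer ignores the subspace spanned by U."""
    return W - (W @ U) @ U.T


rng = np.random.default_rng(0)
d_in, d_out, k, eps = 16, 8, 2, 0.5  # illustrative sizes only
W = rng.normal(size=(d_out, d_in))   # toy stand-in for the edit layer
h = rng.normal(size=d_in)            # toy latent for one input

U = adversarial_directions(W, k)
W_aligned = align_low_rank(W, U)

# Perturb the latent along the most brittle direction and compare output drift.
delta = eps * U[:, 0]
drift_before = np.linalg.norm(W @ (h + delta) - W @ h)
drift_after = np.linalg.norm(W_aligned @ (h + delta) - W_aligned @ h)
print(f"drift before: {drift_before:.4f}, after: {drift_after:.2e}")
```

In this toy setting the projection removes the drift along the chosen directions entirely, but it also changes the layer's output on the unperturbed latent. A real method would need to restrict the subspace to directions that are genuinely meaning-preserving, which is what the semantic-equivalence grouping described above is meant to provide.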
Published as "Beyond Binary Edits: Robust Multimodal Knowledge Editing with Adversarial Subspace Alignment" (arXiv:2605.23780)