← Back to Computation and Language cs.CL
How to fix facts in vision-language models without breaking everything else?
Leijiang Gu, Zhen Zeng, Feng Li, Xinjian Gao, Zenglin Shi
May 28, 2026
When you fix a factual error in a multimodal language model, the change often stays trapped in that single example or accidentally corrupts unrelated knowledge. LDKE solves this by identifying which specific layers control each fact and separating target-relevant features from irrelevant ones. The method uses fast localization to find critical layers and a classifier to route information properly, achieving both broad generalization and high precision across multiple benchmarks.
Read the original paper →