Making images unlearnable to prevent unauthorized model training
Chengshuai Zhao, Zhen Tan, Dawei Li, Zhiyuan Yu, Huan Liu
May 14, 2026
MMGuard protects multimodal data from unauthorized fine-tuning by embedding human-imperceptible perturbations that exploit how large vision-language models (LVLMs) learn. The perturbations create optimization shortcuts that cause a model to overfit to the noise during training, which degrades its performance on clean data at inference time. A cross-modal binding disruption technique further strengthens the defense by enforcing spurious correlations between the noise and the targets. Evaluated against nine open-source LVLMs across six datasets, MMGuard provides protection under white-box, gray-box, and black-box threat models, and its perturbations transfer across different models via ensemble learning.
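The "optimization shortcut" idea can be illustrated with a toy sketch of error-minimizing noise, a general technique from the unlearnable-examples literature. This is not MMGuard's algorithm: the summary does not give its objective, and the linear classifier, the function names, and the parameters `eps`, `step`, and `iters` below are all assumptions made for illustration. The sketch searches for a small bounded perturbation that lowers the training loss on an example, so that a model has little left to learn from the underlying clean features.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(w: list[float], x: list[float], y: int) -> float:
    """Binary cross-entropy of a linear classifier on one example (y is 0 or 1)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def error_minimizing_noise(
    w: list[float],
    x: list[float],
    y: int,
    eps: float = 0.05,   # hypothetical L-infinity budget (keeps the noise small)
    step: float = 0.01,  # hypothetical step size
    iters: int = 20,     # hypothetical number of descent steps
) -> list[float]:
    """Projected sign-gradient descent on the loss with respect to the input.

    Unlike an adversarial attack, the noise is chosen to *decrease* the loss,
    making the perturbed example trivially easy to fit.
    """
    delta = [0.0] * len(x)
    for _ in range(iters):
        x_pert = [xi + di for xi, di in zip(x, delta)]
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_pert)))
        # Gradient of the cross-entropy loss with respect to the input.
        grad = [(p - y) * wi for wi in w]
        # Step against the gradient sign, then clip back into the eps-ball.
        delta = [
            max(-eps, min(eps, di - step * ((g > 0) - (g < 0))))
            for di, g in zip(delta, grad)
        ]
    return delta


if __name__ == "__main__":
    w = [0.5, -1.0, 2.0]   # toy model weights (illustrative values)
    x = [0.2, 0.4, -0.1]   # toy "clean" example
    y = 1
    delta = error_minimizing_noise(w, x, y)
    x_protected = [xi + di for xi, di in zip(x, delta)]
    print("loss on clean example:    ", bce_loss(w, x, y))
    print("loss on perturbed example:", bce_loss(w, x_protected, y))
```

In an LVLM setting the same idea would presumably be applied to image pixels, with the loss coming from the model being fine-tuned. The summary does not spell out those details, nor how the cross-modal binding disruption term enters the objective, so neither is modeled here.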