cs.LG

Hiding malware in AI models that only activates after compression

Xiaohua Zhan, Kazuki Egashira, Robin Staab, Mark Vero, Martin Vechev

May 14, 2026

LLM quantization reduces memory requirements but creates a hidden attack surface: a model can appear harmless at full precision when released, then behave maliciously once end users compress it. Previous attacks worked only against simple quantization schemes and failed against widely used methods such as AWQ, GPTQ, and GGUF. This work exploits a property common to modern quantization methods: injecting large outlier values into specific weight blocks causes the surrounding weights to collapse to zero during quantization, producing predictable, targeted behavioral changes. Evaluated across three attack scenarios, the method achieves high success rates where all prior attacks failed, showing that quantization security risks extend to the most popular deployment pipelines.
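The collapse-to-zero effect can be illustrated with a toy example. The sketch below is not the paper's attack and does not reproduce AWQ, GPTQ, or GGUF; it assumes a simplified symmetric 4-bit round-to-nearest scheme in which each block shares one scale derived from its largest absolute value. The function name `quantize_block`, the block contents, and the outlier value 8.0 are all hypothetical choices made for illustration.

```python
def quantize_block(block, bits=4):
    """Toy symmetric absmax quantization of one weight block.

    Assumption: one shared scale per block, round-to-nearest,
    values returned in dequantized (float) form.
    """
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit
    absmax = max(abs(w) for w in block)
    if absmax == 0:
        return list(block)
    scale = absmax / qmax                 # step size shared by the block
    return [round(w / scale) * scale for w in block]


# A hypothetical block of small, ordinary weights.
block = [0.02, -0.03, 0.05, -0.01, 0.04, -0.05, 0.03, 0.01]
clean = quantize_block(block)

# Same block with one large outlier injected (hypothetical value).
poisoned = list(block)
poisoned[0] = 8.0
attacked = quantize_block(poisoned)

# Count how many weights end up exactly zero after quantization.
zeroed_clean = sum(1 for w in clean if w == 0.0)
zeroed_attacked = sum(1 for w in attacked[1:] if w == 0.0)

print("zeroed without outlier:", zeroed_clean)
print("zeroed with outlier:   ", zeroed_attacked)
```

In this toy setting the outlier inflates the shared scale so far that every other weight in the block falls below half a quantization step and rounds to zero, while the outlier itself survives. Real quantizers add calibration, grouping, and error compensation, so the sketch shows only the basic mechanism the summary describes.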
Published as "Widening the Gap: Exploiting LLM Quantization via Outlier Injection", arXiv:2605.15152