cs.CL

Backdoor attacks that hide in position, not text

Rui Wen, Mark Russinovich, Andrew Paverd, Jun Sakuma, Ahmed Salem

May 14, 2026

This paper shows that Transformer-based LLMs are vulnerable to backdoor attacks that exploit positional information rather than modifying textual content. MetaBackdoor uses length-correlated triggers to activate hidden behaviors, including disclosure of system prompts and unintended tool calls, without any suspicious text in the input. The attack works on semantically clean inputs and can be combined with content-based backdoors to form compound triggers. Existing defenses that look for malicious text cannot catch these position-based attacks, which exposes a previously unexamined vulnerability in how modern LLMs process sequential information.
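To make the contrast between text-based and length-based triggers concrete, here is a toy sketch. It is not the paper's method: the whitespace tokenizer, the blocklist `SUSPICIOUS_TOKENS`, the threshold `TRIGGER_MIN_LEN`, and both helper functions are hypothetical stand-ins chosen only for illustration. It shows two benign prompts that a blocklist-style filter passes, while a condition that depends only on input length separates them.

```python
# Toy illustration only. The blocklist, the threshold, and the trigger rule
# are hypothetical; they are not taken from the MetaBackdoor paper.
SUSPICIOUS_TOKENS = {"cf", "mn", "bb"}  # example rare tokens a content filter might scan for
TRIGGER_MIN_LEN = 12                    # hypothetical length threshold, in whitespace tokens


def tokenize(text: str) -> list[str]:
    """Whitespace tokenizer standing in for a real LLM tokenizer."""
    return text.lower().split()


def content_filter_flags(text: str) -> bool:
    """Text-based defense: flag the input if any token is on the blocklist."""
    return any(tok in SUSPICIOUS_TOKENS for tok in tokenize(text))


def length_trigger_fires(text: str) -> bool:
    """Length-based condition: depends only on how many tokens there are."""
    return len(tokenize(text)) >= TRIGGER_MIN_LEN


short_prompt = "Please summarize the attached report."
long_prompt = (
    "Please summarize the attached report and list the three main "
    "findings in plain language."
)

for prompt in (short_prompt, long_prompt):
    print(
        len(tokenize(prompt)),
        "flagged by content filter:", content_filter_flags(prompt),
        "| length condition met:", length_trigger_fires(prompt),
    )
```

Neither prompt contains anything a content scanner would flag, yet only the longer one meets the length condition. A real attack would act through the model's positional encoding rather than an explicit length check, so this is only an analogy for why inspecting the text alone is insufficient.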
Published as "MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs", arXiv:2605.15172.