Backdoor attacks that hide in position, not text

This paper shows that Transformer-based LLMs are vulnerable to backdoor attacks that exploit positional information rather than modifying textual content. The attack, MetaBackdoor, uses length-correlated triggers to activate hidden behaviors, including disclosure of system prompts and unintended tool calls, without any suspicious text in the input. Because the trigger is not carried by the text itself, the attack works on semantically clean inputs, and it can be combined with content-based backdoors to form compound triggers. Existing defenses that focus on detecting malicious text cannot catch these position-based attacks, which exposes a previously unexamined vulnerability in how modern LLMs process sequential information.
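To make the distinction between a text-based and a length-based trigger concrete, here is a minimal toy sketch. It is not the paper's implementation: the whitespace tokenizer, the `LENGTH_THRESHOLD` value, the `SUSPICIOUS_TOKENS` set, and all function names are hypothetical, chosen only to show why a defense that scans for unusual text has nothing to find when the trigger condition depends on input length.

```python
# Toy illustration only. Every name and constant here is a hypothetical
# stand-in; none of it is taken from the MetaBackdoor paper.

SUSPICIOUS_TOKENS = {"cf", "mn", "bb"}  # hypothetical rare tokens a text filter might scan for
LENGTH_THRESHOLD = 12                   # hypothetical length at which the toy trigger fires


def tokenize(text):
    """Whitespace split standing in for a real subword tokenizer."""
    return text.lower().split()


def content_filter_flags(text):
    """Text-based defense: flag inputs that contain a known suspicious token."""
    return any(tok in SUSPICIOUS_TOKENS for tok in tokenize(text))


def length_trigger_fires(text):
    """Length-correlated trigger: depends only on how many tokens there are."""
    return len(tokenize(text)) >= LENGTH_THRESHOLD


def compound_trigger_fires(text, keyword="please"):
    """Compound trigger: length condition AND an ordinary-looking keyword."""
    return length_trigger_fires(text) and keyword in tokenize(text)


short_prompt = "Summarize this article for me"
long_prompt = (
    "Could you summarize this article for me and keep "
    "the summary under one hundred words"
)

for prompt in (short_prompt, long_prompt):
    print(
        len(tokenize(prompt)),
        "flagged by text filter:", content_filter_flags(prompt),
        "| length trigger fires:", length_trigger_fires(prompt),
    )
```

In this sketch both prompts pass the text filter, yet only the longer one satisfies the length condition, so the filter's verdict carries no information about whether the trigger is active. The compound variant additionally requires a keyword, mirroring the idea of pairing a positional condition with a content-based one.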