cs.CL

Can random text trick language models into wrong answers?

Pawel Batorski, Abtin Pourhadi, Jerzy Sarosiek, Przemyslaw Spurek, Paul Swoboda

May 28, 2026

LLMs respond not just to task instructions but also to semantically irrelevant text, or "spurious prompts." The researchers found that these seemingly random additions can improve task performance, sometimes outperforming carefully tuned prompts, and can also reliably steer models toward biased answers (always picking option A, returning even numbers) without any explicit instruction to do so. This vulnerability, observed in models from 0.8B to 27B parameters, suggests that LLMs exploit shallow statistical patterns rather than understanding task semantics.
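One way to make the steering claim concrete is to compare how often a model gives a particular answer with and without an irrelevant prefix. The sketch below is illustrative only and is not the authors' evaluation code: `answer_distribution`, `bias_shift`, `toy_model`, the prefix string, and the questions are all hypothetical names and data, and `toy_model` is a deterministic stand-in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Dict, Sequence


def answer_distribution(
    model: Callable[[str], str], questions: Sequence[str], prefix: str = ""
) -> Dict[str, float]:
    """Return the share of each answer when `prefix` is prepended to every question."""
    if not questions:
        raise ValueError("questions must not be empty")
    counts = Counter(model(prefix + q) for q in questions)
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}


def bias_shift(
    model: Callable[[str], str], questions: Sequence[str], prefix: str, target: str
) -> float:
    """Change in the share of `target` answers caused by adding `prefix`."""
    base = answer_distribution(model, questions).get(target, 0.0)
    steered = answer_distribution(model, questions, prefix).get(target, 0.0)
    return steered - base


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: an arbitrary token forces answer 'A'."""
    if "zxq" in prompt:
        return "A"
    return "ABCD"[len(prompt) % 4]


# Hypothetical multiple-choice questions; only their lengths matter to toy_model.
questions = ["Q1?", "Q22?", "Q333?", "Q4444?"]
shift = bias_shift(toy_model, questions, prefix="zxq lorem ", target="A")
print(f"Share of 'A' answers changed by {shift:+.2f}")
```

With a real model, `model` would wrap an inference call and parse the chosen option from the output; a large positive shift for one option across many unrelated prefixes would be the kind of bias the summary describes.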
Published as "Spurious Prompts: Can Irrelevant Prompts Steer Large Language Models?" (arXiv:2605.29678)