cs.CL

Why your AI simulation experiments might be misleading

Victoria Lin, Taedong Yun, Maja Matarić, John Canny, Arthur Gretton, Alexander D'Amour

May 20, 2026

LLMs trained on real-world data carry hidden biases that shift when you intervene experimentally. Asking an LLM-simulated user how they would respond to a treatment can subtly alter that user's implied demographics or preferences, which in turn biases your effect estimates. The authors show how to catch this problem using negative controls (outcomes that the treatment should not change) and how to fix it by specifying confounding variables upfront. This matters because LLM-based experiments are becoming a common way to test interventions cheaply at scale.
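To make the negative-control idea concrete, here is a minimal sketch of such a check. It is not the authors' procedure: the summary does not say which test they use, so a plain permutation test on the difference in means stands in for it. The function name `negative_control_check` and the "stated age" attribute are hypothetical, chosen only for illustration. The logic is that if an attribute the treatment cannot plausibly affect differs between simulated arms, the simulated populations themselves have shifted.

```python
import random
from statistics import mean


def negative_control_check(treated, control, n_permutations=2000, seed=0):
    """Permutation test on a negative-control outcome (illustrative sketch).

    treated, control: numeric negative-control values, one per simulated
    user in each arm (hypothetical example: the age each simulated user
    reports). The treatment should not be able to change these values.
    Returns (observed difference in means, two-sided permutation p-value).
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign arm labels and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical data: stated ages of simulated users in each arm.
treated_ages = [40 + i % 5 for i in range(50)]
control_ages = [30 + i % 5 for i in range(50)]
shift, p = negative_control_check(treated_ages, control_ages)
print(f"negative-control shift = {shift}, p = {p:.4f}")
```

A small p-value here would be a warning sign that the simulated "experiment" is behaving like an observational study, since the two arms differ on something the treatment should not touch. A large p-value does not prove the simulation is unbiased; it only means this particular negative control did not flag a shift.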
Published as "The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study" (arXiv:2605.20767).