Why your AI simulation experiments might be misleading

Victoria Lin, Taedong Yun, Maja Matarić, John Canny, Arthur Gretton, Alexander D'Amour

LLMs trained on real-world data carry hidden biases, and those biases can shift when you intervene experimentally. Asking an LLM-simulated user how they would respond to a treatment can subtly alter the simulated user's demographics or preferences, which then act as confounders and bias your effect estimates. The authors show how to catch this problem using negative controls (outcomes that the treatment should not change) and how to fix it by specifying the confounding variables upfront. This matters because LLM-based experiments are becoming a common way to test interventions cheaply and at scale.
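To make the negative-control idea concrete, here is a minimal sketch in Python. It is not the authors' procedure: it assumes a simple permutation test on a difference in means, and the function name `negative_control_check`, the choice of simulated age as the negative control, and the synthetic numbers are all hypothetical. The logic is that an attribute the treatment cannot plausibly change should look the same in both arms, so a clear difference suggests the simulation is shifting the simulated population.

```python
import random


def negative_control_check(control, treated, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    `control` and `treated` hold a negative-control outcome (something the
    treatment should not change) for each simulated user in each arm.
    Returns (observed_difference, p_value). Hypothetical helper, not from
    the paper.
    """
    rng = random.Random(seed)
    control = list(control)
    treated = list(treated)
    n_control = len(control)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = mean(treated) - mean(control)
    pooled = control + treated
    extreme = 0
    for _ in range(n_permutations):
        # Reassign arm labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Synthetic, made-up data: ages reported by simulated users in each arm.
# In a real check these would come from the LLM simulation itself.
data_rng = random.Random(1)
control_ages = [data_rng.gauss(40, 10) for _ in range(200)]
treated_ages = [data_rng.gauss(45, 10) for _ in range(200)]

diff, p = negative_control_check(control_ages, treated_ages)
print(f"age shift = {diff:.2f}, p = {p:.4f}")
```

A small p-value here would be a warning sign rather than a diagnosis: it says the two arms differ on something the treatment should not touch, not which confounder is responsible.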