Do AI chatbots actually know the news they report?
Mirac Suzgun, Emily Shen, Federico Bianchi, Alexander Spangher, Thomas Icard, Daniel E. Ho, Dan Jurafsky, James Zou
May 21, 2026
Researchers tested six commercial AI chatbots (including GPT-5, Claude, Gemini, and Grok) against 2,100 real news questions drawn from BBC coverage across six languages and regions. The systems aced multiple-choice questions (90%+ accuracy) but stumbled badly on free-response answers, and accuracy fell to 19–70% when questions contained false premises. The core finding: retrieval failures, not reasoning gaps, cause most errors. Every model also performs worst on Hindi, and the sources the models draw on show a strong English-language bias.