Why bigger language models hallucinate when they know the answer
Jewon Yeom, Jaewon Sok, Heejun Kim, Seonghyeon Park, Jeongjae Park, Taesup Kim
May 21, 2026
Large language models often hallucinate even when the correct answer is available in their internal representations. The authors analyzed Qwen and Llama models ranging from 0.8B to 72B parameters and found that hallucinations spike with scale not because knowledge is missing, but because instruction tuning causes models to spread probability mass across multiple surface forms of the correct answer rather than concentrating it on one. The same sharpening mechanism that makes models more helpful also makes them more confidently wrong.
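To make the surface-form effect concrete, here is a minimal toy sketch in Python. It is not the paper's method or data: the probabilities, the answer strings, and the helper names `top_surface_form` and `mass_by_answer` are all invented for illustration. It only shows how a correct answer can hold most of the total probability mass and still lose to a wrong answer when that mass is split across several phrasings.

```python
from collections import defaultdict


def top_surface_form(probs):
    """Return the single most likely output string (greedy choice)."""
    return max(probs, key=probs.get)


def mass_by_answer(probs, canonical):
    """Sum probability over all surface forms that map to the same answer."""
    totals = defaultdict(float)
    for form, p in probs.items():
        totals[canonical[form]] += p
    return dict(totals)


# Hypothetical output distribution for "What is the capital of France?".
# These numbers are made up; they are not taken from the paper.
probs = {
    "Paris": 0.22,
    "paris": 0.18,
    "The capital is Paris": 0.20,
    "Lyon": 0.30,
    "Marseille": 0.10,
}

# Hypothetical mapping from each surface form to the answer it expresses.
canonical = {
    "Paris": "paris",
    "paris": "paris",
    "The capital is Paris": "paris",
    "Lyon": "lyon",
    "Marseille": "marseille",
}

greedy = top_surface_form(probs)           # picks the wrong answer
totals = mass_by_answer(probs, canonical)  # mass per underlying answer
best = max(totals, key=totals.get)         # picks the correct answer

print("greedy surface form:", greedy)
print("mass per answer:", totals)
print("best answer after pooling:", best)
```

In this toy case the three phrasings of the correct answer together carry more mass than any wrong answer, yet no single phrasing beats the wrong string "Lyon", so a greedy pick by surface form returns the wrong answer.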