cs.AI

Why bigger language models remember facts better—and predictably so

Matthew L. Smith, Jonathan P. Shock, Samuel T. Segun, Iyiola E. Olatunji, Tegawendé F. Bissyandé

May 18, 2026

Large language models hallucinate confidently, but their ability to recall real facts follows a mathematical pattern. Researchers tested 38 models on over 8,900 scholarly references and found that factual accuracy scales predictably with model size and training-data frequency: a bigger model learning about uncommon topics performs like a smaller model learning about common ones. Accuracy follows a sigmoid curve in a combination of these two factors, suggesting that recall works like a signal-to-noise problem: more frequent topics cut through the noise, while larger models lower the noise floor.
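
The trade-off between model size and topic frequency can be illustrated with a minimal sketch. It assumes the two factors combine as a weighted sum of their logarithms before passing through the sigmoid; the summary above does not state the exact combination. The function name `recall_probability` and the coefficients `A_SIZE`, `B_FREQ`, and `C_BIAS` are hypothetical values chosen for illustration, not fitted values from the paper.

```python
import math

# Hypothetical coefficients for illustration only (not taken from the paper).
A_SIZE = 1.0    # assumed weight on log10(parameter count)
B_FREQ = 1.0    # assumed weight on log10(topic frequency in training data)
C_BIAS = -12.0  # assumed offset that centers the sigmoid


def recall_probability(n_params: float, topic_freq: float) -> float:
    """Sigmoid of an assumed linear combination of log size and log frequency."""
    z = A_SIZE * math.log10(n_params) + B_FREQ * math.log10(topic_freq) + C_BIAS
    return 1.0 / (1.0 + math.exp(-z))


# A smaller model on a common topic versus a 10x larger model on a 10x rarer topic.
small_common = recall_probability(7e9, 1e4)
large_rare = recall_probability(70e9, 1e3)

print(f"small model, common topic: {small_common:.3f}")
print(f"large model, rare topic:   {large_rare:.3f}")
```

With equal weights on the two log terms, a tenfold increase in model size exactly offsets a tenfold drop in topic frequency, so both calls return the same value. Unequal weights would change the exchange rate between size and frequency but keep the same sigmoid shape.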
Published as "Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency" (arXiv:2605.18732).