Why language models fail when context matters most

Language models learn from text alone, but real communication depends on context that the text does not record: intentions, facts, social setting, and task constraints. This paper formalizes when next-token prediction succeeds: only when the observed text is a sufficient statistic for all relevant circumstances. When that condition fails, which is the typical case, models hallucinate or err in predictable ways. Retrieval-augmented generation (RAG) and tool use address the failure by reintroducing the missing context. The framework also explains why training on heterogeneous corpora produces brittle systems.
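
The sufficiency condition can be illustrated with a toy calculation. The sketch below is not the paper's formalization: it assumes a single fixed text prefix, a discrete hidden context, and a two-token vocabulary. The function names `text_only_prediction` and `sufficiency_gap` are hypothetical. The gap is measured here as the expected KL divergence from each context-specific distribution to the text-only mixture, a choice made for illustration. It is zero when every context implies the same continuation and positive when the hidden context changes the answer.

```python
import math


def text_only_prediction(context_weights, next_token_by_context):
    """Best next-token distribution available from text alone:
    the context-weighted mixture of the context-specific distributions."""
    vocab_size = len(next_token_by_context[0])
    return [
        sum(w * dist[t] for w, dist in zip(context_weights, next_token_by_context))
        for t in range(vocab_size)
    ]


def sufficiency_gap(context_weights, next_token_by_context):
    """Expected KL divergence (in nats) from each context-specific
    distribution to the text-only mixture. Illustrative measure only:
    zero when all contexts with positive weight agree."""
    mixture = text_only_prediction(context_weights, next_token_by_context)
    gap = 0.0
    for w, dist in zip(context_weights, next_token_by_context):
        for p, q in zip(dist, mixture):
            # Terms with zero weight or zero probability contribute nothing.
            if w > 0 and p > 0:
                gap += w * p * math.log(p / q)
    return gap


# Case 1 (assumed numbers): both hidden contexts imply the same continuation,
# so the text is sufficient and the gap vanishes.
sufficient = sufficiency_gap([0.5, 0.5], [[0.9, 0.1], [0.9, 0.1]])

# Case 2 (assumed numbers): the hidden context fully decides the continuation,
# so a text-only predictor can do no better than a 50/50 guess.
insufficient = sufficiency_gap([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])

print(f"gap when text is sufficient:   {sufficient:.3f} nats")
print(f"gap when context is decisive:  {insufficient:.3f} nats")
```

In the second case the text-only predictor spreads its probability evenly over two continuations, each of which is certain in its own context. Supplying the context, as retrieval or a tool call would, removes the gap in this toy setting.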