Why do language models choose certain words? A probability-based answer
Shilpika Shilpika, Carlo Graziani, Bethany Lusch, Venkatram Vishwanath, Michael E. Papka
May 20, 2026
Large language models generate text by sampling from probability distributions over tokens. This work inverts those probabilities using Bayes' rule to create an attribution score that shows which input tokens pushed the model toward each output word, independent of the model's architecture. The measure reveals where models are uncertain or unstable during generation, offering a tool to understand what LLMs actually learned and where they are unreliable.
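To make the idea of inverting probabilities with Bayes' rule concrete, here is a minimal toy sketch. It is not the paper's attribution score, whose exact definition this summary does not give. The function name `bayes_invert`, the three example tokens, and all of the numbers are hypothetical. The sketch only shows the generic step: turning P(output word | input token) and a prior P(input token) into P(input token | output word).

```python
def bayes_invert(likelihoods, prior):
    """Apply Bayes' rule: P(input | output) is proportional to P(output | input) * P(input).

    likelihoods: dict mapping input token -> P(output word | input token)
    prior:       dict mapping input token -> P(input token)
    Both tables are hypothetical stand-ins, not values taken from the paper.
    """
    # Unnormalized joint probability for each candidate input token.
    joint = {tok: likelihoods[tok] * prior[tok] for tok in likelihoods}
    # The evidence term P(output word) normalizes the posterior.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("output word has zero probability under every input token")
    return {tok: p / evidence for tok, p in joint.items()}


# Hypothetical example: how strongly does each input token explain one output word?
likelihoods = {"Paris": 0.60, "capital": 0.30, "the": 0.10}
prior = {"Paris": 0.20, "capital": 0.30, "the": 0.50}

posterior = bayes_invert(likelihoods, prior)
for token, score in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{token}: {score:.3f}")
```

In this toy setting the posterior ranks input tokens by how well they explain the chosen output word, and a posterior spread nearly evenly across tokens would suggest that no single input clearly drove the choice. How the paper actually obtains the required probabilities from a model is not specified in this summary.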