cs.CL

Why do language models choose certain words? A probability-based answer

Shilpika Shilpika, Carlo Graziani, Bethany Lusch, Venkatram Vishwanath, Michael E. Papka

May 20, 2026

Large language models generate text by sampling from probability distributions over tokens. This work inverts those probabilities using Bayes' rule to create an attribution score that shows which input tokens pushed the model toward each output word, independent of the model's architecture. The measure reveals where a model is uncertain or unstable during generation, offering a tool for understanding what LLMs have actually learned and where they are unreliable.
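The summary does not give the paper's formulas, so the snippet below is only a generic illustration of what "inverting probabilities with Bayes' rule" means, not the authors' attribution score. It assumes a hypothetical table `likelihood[i][j]` = P(output j | input i) and a hypothetical prior `prior[i]` = P(input i), with made-up toy numbers, and returns P(input i | output j).

```python
def bayes_invert(likelihood, prior):
    """Generic Bayes' rule inversion (illustrative only, not the paper's method).

    likelihood[i][j] = P(output j | input i)   (hypothetical toy table)
    prior[i]         = P(input i)              (hypothetical prior)
    Returns posterior[i][j] = P(input i | output j).
    """
    n_in = len(likelihood)
    n_out = len(likelihood[0])
    # Joint probability P(input i, output j) = P(output j | input i) * P(input i)
    joint = [[prior[i] * likelihood[i][j] for j in range(n_out)] for i in range(n_in)]
    # Evidence P(output j) = sum over inputs of the joint
    evidence = [sum(joint[i][j] for i in range(n_in)) for j in range(n_out)]
    # Posterior P(input i | output j) = joint / evidence
    return [[joint[i][j] / evidence[j] for j in range(n_out)] for i in range(n_in)]


# Toy numbers, invented for illustration: three inputs, two outputs.
likelihood = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]
prior = [0.5, 0.25, 0.25]
posterior = bayes_invert(likelihood, prior)
```

Each column of `posterior` sums to 1, so it can be read as a distribution over which input best explains a given output under these toy assumptions.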
Published as "Probabilistic Attribution For Large Language Models", arXiv:2605.21726