Can language models think without writing their thoughts?
Lukas Aichberger, Sepp Hochreiter
May 28, 2026
Large language models typically show their reasoning by generating intermediate steps token by token, which is expensive and couples thinking to output. This work introduces Reasoning in Memory (RiM), which replaces that autoregressive chain-of-thought with fixed special tokens that function as working memory and are processed in a single forward pass. Trained via curriculum learning on math and logic benchmarks, RiM matches or beats existing latent reasoning methods while reducing compute overhead.
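To make the contrast concrete, here is a toy sketch of the idea as described above. It is not the paper's implementation: the token ids, the number of memory tokens, and the function names are all hypothetical, and the "pass count" is a simplified proxy for sequential compute.

```python
# Toy illustration only. Token ids, the memory-token count, and all function
# names are assumptions made for this sketch, not details from the paper.

def build_memory_input(prompt_ids, memory_ids):
    """Append a fixed block of special memory tokens after the prompt."""
    return list(prompt_ids) + list(memory_ids)


def sequential_passes_cot(num_reasoning_tokens):
    """Autoregressive chain-of-thought: each reasoning token is generated
    from the previous ones, so it needs its own forward pass."""
    return num_reasoning_tokens


def sequential_passes_memory(num_memory_tokens):
    """Fixed memory tokens are already part of the input, so the whole
    block is processed together in a single forward pass."""
    return 1


prompt = [101, 7, 42, 9]               # hypothetical prompt token ids
memory = list(range(50_000, 50_008))   # 8 hypothetical special-token ids

model_input = build_memory_input(prompt, memory)
cot_passes = sequential_passes_cot(64)                 # 64 written reasoning tokens
memory_passes = sequential_passes_memory(len(memory))  # one pass over the block
```

Under these assumptions, the written chain costs one sequential pass per reasoning token, while the memory-token variant keeps the number of sequential passes constant regardless of how many memory tokens are appended.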