cs.CL

Can language models think without writing their thoughts?

Lukas Aichberger, Sepp Hochreiter

May 28, 2026

Large language models typically show their reasoning by generating intermediate steps token by token, which is expensive and couples thinking to output. This work introduces Reasoning in Memory (RiM), which replaces that autoregressive chain-of-thought with fixed special tokens that function as working memory and are processed in a single forward pass. Trained via curriculum learning on math and logic benchmarks, RiM matches or beats existing latent reasoning methods while reducing compute overhead.
Published as "Unlocking the Working Memory of Large Language Models for Latent Reasoning", arXiv:2605.30343.
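
To make the cost contrast in the summary concrete, here is a minimal sketch that only counts forward passes during the reasoning phase. It does not model a real language model or the paper's training procedure. The token name `<mem>`, the slot count, and all function names are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: counts forward passes, does not run a model.
# "<mem>" and the function names below are assumptions, not from the paper.
MEMORY_TOKEN = "<mem>"


def autoregressive_reasoning_passes(num_reasoning_tokens: int) -> int:
    """Token-by-token chain-of-thought: one forward pass per generated token."""
    return num_reasoning_tokens


def build_memory_input(prompt_tokens, num_memory_slots: int):
    """Append a fixed number of memory placeholder tokens to the prompt."""
    return list(prompt_tokens) + [MEMORY_TOKEN] * num_memory_slots


def memory_reasoning_passes(num_memory_slots: int) -> int:
    """The memory slots are part of the input, so one pass covers all of them."""
    return 1


if __name__ == "__main__":
    prompt = ["What", "is", "2", "+", "3", "?"]
    print(build_memory_input(prompt, 4))
    print(autoregressive_reasoning_passes(64), "passes for 64 reasoning tokens")
    print(memory_reasoning_passes(4), "pass for 4 memory slots")
```

Under these assumptions, the reasoning cost of the autoregressive approach grows with the length of the chain-of-thought, while the memory-token approach keeps it at one pass regardless of the number of slots. Generating the final answer still takes further passes in both cases.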