Why physics-inspired attention might beat transformers on memory tasks
Piotr Frydrych
May 22, 2026
Standard transformer attention compares every pair of tokens, which is expensive and sometimes unnecessary. This work replaces softmax attention with a binary relay operator borrowed from physics, which keeps only the sequence of extreme values it has seen. The reported result is single-layer Turing-completeness and substantial speedups on tasks that need historical statistics, such as finding the min or max across the context, although the model cannot do random lookup without extra memory. Inference costs O(n log n) instead of the O(n²) of standard attention.
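To make "keeping only the sequence of extreme values" concrete, here is a minimal sketch of one classic way to do it: a monotonic stack of running maxima. This is an illustrative stand-in, not the paper's relay operator; the class name `ExtremeMemory` and its methods `push` and `max_since` are hypothetical names chosen for this example.

```python
import bisect


class ExtremeMemory:
    """Illustrative only: stores just the values that can still be a
    maximum of some suffix of the sequence seen so far.
    (Hypothetical helper, not the operator from the paper.)"""

    def __init__(self):
        self.positions = []  # positions of surviving values, increasing
        self.values = []     # surviving values, strictly decreasing
        self.count = 0       # number of tokens pushed so far

    def push(self, value):
        # A stored value <= the new one can never again be the maximum
        # of any window that ends at or after the new token, so drop it.
        while self.values and self.values[-1] <= value:
            self.values.pop()
            self.positions.pop()
        self.positions.append(self.count)
        self.values.append(value)
        self.count += 1

    def max_since(self, start):
        """Maximum over positions start..latest, via binary search."""
        if not 0 <= start < self.count:
            raise IndexError("start is outside the sequence seen so far")
        # The first surviving entry at or after `start` holds the maximum.
        i = bisect.bisect_left(self.positions, start)
        return self.values[i]


memory = ExtremeMemory()
for token_value in [3, 1, 4, 1, 5, 9, 2, 6]:
    memory.push(token_value)

print(memory.max_since(0))  # max over the whole context
print(memory.max_since(6))  # max over the last two tokens
```

In this sketch each update is amortized constant time and each query is a binary search, so a max query never touches all pairs of tokens. It also shows the trade-off named above: values that were popped are gone, so an arbitrary earlier token cannot be looked up without additional memory.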