cs.LG

Why physics-inspired attention might beat transformers on memory tasks

Piotr Frydrych

May 22, 2026

Standard transformer attention compares every pair of tokens, which is expensive and sometimes unnecessary. This work replaces softmax attention with a binary relay operator borrowed from physics, one that retains only the sequence of extreme values it has seen. The reported results are single-layer Turing-completeness and dramatic speedups on tasks that depend on historical statistics, such as finding the min or max across the context, although the model cannot do random lookup without extra memory. Inference runs in O(n log n) rather than the O(n²) of full attention.
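To make the two ideas in the summary concrete, here is a minimal Python sketch. It is not the paper's implementation. It assumes the textbook form of a two-threshold relay (the basic element of the Preisach hysteresis model) and uses a plain monotonic stack as a simplified, maxima-only stand-in for "keeping only the extreme values seen so far". The names `Relay`, `alpha`, `beta`, and `dominant_maxima` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Relay:
    """Two-threshold relay: switches to +1 at `alpha`, to -1 at `beta`.

    Assumed textbook form, not taken from the paper.
    """
    alpha: float      # upper switching threshold
    beta: float       # lower switching threshold (beta <= alpha)
    state: int = -1   # binary output, starts in the "down" position

    def step(self, x: float) -> int:
        if x >= self.alpha:
            self.state = 1
        elif x <= self.beta:
            self.state = -1
        # Between the thresholds the previous state is held: that is the memory.
        return self.state


def dominant_maxima(xs):
    """Keep only past values that no later input has reached or exceeded.

    Simplified, maxima-only illustration of an extreme-value memory.
    Each value is pushed once and popped at most once, so the total
    work over a sequence of length n is O(n).
    """
    stack = []
    for x in xs:
        # A new input wipes out every stored value it reaches or exceeds.
        while stack and stack[-1] <= x:
            stack.pop()
        stack.append(x)
    return stack
```

In this sketch the first element of the returned stack is always the maximum of the whole sequence, which is the kind of historical statistic the summary mentions. The same structure also shows the stated limitation: values that have been wiped out cannot be looked up again.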
Published as "Preisach Attention: A Hysteretic Model of Sequential Memory", arXiv:2605.23603.