Why physics-inspired attention might beat transformers on memory tasks

Standard transformer attention solves problems by comparing all pairs of tokens, which is expensive and sometimes unnecessary. This work replaces softmax attention with a binary relay operator from physics that maintains only the sequence of extreme values it has seen. The result is single-layer Turing-completeness and dramatic speedups on tasks that require historical statistics (such as finding the min or max across the context), though it cannot do random lookup without extra memory. Inference costs O(n log n) instead of O(n²).
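
To make the "sequence of extreme values" idea concrete, here is a minimal sketch of one plausible reading of it, not the operator from the work itself. It swaps in a standard monotonic stack that stores only the values no later token has matched or exceeded. The names ExtremeMemory, push, and max_since are hypothetical and chosen for illustration. Each token is pushed and popped at most once, and each query is a binary search, so n tokens with one query each cost O(n log n) in total under this assumption.

```python
from bisect import bisect_left


class ExtremeMemory:
    """Illustrative only: keeps the suffix maxima of a stream.

    This is a standard monotonic stack, assumed here as a stand-in for
    the relay operator's memory; it is not taken from the original work.
    """

    def __init__(self):
        self.positions = []  # strictly increasing token positions
        self.values = []     # strictly decreasing values at those positions
        self.n = 0           # number of tokens seen so far

    def push(self, x):
        # A new value wipes out every stored extreme it equals or exceeds.
        while self.values and self.values[-1] <= x:
            self.values.pop()
            self.positions.pop()
        self.positions.append(self.n)
        self.values.append(x)
        self.n += 1

    def max_since(self, start):
        # Maximum over positions start..n-1: the first surviving entry
        # at or after `start`, found by binary search.
        if not 0 <= start < self.n:
            raise IndexError("start must be a position already seen")
        i = bisect_left(self.positions, start)
        return self.values[i]


mem = ExtremeMemory()
for token in [3, 1, 4, 1, 5, 9, 2, 6]:
    mem.push(token)

print(mem.max_since(0))  # maximum over the whole context
print(mem.max_since(6))  # maximum over the last two tokens
```

The sketch also shows the limitation mentioned above: after the eight pushes, only the entries at positions 5 and 7 survive, so the value originally at position 1 can no longer be recovered without extra memory. A running minimum works the same way with the comparison reversed.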