Definition
A proposed alternative to attention that performs retrieval without keeping a copy of every previous token in memory.
A KV-cache-free architecture that reframes attention as halfspace range searching and retrieves from Koopman-style streaming sufficient statistics, so that per-layer state stays constant in size instead of growing with sequence length.
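
The idea of constant per-layer state can be illustrated with a minimal sketch. The definition does not specify the update rule, so the code below swaps in a different, well-known technique: kernelized linear attention with running sums. It is not the halfspace or Koopman-style construction named above. It only shows how streaming sufficient statistics can answer queries without a per-token cache. The names `StreamingState`, `feature_map`, `update`, `query`, and `num_floats` are hypothetical.

```python
import math


def feature_map(x):
    # Assumption: elu(x) + 1, a common positive feature map in linear attention.
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]


class StreamingState:
    """Fixed-size summary of all (key, value) pairs seen so far.

    Illustrative stand-in only: running sums as in kernelized linear
    attention, not the architecture described in the definition.
    """

    def __init__(self, d_k, d_v):
        # S accumulates phi(key) outer value; z accumulates phi(key).
        self.S = [[0.0] * d_v for _ in range(d_k)]
        self.z = [0.0] * d_k

    def update(self, key, value):
        # Fold one token into the statistics; nothing per-token is stored.
        phi = feature_map(key)
        for i, p in enumerate(phi):
            self.z[i] += p
            row = self.S[i]
            for j, vj in enumerate(value):
                row[j] += p * vj

    def query(self, q):
        # Returns a positively weighted average of all values seen so far.
        # Requires at least one prior update, otherwise the normalizer is 0.
        phi = feature_map(q)
        den = sum(p * zi for p, zi in zip(phi, self.z))
        d_v = len(self.S[0])
        num = [sum(phi[i] * self.S[i][j] for i in range(len(phi)))
               for j in range(d_v)]
        return [n / den for n in num]

    def num_floats(self):
        # Size of the state, independent of how many tokens were folded in.
        return len(self.z) + sum(len(row) for row in self.S)


state = StreamingState(d_k=2, d_v=2)
size_before = state.num_floats()
for t in range(1000):
    state.update([math.sin(t), math.cos(t)], [float(t % 3), 1.0])
size_after = state.num_floats()
out = state.query([0.5, -0.5])
```

After 1000 updates the state holds the same number of floats as before the first one, whereas a KV cache would hold 1000 key and value vectors. The trade-off in this stand-in is that individual tokens can no longer be recovered exactly, only weighted aggregates of them.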