Glossary · Term

Echo


Definition

A proposed alternative to attention that performs retrieval without keeping a copy of every previous token in memory.

A KV-cache-free architecture that reframes attention as halfspace range searching and uses Koopman-style streaming sufficient statistics to perform retrieval with constant-size per-layer state.
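
To make "constant per-layer state" concrete, here is a minimal sketch of the general idea of a streaming summary whose size does not grow with sequence length. It is not Echo's method: the halfspace range searching and Koopman-style statistics in the definition are not reproduced here. The class `StreamingState` and its methods are hypothetical names, and the update rule is a plain outer-product accumulator of the kind used in linear-attention-style models, swapped in purely for illustration.

```python
from typing import List


class StreamingState:
    """Hypothetical fixed-size summary: S = sum over tokens of outer(key, value).

    Illustrative only. This is a generic outer-product accumulator,
    NOT the statistics the Echo paper describes.
    """

    def __init__(self, d_key: int, d_value: int) -> None:
        # The state is a d_key x d_value matrix, allocated once.
        self.S: List[List[float]] = [[0.0] * d_value for _ in range(d_key)]
        self.tokens_seen = 0

    def update(self, key: List[float], value: List[float]) -> None:
        # Fold one token into the summary; nothing per-token is stored.
        for i, k in enumerate(key):
            row = self.S[i]
            for j, v in enumerate(value):
                row[j] += k * v
        self.tokens_seen += 1

    def retrieve(self, query: List[float]) -> List[float]:
        # Read out query^T S. Exact when the stored keys are orthonormal,
        # approximate (with cross-talk between keys) otherwise.
        d_value = len(self.S[0])
        return [
            sum(query[i] * self.S[i][j] for i in range(len(query)))
            for j in range(d_value)
        ]

    def state_size(self) -> int:
        # Number of floats held, independent of tokens_seen.
        return len(self.S) * len(self.S[0])


if __name__ == "__main__":
    state = StreamingState(d_key=3, d_value=2)
    keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for k, v in zip(keys, values):
        state.update(k, v)
    print(state.retrieve(keys[1]), state.state_size(), state.tokens_seen)
```

A KV cache would hold one key and one value per token, so its memory grows with the sequence. The accumulator above holds `d_key * d_value` floats no matter how many tokens it has seen, at the cost of exact recall once the keys stop being orthogonal.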

Mentioned in 1 episode

  1. 033
    Echo: The Paper Arguing You Never Needed a KV Cache for Retrieval