Glossary · Term

fast weight

← all terms

Definition

A small running summary inside a sequence model that gets updated as new tokens arrive.

A fixed-size recurrent state (matrix or vector) in state-space and linear-attention models that compresses sequence history through gated overwrites; contrasts with the lossless but growing KV cache of attention layers.

Also called: fast weights

Mentioned in 1 episode

  1. 085
    Why Long-Context Models Might Need Compute, Not Capacity, Before Eviction