Definition
A Transformer variant that handles long documents efficiently by restricting most attention to a sliding window of nearby tokens.
A long-context attention architecture that combines sliding-window attention with a small set of selected global tokens, so that attention cost scales linearly with sequence length; it is also reused as a building block in agent-designed efficient attention.
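
As a rough illustration of the pattern described above, the following sketch builds the attention mask for a sliding window plus global tokens. The function names, the `window` and `global_idx` parameters, and the convention that a window of size `window` covers `window // 2` tokens on each side are assumptions made for this example, not taken from any particular implementation. The mask is materialized densely only to make the pattern visible; a practical implementation would avoid building the full matrix.

```python
def sliding_window_global_mask(seq_len, window, global_idx=()):
    """Return a seq_len x seq_len list of booleans.

    mask[i][j] is True when token i may attend to token j.
    Assumption: `window` covers window // 2 tokens on each side of i.
    """
    half = window // 2
    # Local band: each token sees neighbours within `half` positions.
    mask = [[abs(i - j) <= half for j in range(seq_len)]
            for i in range(seq_len)]
    # Global tokens attend to every position and are attended to by all.
    for g in set(global_idx):
        for k in range(seq_len):
            mask[g][k] = True
            mask[k][g] = True
    return mask


def count_allowed(mask):
    """Number of (query, key) pairs the mask allows."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    # With the window and number of global tokens held fixed, the number
    # of allowed pairs grows roughly in proportion to seq_len, whereas
    # full attention would grow with seq_len squared.
    for n in (64, 128, 256):
        m = sliding_window_global_mask(n, window=8, global_idx=[0])
        print(n, count_allowed(m), n * n)
```

For a fixed window size w and g global tokens, the number of allowed pairs is at most n(w + 1) + 2gn for sequence length n, which is the sense in which the cost is linear.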