
Longformer


Definition

A Transformer variant that handles long documents efficiently by having most tokens attend only to a sliding window of nearby tokens, while a few designated tokens attend globally.

A long-context attention architecture that combines a sliding window with selected global tokens, so attention cost scales linearly with sequence length rather than quadratically. It is recombined as a building block in agent-designed efficient attention.
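
As a rough illustration of the sliding-window-plus-global-token pattern, the sketch below builds the attention mask with NumPy and applies it in a single-head attention step. The function names `longformer_mask` and `masked_attention`, and the parameters `window` and `global_idx`, are hypothetical and chosen for this example only; they are not taken from any library or from the episode. The sketch uses a dense mask for readability, so it still computes every pairwise score; an implementation that actually reaches linear scaling would compute only the banded and global entries.

```python
import numpy as np

def longformer_mask(seq_len, window, global_idx=()):
    """Return a boolean mask where mask[i, j] is True if token i may attend to token j."""
    idx = np.arange(seq_len)
    # Local band: each token sees neighbours within window // 2 positions on either side.
    mask = np.abs(idx[:, None] - idx[None, :]) <= window // 2
    # Global tokens attend to every position, and every position attends to them.
    g = np.asarray(list(global_idx), dtype=int)
    mask[g, :] = True
    mask[:, g] = True
    return mask

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention restricted to the allowed pairs."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)      # disallowed pairs get zero weight
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability before exp
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: 8 tokens, window of 2 (one neighbour each side), token 0 is global.
mask = longformer_mask(8, window=2, global_idx=[0])
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
out = masked_attention(q, k, v, mask)
```

With a fixed window and a fixed number of global tokens, the number of allowed pairs per token stays roughly constant as the sequence grows, which is where the linear scaling comes from.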

Mentioned in 1 episode

  1. 053
    An AI Agent Swapped In Focal Loss And Beat A Human-Tuned Training Script