Concept · 2 episode(s)

Attention Analysis

Definition

Attention analysis studies the attention patterns inside a transformer: which tokens attend to which others, and how those patterns relate to model behavior. It’s a popular interpretability lens, with the caveat that attention weights aren’t straightforwardly “the explanation” for what a model does.
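
To make "which tokens look at which others" concrete, here is a minimal sketch of the quantity attention analysis usually inspects: a single head's attention matrix, computed as a row-wise softmax over scaled query-key dot products. Everything here is an illustrative assumption: the token list, the random query and key vectors, and the helper name `attention_weights` are hypothetical, there is no causal mask, and no real model is involved.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Single-head attention matrix: softmax(q . k / sqrt(d)) for each query row.

    Hypothetical helper for illustration; no masking, no multi-head logic.
    """
    scale = math.sqrt(len(queries[0]))
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys])
        for q in queries
    ]


# Toy input (assumption): 4 tokens with random 8-dimensional queries and keys.
tokens = ["the", "cat", "sat", "down"]
rng = random.Random(0)
queries = [[rng.gauss(0, 1) for _ in range(8)] for _ in tokens]
keys = [[rng.gauss(0, 1) for _ in range(8)] for _ in tokens]

weights = attention_weights(queries, keys)

# For each token, find the position it puts the most attention weight on.
most_attended = [max(range(len(row)), key=row.__getitem__) for row in weights]

for i, j in enumerate(most_attended):
    print(f"{tokens[i]!r} attends most to {tokens[j]!r} (weight {weights[i][j]:.2f})")
```

Each row of `weights` sums to 1 and says where that token's head is looking. In line with the caveat above, a large weight shows where attention is placed, not why the model produced a given output.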

Episodes covering this