
induction head


Definition

A small look-back-and-copy circuit inside a transformer that finds repeated patterns and predicts what came after them last time.

An attention-head circuit identified in mechanistic interpretability that performs in-context pattern completion: given a sequence of the form [A][B] ... [A], it attends to the token that followed an earlier occurrence of the current token and copies it forward, predicting [B]. Implicated in the semantic collapse of multi-LLM conversations.
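
The look-back-and-copy rule can be illustrated with a toy sketch. This is not how a transformer implements it (a real induction head works through learned attention weights over embeddings, not exact string matching), and the function name `induction_predict` is hypothetical, chosen only for this example. It shows the input-output behaviour the definition describes: find the most recent earlier occurrence of the current token and predict whatever followed it.

```python
from typing import Optional, Sequence


def induction_predict(tokens: Sequence[str]) -> Optional[str]:
    """Toy stand-in for induction-head behaviour (hypothetical helper).

    Looks back for the most recent earlier occurrence of the final token
    and returns the token that followed it, or None if the final token
    has not appeared before.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan earlier positions from most recent to oldest, skipping the
    # final position itself.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            # Copy the successor of the earlier match.
            return tokens[i + 1]
    return None


# [A][B] ... [A] -> predict [B]
prediction = induction_predict(["the", "cat", "sat", "the"])
```

Here `prediction` is `"cat"`, because `"the"` was last followed by `"cat"`. With no earlier match, the function returns `None`, which loosely corresponds to the head having nothing to copy.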

Also called: induction heads

Mentioned in 1 episode

  1. 073
    When Three LLMs Talk to Each Other, Their Ideas Quietly Stop Moving