MLP layer

Definition

The feed-forward (multi-layer perceptron) sub-block in each transformer layer. It is paired with the attention sub-block, applies the same feed-forward transformation to every token position independently, and holds much of the model's factual knowledge.
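As a rough illustration of what "position-wise" means here, the following is a minimal NumPy sketch of an MLP sub-block. The sizes, the GELU activation, the 4x hidden width, and the names `mlp_block`, `w_in`, and `w_out` are assumptions chosen for the example, not details taken from any particular model; the residual connection and layer normalization that usually surround the block are left out.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of GELU; the activation choice is an assumption.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp_block(x, w_in, b_in, w_out, b_out):
    """Apply a two-layer feed-forward network to each row (token) of x.

    x has shape (seq_len, d_model); the output has the same shape.
    """
    hidden = gelu(x @ w_in + b_in)   # expand: (seq_len, d_hidden)
    return hidden @ w_out + b_out    # project back: (seq_len, d_model)

# Hypothetical sizes for illustration only.
rng = np.random.default_rng(0)
seq_len, d_model, d_hidden = 5, 8, 32  # d_hidden = 4 * d_model is a common convention

x = rng.normal(size=(seq_len, d_model))
w_in = rng.normal(size=(d_model, d_hidden)) * 0.1
b_in = np.zeros(d_hidden)
w_out = rng.normal(size=(d_hidden, d_model)) * 0.1
b_out = np.zeros(d_model)

out = mlp_block(x, w_in, b_in, w_out, b_out)

# Position-wise check: running one token alone matches its row in the full output,
# because the block never mixes information across positions.
single = mlp_block(x[2:3], w_in, b_in, w_out, b_out)
```

Because no token's output depends on any other token, mixing information across positions is left entirely to attention; the MLP only transforms each token's representation in place.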

Also called: MLP layers, MLP

Mentioned in 4 episodes

  1. 077
    Reading a Model's Confidence Curve to Decide When Chain-of-Thought Is Worth It
  2. 053
    An AI Agent Swapped In Focal Loss And Beat A Human-Tuned Training Script
  3. 037
    Why Hallucination Detectors Miss Stale Facts: A Geometric Story About What Models Know But Don't Say
  4. 023
    Why a Small Agent Confidently Overwrites Memories It Doesn't Understand

Related concepts