Glossary · Term

transformer

← all terms

Definition

The dominant neural network design behind modern language models, built around attention.

The architecture introduced in 2017 combining self-attention with feedforward blocks and residual connections, underlying nearly all current frontier LLMs.

Also called: transformers, Transformer

Mentioned in 13 episodes

  1. 078
    Training a Markdown File: When LLM Self-Improvement Borrows the Discipline of Neural Net Training
  2. 074
    How a Fifteen-Hundred-Dollar Training Run Matched Llama and Gemma on Reasoning
  3. 053
    An AI Agent Swapped In Focal Loss And Beat A Human-Tuned Training Script
  4. 041
    When the Iteration Teaches the Model to Skip the Iteration
  5. 040
    Two Frozen Models Learn to Whisper: Coupling Through Hidden States
  6. 038
    How LLMs Get Persuaded: One Attention Head, A Tetrahedron, And A Single Dial
  7. 036
    Sparse Attention Was the Wrong Frame. Treat It as Geometry Instead.
  8. 033
    Echo: The Paper Arguing You Never Needed a KV Cache for Retrieval
  9. 032
    A Sticky-Note for Every Layer: Letting Transformers Remember What They Were Just Thinking
  10. 027
    When AI Agents Build the Serving Stack: A Bet on Bespoke Infrastructure
  11. 023
    Why a Small Agent Confidently Overwrites Memories It Doesn't Understand
  12. 016
    Why Your Coding Agent Stalls While the GPU Runs Hot
  13. 002
    An AI Ran a Real Optics Lab for 21 Hours and Found a Transformer-Shaped Pattern in Light

Related concepts