Glossary · Term

teacher forcing


Definition

During training, feeding the model the correct earlier tokens instead of letting it use its own predictions.

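A training strategy in which the model conditions on ground-truth previous tokens rather than on its own outputs. Because every input is known in advance, all positions of a sequence can be trained in parallel (for example, in a Transformer with causal masking), but the model never practices recovering from its own mistakes, a mismatch between training and inference known as exposure bias.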
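To make the contrast concrete, here is a minimal sketch in plain Python. It is illustrative only: the `BOS` token id, the helper names, and the `toy_step` "model" are hypothetical and do not come from any particular library or from the episode. With teacher forcing, the inputs are just the ground-truth sequence shifted right by one, so they can all be built up front; in free-running mode, each input depends on the previous prediction and must be produced step by step.

```python
from typing import Callable, List, Tuple

BOS = 0  # hypothetical start-of-sequence token id (an assumption for this sketch)


def teacher_forced_pairs(target: List[int]) -> Tuple[List[int], List[int]]:
    """Build (inputs, targets) by shifting the ground-truth sequence right by one.

    Every input is known before the model runs, so all positions can be
    scored in a single parallel pass.
    """
    inputs = [BOS] + target[:-1]
    return inputs, target


def free_running_inputs(step: Callable[[int], int], length: int) -> List[int]:
    """Feed the model its own previous prediction, as happens at inference time."""
    inputs: List[int] = []
    prev = BOS
    for _ in range(length):
        inputs.append(prev)
        prev = step(prev)  # the next input is whatever the model just predicted
    return inputs


def toy_step(prev_token: int) -> int:
    """Hypothetical stand-in for an imperfect model: predicts prev_token + 2."""
    return prev_token + 2


target = [1, 2, 3, 4]
tf_inputs, tf_targets = teacher_forced_pairs(target)
fr_inputs = free_running_inputs(toy_step, len(target))

print(tf_inputs)  # ground-truth history, regardless of what the model predicts
print(fr_inputs)  # the model's own history, which drifts once it makes an error
```

In this toy setup the free-running inputs diverge from the ground-truth history after the first wrong prediction, which is the kind of input a teacher-forced model never encounters during training.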

Mentioned in 1 episode

  1. 032
    A Sticky-Note for Every Layer: Letting Transformers Remember What They Were Just Thinking