Definition
Sending learning signals back through only the last few steps of a long process, rather than through all of them.
A training technique for recurrent or unrolled architectures in which gradients are backpropagated through only the last K steps, with the state entering those steps treated as a constant. In HRM-Text it is combined with MagicNorm to stabilize deep recurrent language model training.
Also called: truncated backprop
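
Example (illustrative only): a minimal sketch of the idea on a toy scalar recurrence h_t = tanh(w * h_{t-1} + x_t) with loss 0.5 * (h_T - target)^2. The function names `forward` and `truncated_grad`, the recurrence, and the loss are all assumptions made for this illustration. This is not the HRM-Text implementation, and MagicNorm is not modeled here.

```python
import math


def forward(w, xs, h0=0.0):
    """Run h_t = tanh(w * h_{t-1} + x_t); return all states [h_0, ..., h_T]."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + x))
    return hs


def truncated_grad(w, xs, target, k, h0=0.0):
    """Gradient of 0.5 * (h_T - target)**2 with respect to w, backpropagated
    through only the last k steps. The state h_{T-k} is treated as a constant,
    so contributions from earlier steps are dropped."""
    hs = forward(w, xs, h0)
    T = len(xs)
    dh = hs[T] - target                    # dL/dh_T
    grad = 0.0
    # Walk backward over steps T, T-1, ..., T-k+1 (all T steps if k >= T).
    for t in range(T, max(T - k, 0), -1):
        dpre = dh * (1.0 - hs[t] ** 2)     # backprop through tanh at step t
        grad += dpre * hs[t - 1]           # direct dependence of step t on w
        dh = dpre * w                      # pass the gradient on to h_{t-1}
    return grad


if __name__ == "__main__":
    xs = [0.5, -0.3, 0.8, 0.1, -0.6]
    for k in (1, 2, len(xs)):
        print(k, truncated_grad(0.7, xs, target=0.2, k=k))
```

Setting k to the sequence length recovers the full gradient, while smaller k gives a cheaper but biased estimate that ignores how w influenced the earlier states.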