warmup deep credit assignment

Definition

A training technique that starts by backpropagating through only a few recurrent steps and slowly lengthens that chain as the model stabilizes.

In HRM-Text, a curriculum that begins with a short truncated-backprop horizon (e.g., 2 steps) and gradually extends it across training to stabilize gradient flow through recurrent unrolls.
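As a rough illustration of such a curriculum, the sketch below computes a backprop horizon from the current training step. Only the 2-step starting point comes from the definition above. The linear ramp, the final horizon of 8, the warmup fraction, and the names `tbptt_horizon` and `steps_with_gradient` are all hypothetical choices made for this example, not details of HRM-Text.

```python
def tbptt_horizon(step, total_steps, start=2, end=8, warmup_frac=0.5):
    """Hypothetical linear schedule for the truncated-backprop horizon.

    Returns how many of the most recent recurrent steps receive gradient.
    start=2 mirrors the glossary example; end and warmup_frac are assumed.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    warmup_steps = max(1, int(total_steps * warmup_frac))
    # Clamp progress to [0, 1] so the horizon stays within [start, end].
    progress = max(0, min(step, warmup_steps)) / warmup_steps
    return start + int(progress * (end - start))


def steps_with_gradient(n_unroll, horizon):
    """Mark which unroll steps keep gradient: only the last `horizon` ones.

    In a real training loop, the recurrent state would be detached from the
    autograd graph before each step marked False (an assumption about how
    the truncation is applied).
    """
    cutoff = max(0, n_unroll - horizon)
    return [i >= cutoff for i in range(n_unroll)]


if __name__ == "__main__":
    for s in (0, 250, 500, 1000):
        h = tbptt_horizon(s, total_steps=1000)
        print(s, h, steps_with_gradient(8, h))
```

Early in training only the last two unroll steps would carry gradient. Under these assumed settings the full eight-step unroll is reached halfway through the run and held there.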

Mentioned in 1 episode

  1. 074
    How a Fifteen-Hundred-Dollar Training Run Matched Llama and Gemma on Reasoning