Glossary · Term

temporal contrast

← all terms

Definition

Training a model by comparing its newer outputs to its older ones, treating the newer as preferred.

A self-supervised training signal that constructs preference pairs across training checkpoints, labeling later outputs as preferred over earlier ones to bootstrap rubric learning without external labels.

Mentioned in 1 episode

  1. 019
    When the Best Reward Model Trains the Worst Policy: Inside EvoLM