Definition
Re-running real conversation transcripts through a model while watching what happens inside.
An interpretability technique where saved trajectories are fed through an open-weight model with intermediate activations and predictions logged, enabling mechanistic analysis without re-generating; used to study induction heads in semantic collapse.