Glossary · Term

diversity reward

Definition

A training signal that pays the model for saying something new.

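A reward-shaping term in reinforcement learning (RL) that penalizes semantic similarity between a new output and two reference sets: the model's recent outputs and a set of anchor outputs. The semantic-collapse paper uses it as a direct test of whether collapse can be undone by optimization.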
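
One way such a term could be computed is sketched below. This is an illustration under stated assumptions, not the paper's formulation: it assumes outputs are already embedded as vectors, uses cosine similarity as the semantic measure, takes the maximum similarity within each reference set, and combines the two with hypothetical weights `recent_weight` and `anchor_weight`. All function and parameter names are invented for this example.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity(candidate: Vector, references: Sequence[Vector]) -> float:
    """Highest similarity between the candidate and any reference (0.0 if none)."""
    return max((cosine_similarity(candidate, r) for r in references), default=0.0)


def diversity_reward(
    candidate: Vector,
    recent: Sequence[Vector],
    anchors: Sequence[Vector],
    recent_weight: float = 0.5,   # assumed weight, not taken from the paper
    anchor_weight: float = 0.5,   # assumed weight, not taken from the paper
) -> float:
    """Shaping term: more negative the closer the candidate is to past outputs."""
    penalty = (
        recent_weight * max_similarity(candidate, recent)
        + anchor_weight * max_similarity(candidate, anchors)
    )
    return -penalty


# A candidate identical to a recent output but orthogonal to the anchor
# is penalized only through the recent-output term.
reward = diversity_reward([1.0, 0.0], recent=[[1.0, 0.0]], anchors=[[0.0, 1.0]])
```

In an RL loop this value would be added to the task reward, so an output that repeats earlier ones earns less than an equally good output that differs from them.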