Glossary · Term

MaR

Definition

A training method that rewards models for showing they know what they need and for following their own plans.

Metacognition as Reward (MaR) is a reinforcement learning (RL) framework whose reward combines three components: knowledge monitoring, regulation fidelity (with a multiplicative shortcut penalty), and correctness. The components are derived from Flavell's metacognition framework.
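
As a rough illustration of how such a reward could be assembled, here is a minimal sketch. The definition names the three components and says the shortcut penalty is multiplicative, but it does not say how the components are combined or weighted. The weighted sum, the weight values, the [0, 1] score ranges, and every name below (`MaRWeights`, `mar_reward`, `shortcut_penalty`) are assumptions made for this example, not the framework's actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaRWeights:
    """Hypothetical component weights; the source gives no values."""
    monitoring: float = 1.0
    regulation: float = 1.0
    correctness: float = 1.0


def mar_reward(
    monitoring: float,
    regulation: float,
    correctness: float,
    shortcut_penalty: float,
    weights: MaRWeights = MaRWeights(),
) -> float:
    """Combine three component scores into one scalar reward.

    All four inputs are assumed to lie in [0, 1]. `shortcut_penalty` is the
    assumed strength of the detected shortcut: 0.0 means no shortcut,
    1.0 means regulation fidelity is fully discounted.
    """
    scores = {
        "monitoring": monitoring,
        "regulation": regulation,
        "correctness": correctness,
        "shortcut_penalty": shortcut_penalty,
    }
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    # Multiplicative penalty: scale regulation fidelity down rather than
    # subtracting a fixed amount from the total.
    penalized_regulation = regulation * (1.0 - shortcut_penalty)

    # Assumed combination rule: a plain weighted sum of the three parts.
    return (
        weights.monitoring * monitoring
        + weights.regulation * penalized_regulation
        + weights.correctness * correctness
    )
```

Under these assumptions, a response with perfect scores and no shortcut earns 3.0, while the same response with a full shortcut penalty earns 2.0, because only the regulation term is discounted.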

Also called: Metacognition as Reward

Mentioned in 1 episode

  1. 079
    An Old Idea From Cognitive Psychology Reshapes How We Reward Reasoning Models