Glossary · Term

shortcut penalty

← all terms

Definition

A penalty applied when a model writes out a plan but doesn't follow it.

In MaR's regulation reward, a multiplicative penalty (e.g., 0.7x) applied when a binary shortcut flag fires, indicating the model produced a plan but jumped to the answer without executing it.

Mentioned in 1 episode

  1. 079
    An Old Idea From Cognitive Psychology Reshapes How We Reward Reasoning Models