Glossary · Term

binary reward

← all terms

Definition

A reward signal that's just yes or no — did it work or didn't it.

A scalar training signal taking values in {0,1}, typically used in RLVR pipelines where verifiable answer correctness is the only learning signal.

Also called: binary rewards

Mentioned in 4 episodes

  1. 059
    Firefly's Inversion: Building Verified Tool-Call Training Data by Working Backward
  2. 026
    What RL Actually Does to Language Models, at the Token Level
  3. 011
    When RL Actually Teaches Agents Something New, And When It Doesn't
  4. 008
    Why Long-Horizon AI Agents Get Stuck, and a Milestone-Based Fix That Helps