Definition
A reward signal that's just yes or no — did it work or didn't it.
A scalar training signal taking values in {0,1}, typically used in RLVR pipelines where verifiable answer correctness is the only learning signal.
Also called: binary rewards