verifiable reward

Definition

Plain language

A reward you can compute automatically from the answer, without needing a human grader.

As stated in the literature

A scalar training signal derived from mechanical verification of task completion (for example by a calculator, compiler, simulator, or formal verifier). It enables scalable RL training but provides only outcome-level supervision: it scores the final result, not the intermediate steps that led to it.

Also called: verifiable rewards

Why it matters: It removes the need for human raters in the training loop, which is what makes large-scale RL on math and code feasible.

For example, a math RL pipeline can check the model's final answer against a reference value (computed, say, with a calculator) and award 1 for an exact match, 0 otherwise.
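
The exact-match check can be sketched in a few lines of Python. This is a minimal illustration, not any particular pipeline's implementation: the function names `parse_number` and `verifiable_reward` are hypothetical, and treating "exact match" as equality of exact rational values (so that "0.5" and "1/2" agree) is an assumption made for the example.

```python
from fractions import Fraction
from typing import Optional


def parse_number(text: str) -> Optional[Fraction]:
    """Parse a numeric answer such as '42', '0.5', or '1/2'.

    Returns None when the text is not a valid number (hypothetical helper).
    """
    try:
        return Fraction(text.strip())
    except (ValueError, ZeroDivisionError):
        return None


def verifiable_reward(model_answer: str, reference_answer: str) -> int:
    """Return 1 if the model's final answer equals the reference exactly, else 0.

    Assumption: answers are compared as exact rationals, not as raw strings.
    Only the final answer is inspected, so the signal is outcome-level.
    """
    predicted = parse_number(model_answer)
    expected = parse_number(reference_answer)
    if predicted is None or expected is None:
        return 0  # unparseable answers earn no reward
    return 1 if predicted == expected else 0
```

Calling `verifiable_reward("0.5", "1/2")` gives 1, while `verifiable_reward("3", "4")` and `verifiable_reward("forty-two", "42")` both give 0. No human grader is involved at any point, which is the property the term describes.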

Heard on the show

“And second — the entire analysis assumes sparse, verifiable rewards.”

Episode 162 — The Empty-Lake Proof: Why More Rollouts Stop Helping Reasoning Models

Mentioned in 4 episodes

162
The Empty-Lake Proof: Why More Rollouts Stop Helping Reasoning Models
099
How an Open-Book Trick Teaches a Model to Catch Its Own Mistakes
090
How MiniMax-M2 Bets That Sparsity Plus Verifiable Rewards Can Match Frontier Agents
011
When RL Actually Teaches Agents Something New, And When It Doesn't

Related terms

compiler
reinforcement learning
verifier