experience replay

Definition

Plain language

Storing rare past successes and showing them to a model again during training.

As stated in the literature

A reinforcement-learning technique that maintains a buffer of past trajectories—often rare successful ones—and samples from it during training to improve learning on sparse-reward tasks.

Also called: replay buffer

Why it matters: Without replay, rare wins are drowned out by far more common failures, so the model gets little training signal from its breakthroughs.

For example, an agent that finally solves a hard level once will be shown that successful run many more times during training so the lesson sticks.
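
As a rough illustration of the idea, here is a minimal Python sketch of a buffer that keeps only successful trajectories and mixes some of them back into each training batch. The names SuccessReplayBuffer, build_batch, and replay_fraction are hypothetical, and the "reward > 0 means success" rule is an assumption made for the example. It is not the recipe discussed in the episode.

```python
import random
from collections import deque


class SuccessReplayBuffer:
    """Hypothetical buffer that stores only successful trajectories."""

    def __init__(self, capacity=1000, seed=0):
        # A bounded deque drops the oldest success once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, trajectory, reward):
        # Assumption: any positive reward counts as a success worth keeping.
        if reward > 0:
            self.buffer.append(trajectory)

    def sample(self, k):
        # Sample with replacement so one rare win can appear many times.
        if not self.buffer:
            return []
        return [self.rng.choice(self.buffer) for _ in range(k)]


def build_batch(fresh, buffer, replay_fraction=0.25):
    """Append replayed successes to a batch of freshly collected trajectories."""
    n_replay = int(len(fresh) * replay_fraction)
    return list(fresh) + buffer.sample(n_replay)


if __name__ == "__main__":
    buf = SuccessReplayBuffer(capacity=100)
    buf.add("failed attempt", reward=0.0)   # ignored
    buf.add("solved the level", reward=1.0)  # stored
    print(build_batch(["new run 1", "new run 2", "new run 3", "new run 4"], buf))
```

Sampling with replacement is what lets a single stored win show up repeatedly across batches. A real system would also have to decide how stale a stored trajectory can be before it stops being useful.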

Heard on the show

“The second mechanism is experience replay, and Bella, this is the one I think is genuinely elegant.”

Episode 048 — How a 30B Open Model Reached Olympiad Gold With the Right Recipe

Mentioned in 1 episode

048
How a 30B Open Model Reached Olympiad Gold With the Right Recipe

Related terms

trajectory