Concept · 7 episode(s)

Self-Play / Self-Evolution

Definition

Self-play trains a model by having it play against versions of itself (in games, in dialog, in debate), so that the steadily improving opponent acts as an automatic curriculum. It's how AlphaZero learned chess, and it recurs as a template wherever a detailed reward is hard to specify but a win/loss outcome is cheap to check.
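To make the idea concrete, here is a minimal sketch of self-play on a toy game. It is not AlphaZero or any system from the episodes. It assumes single-pile Nim (remove 1 to 3 stones, whoever takes the last stone wins) and a tabular Monte Carlo learner. The names `self_play_train`, `choose`, and `greedy_move` are made up for illustration. Both sides read and update the same value table, so the opponent gets stronger exactly as fast as the learner does, and the only training signal is who won.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # assumed toy rule: remove 1-3 stones; taking the last stone wins


def legal_moves(pile):
    """Moves available with `pile` stones left."""
    return range(1, min(MAX_TAKE, pile) + 1)


def choose(q, pile, rng, epsilon):
    """Epsilon-greedy move for whichever player is to act."""
    moves = list(legal_moves(pile))
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda m: q[(pile, m)])


def self_play_train(episodes=20000, max_pile=12, epsilon=0.2, seed=0):
    """Learn move values purely from games the policy plays against itself."""
    rng = random.Random(seed)
    q = defaultdict(float)     # shared value table: both sides use and update it
    visits = defaultdict(int)  # visit counts for sample-average updates
    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        history = []           # (pile, move) pairs; players alternate each ply
        while pile > 0:
            move = choose(q, pile, rng, epsilon)
            history.append((pile, move))
            pile -= move
        # Whoever made the final move won. Walk backwards and flip the sign
        # each ply so every move is scored from its own mover's point of view.
        outcome = 1.0
        for key in reversed(history):
            visits[key] += 1
            q[key] += (outcome - q[key]) / visits[key]
            outcome = -outcome
    return q


def greedy_move(q, pile):
    """Best move under the learned values, with no exploration."""
    return max(legal_moves(pile), key=lambda m: q[(pile, m)])


if __name__ == "__main__":
    q = self_play_train()
    print({pile: greedy_move(q, pile) for pile in range(1, 9)})
```

No move is ever labeled good or bad by hand. The win/loss result at the end of each game is the whole reward, and the curriculum comes from the opponent improving alongside the learner. Larger systems typically replace the table with a neural network and often play against frozen earlier snapshots rather than the live policy, but the loop has the same shape.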

Episodes covering this

Worth reading next

Papers we haven't done a deep dive on yet, but would recommend on this topic.