Concept · 1 episode(s)

Reward Variance

Definition

Reward variance is how noisy the reward signal is across runs or samples. High variance makes credit assignment harder and policy-gradient estimates noisier, which is why a lot of effective RL work is really variance-reduction work in disguise.
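
To make the "variance reduction in disguise" point concrete, here is a minimal sketch of one common trick: subtracting a baseline from the reward in a REINFORCE-style gradient estimate. Everything in it is an assumption for illustration, not something from an episode: the two-armed bandit, the reward values (10 and 11 plus unit Gaussian noise), the baseline of 10.5, and the helper name grad_samples are all hypothetical.

```python
import numpy as np

def grad_samples(theta, n, baseline, rng):
    """Per-sample REINFORCE gradient estimates for a toy 2-armed bandit.

    Policy: P(arm 1) = sigmoid(theta).
    Hypothetical rewards: arm 0 pays 10 + noise, arm 1 pays 11 + noise.
    """
    p1 = 1.0 / (1.0 + np.exp(-theta))
    actions = (rng.random(n) < p1).astype(float)   # 1.0 where arm 1 was chosen
    rewards = 10.0 + actions + rng.normal(0.0, 1.0, size=n)
    score = actions - p1                           # d/dtheta of log pi(action)
    return score * (rewards - baseline)

rng = np.random.default_rng(0)
theta = 0.0

# Same estimator, with and without a constant baseline (assumed value 10.5).
no_baseline = grad_samples(theta, 100_000, baseline=0.0, rng=rng)
with_baseline = grad_samples(theta, 100_000, baseline=10.5, rng=rng)

print("mean gradient, no baseline:  ", no_baseline.mean())
print("mean gradient, with baseline:", with_baseline.mean())
print("variance, no baseline:  ", no_baseline.var())
print("variance, with baseline:", with_baseline.var())
```

Both estimators target the same expected gradient, because a constant baseline multiplied by the score function has zero mean. What changes is the spread of the individual samples: most of the raw reward here is an offset shared by both arms, and the baseline removes it, so the per-sample estimates should come out far less noisy.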
