Glossary · Term

REINFORCE

← all terms

Definition

A classic reinforcement-learning rule where you push the model toward whatever it did when it won, and away from what it did when it lost.

A Monte Carlo policy-gradient algorithm that updates parameters in the direction of log-probability of sampled actions weighted by return.

Mentioned in 1 episode

  1. 060
    When Splitting One Model Across Three Agents Doubles Its Accuracy

Related concepts