Concept · 1 episode

Reward Channel Addiction

Definition

Reward Channel Addiction is a failure mode in which a trained agent comes to depend on directly observing its reward signal (for example, a visible scoreboard) to select actions, rather than learning a policy grounded in the underlying task state. An agent that needs to “see its score” to act correctly is brittle: if the reward channel is obscured, corrupted, or manipulated, its behavior collapses even when all task-relevant information is still available.
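
One way to make the definition concrete is an ablation check: run the same policy with and without the reward channel visible and compare success rates. The sketch below is a toy illustration only. The 1-D corridor task, the two hand-written policies, and names such as run_episode, mask_reward, and success_rate are all hypothetical and are not taken from any episode or library.

```python
# Toy diagnostic: does a policy still work when the reward channel is hidden?
# The corridor task, both policies, and the mask_reward flag are hypothetical.

def run_episode(policy, start, goal, mask_reward=False, max_steps=20):
    """Run one episode in a 1-D corridor; return True if the goal is reached."""
    pos, last_reward, last_action = start, 0.0, 1
    for _ in range(max_steps):
        if pos == goal:
            return True
        # The observation exposes task state plus the (optionally masked) reward.
        obs = {
            "pos": pos,
            "goal": goal,
            "reward": 0.0 if mask_reward else last_reward,
            "last_action": last_action,
        }
        action = policy(obs)  # -1 = step left, +1 = step right
        new_pos = pos + action
        # Shaped reward: +1 if the step moved closer to the goal, else -1.
        last_reward = 1.0 if abs(goal - new_pos) < abs(goal - pos) else -1.0
        pos, last_action = new_pos, action
    return pos == goal


def grounded_policy(obs):
    """Acts on task state only: step toward the goal."""
    return 1 if obs["goal"] > obs["pos"] else -1


def reward_dependent_policy(obs):
    """Acts on the reward channel only: repeat a rewarded action, else reverse."""
    if obs["reward"] > 0:
        return obs["last_action"]
    return -obs["last_action"]


def success_rate(policy, mask_reward):
    """Fraction of a fixed set of (start, goal) cases the policy solves."""
    cases = [(0, 5), (0, -3), (2, 7), (4, -2)]
    wins = sum(run_episode(policy, s, g, mask_reward) for s, g in cases)
    return wins / len(cases)


if __name__ == "__main__":
    for name, policy in [("grounded", grounded_policy),
                         ("reward-dependent", reward_dependent_policy)]:
        visible = success_rate(policy, mask_reward=False)
        hidden = success_rate(policy, mask_reward=True)
        print(f"{name}: visible={visible:.2f} hidden={hidden:.2f}")
```

In this toy setup both policies solve every case while the reward is visible. Once it is masked, the grounded policy is unaffected, whereas the reward-dependent policy oscillates in place and fails every case, even though the position and goal are still present in the observation. A large gap between the two conditions is the signature described above.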

Episodes covering this