Glossary · Term

path-efficiency reward

← all terms

Definition

A reward that pays an AI agent more when it finishes a task in fewer steps than its peers.

In ToolCUA's RL recipe, a group-relative reward term granting a linear bonus for sub-average step counts and applying an exponentially decaying penalty above average, asymmetric to encourage shorter trajectories.

Mentioned in 1 episode

  1. 066
    Why Giving an AI Agent More Tools Can Make It Worse at Using a Computer