Glossary · Term

MiRA

← all terms

Definition

A reinforcement-learning recipe for long-horizon agents that uses milestone progress as a dense reward signal.

A subgoal-driven RL post-training method that trains a 'potential critic' on interpolated subgoal-completion targets and uses its temporal differences as a potential-based shaping reward alongside the binary task reward.

Mentioned in 1 episode

  1. 008
    Why Long-Horizon AI Agents Get Stuck, and a Milestone-Based Fix That Helps