Glossary · Term

counterfactual ladder

Definition

A diagnostic for AI agents that swaps in perfect components one at a time to see where the error actually lives.

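A three-rung procedure for diagnosing agent failures: first measure baseline accuracy, then substitute oracle retrieval over the agent's own memory, then supply the ground-truth answers directly. The accuracy gap that closes at each rung attributes the error: a gap closed by oracle retrieval points to retrieval failure, a gap closed only by ground truth points to write failure (the needed information never reached memory), and any error that remains points to utilization failure.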
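The attribution arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the episode's implementation: the function name `counterfactual_ladder`, the `LadderResult` fields, and the reading of the last rung (error that persists even with ground truth is counted as utilization failure) are assumptions, and the accuracies in the usage line are made-up numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LadderResult:
    """Share of accuracy attributed to each failure type (hypothetical names)."""
    retrieval_gap: float    # closed by swapping in oracle retrieval
    write_gap: float        # closed only by supplying ground truth
    utilization_gap: float  # error left even with ground truth in hand


def counterfactual_ladder(baseline_acc: float,
                          oracle_retrieval_acc: float,
                          ground_truth_acc: float) -> LadderResult:
    """Attribute error from three accuracy measurements, each in [0, 1].

    Assumed reading of the ladder:
      rung 1: baseline agent
      rung 2: same agent, oracle retrieval over its own memory
      rung 3: same agent, ground-truth answers supplied directly
    """
    for name, acc in (("baseline_acc", baseline_acc),
                      ("oracle_retrieval_acc", oracle_retrieval_acc),
                      ("ground_truth_acc", ground_truth_acc)):
        if not 0.0 <= acc <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {acc}")

    return LadderResult(
        retrieval_gap=oracle_retrieval_acc - baseline_acc,
        write_gap=ground_truth_acc - oracle_retrieval_acc,
        utilization_gap=1.0 - ground_truth_acc,
    )


# Illustrative numbers only; they are not taken from the episode.
result = counterfactual_ladder(0.5, 0.75, 0.875)
print(result)
```

The three gaps plus the baseline accuracy sum to 1, so every point of error is assigned to exactly one rung. A negative gap can appear when the measurements are noisy; the sketch reports it as-is rather than clipping it.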

Mentioned in 1 episode

086
Why Frozen-Weight Agents Still Get Worse Over Time