Definition
A diagnostic for AI agents that swaps in perfect components one at a time to see where the error actually lives.
A three-rung agent-failure diagnostic procedure: measure baseline accuracy, substitute oracle retrieval over the agent's own memory, then substitute ground-truth answers directly; the gap closing at each rung attributes error to retrieval, write, or utilization failure.