Glossary · Term

STALE benchmark

← all terms

Definition

A benchmark that tests whether AI assistants know when old facts about you are no longer true.

A benchmark evaluating LLM agents on implicit conflict detection in long-term memory, separating state resolution, premise resistance, and downstream policy adaptation.

Also called: STALE

Mentioned in 1 episode

  1. 031
    When Your AI Assistant Won't Let Go of Old Facts About You