Definition
A benchmark that tests whether AI assistants notice when facts they remember about you are no longer true.
A benchmark that evaluates LLM agents on implicit conflict detection in long-term memory, assessing state resolution, premise resistance, and downstream policy adaptation separately.
Also called: STALE