Definition
A benchmark for testing whether an AI search agent picks the right web source.
An evaluation suite for source selection in generative search, run under both clean and GEO-poisoned conditions (GEO: generative engine optimization), used to probe how susceptible retrieval agents are to persuasion.
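One way to read the clean versus poisoned comparison is as a drop in source-selection accuracy between the two conditions. The sketch below is only an illustration under assumptions: the `Trial` record, the condition labels `"clean"` and `"poisoned"`, and the `accuracy_drop` score are hypothetical names introduced here, not the benchmark's actual schema or metric.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trial:
    """One source-selection decision (hypothetical record layout)."""
    condition: str  # assumed labels: "clean" or "poisoned"
    chosen: str     # source the agent selected
    correct: str    # source the benchmark treats as the right one


def accuracy(trials: List[Trial], condition: str) -> float:
    """Fraction of trials in one condition where the agent picked the right source."""
    subset = [t for t in trials if t.condition == condition]
    if not subset:
        raise ValueError(f"no trials for condition {condition!r}")
    return sum(t.chosen == t.correct for t in subset) / len(subset)


def accuracy_drop(trials: List[Trial]) -> float:
    """Assumed susceptibility proxy: clean accuracy minus poisoned accuracy."""
    return accuracy(trials, "clean") - accuracy(trials, "poisoned")


# Toy data for illustration only, not benchmark results.
example_trials = [
    Trial("clean", "site-a", "site-a"),
    Trial("clean", "site-b", "site-b"),
    Trial("poisoned", "site-a", "site-a"),
    Trial("poisoned", "site-x", "site-b"),
]
```

With the toy trials above, `accuracy_drop(example_trials)` compares a clean accuracy of 2/2 with a poisoned accuracy of 1/2. A larger drop would suggest the agent is more easily steered by optimized content, under the assumption that accuracy is the quantity of interest.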