Definition
When the answers to a test have leaked into a model's training data, making the score misleading.
The presence of evaluation data in a model's training corpus, inflating apparent benchmark performance and undermining held-out evaluation.
When the answers to a test have leaked into a model's training data, making the score misleading.
The presence of evaluation data in a model's training corpus, inflating apparent benchmark performance and undermining held-out evaluation.