Definition
A test that hides one specific fact somewhere inside a long document and checks whether the AI can find it.
A long-context retrieval evaluation that inserts a unique fact at varied positions across an extended input and measures exact recall on the corresponding query.
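The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the helper names (`build_haystack`, `run_eval`, `recall`), the filler text, the inserted fact, and `toy_model` are all hypothetical. `toy_model` is a stand-in that reads only the last 1500 characters of its input; a real evaluation would call an actual model there. Scoring here is strict string equality after trimming whitespace, which is one possible reading of "exact recall".

```python
import re
from typing import Callable, Dict, List

# Hypothetical test data: filler sentences, one unique fact, and its query.
FILLER: List[str] = [f"Filler sentence number {i}." for i in range(100)]
NEEDLE = "The secret code is 7421."
QUESTION = "What is the secret code?"
ANSWER = "7421"


def build_haystack(filler: List[str], needle: str, depth: float) -> str:
    """Insert `needle` among `filler` at a relative depth (0 = start, 1 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    index = round(depth * len(filler))
    return " ".join(filler[:index] + [needle] + filler[index:])


def toy_model(context: str, question: str) -> str:
    """Stand-in for a real model: it only 'sees' the last 1500 characters."""
    window = context[-1500:]
    match = re.search(r"secret code is (\d+)", window)
    return match.group(1) if match else "unknown"


def run_eval(
    model: Callable[[str, str], str],
    filler: List[str],
    needle: str,
    question: str,
    answer: str,
    depths: List[float],
) -> Dict[float, bool]:
    """Place the needle at each depth and record whether the reply matches exactly."""
    results: Dict[float, bool] = {}
    for depth in depths:
        context = build_haystack(filler, needle, depth)
        reply = model(context, question)
        results[depth] = reply.strip() == answer
    return results


def recall(results: Dict[float, bool]) -> float:
    """Fraction of insertion depths at which the fact was retrieved."""
    return sum(results.values()) / len(results)


results = run_eval(toy_model, FILLER, NEEDLE, QUESTION, ANSWER,
                   [0.0, 0.25, 0.5, 0.75, 1.0])
for depth, found in results.items():
    print(f"depth={depth:.2f} found={found}")
print(f"recall={recall(results):.2f}")
```

Reporting the result per insertion depth, rather than only the overall score, shows whether retrieval fails at particular positions in the input.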