Glossary · Term

needle in a haystack

← all terms

Definition

A test that hides one specific fact deep inside a long document and checks if the AI can find it.

A long-context retrieval evaluation that inserts a unique fact at varied positions across an extended input and measures exact recall on the corresponding query.

Mentioned in 2 episodes

  1. 072
    A Robot Made Graphene Without Help, And Caught Itself Hallucinating
  2. 014
    Why a Constrained Pipeline Beat a Full Coding Agent at Finding Bugs 30-to-1