LAMBADA

Definition

Plain language

A benchmark that tests whether a model can predict the last word of a passage when doing so requires understanding the whole context, not just the final sentence.

As stated in the literature

A long-context word prediction benchmark requiring broad discourse understanding; a standard zero-shot evaluation for pretrained language models.

Also called: Lambada

Why it matters: It's a classic check that a model is doing real discourse-level prediction rather than just local pattern completion.

For example, the model reads a short paragraph that ends mid-sentence and must predict the final word, which only makes sense if it has tracked the whole story.
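
As a rough illustration of the scoring idea, here is a minimal Python sketch of last-word accuracy. It assumes a hypothetical `predict` callable that takes the context and returns one guessed word; the function names and the toy passage are invented for illustration and do not come from the dataset or from any particular evaluation harness. It also splits on whitespace for simplicity, whereas evaluations of real models often score the target at the token level.

```python
import string
from typing import Callable, Iterable, Tuple


def split_last_word(passage: str) -> Tuple[str, str]:
    """Split a passage into (context, target), where target is the final word."""
    context, _, target = passage.rstrip().rpartition(" ")
    return context, target


def normalize(word: str) -> str:
    """Lowercase a word and drop surrounding whitespace and punctuation."""
    return word.strip().strip(string.punctuation).lower()


def last_word_accuracy(
    passages: Iterable[str],
    predict: Callable[[str], str],  # hypothetical model interface (assumption)
) -> float:
    """Return the fraction of passages whose final word is predicted exactly."""
    total = 0
    correct = 0
    for passage in passages:
        context, target = split_last_word(passage)
        total += 1
        if normalize(predict(context)) == normalize(target):
            correct += 1
    return correct / total if total else 0.0


def repeat_previous_word(context: str) -> str:
    """Toy 'local pattern' baseline: guess the word that came just before."""
    return context.rsplit(" ", 1)[-1]


# Invented example passage, not taken from the benchmark.
TOY_PASSAGES = [
    "Mara packed the tent and the stove. At the trailhead she "
    "realized she had forgotten the tent",
]

baseline_score = last_word_accuracy(TOY_PASSAGES, repeat_previous_word)
```

The baseline only looks at the immediately preceding word, so it misses "tent" here; a predictor that tracks the earlier sentence could get it right, which is the gap the benchmark is meant to expose.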

Heard on the show

“… On downstream evaluations — Lambada and the CORE benchmark suite — the paper reports relative improvements up to nearly twenty percent …”

Episode 041 — When the Iteration Teaches the Model to Skip the Iteration

Mentioned in 2 episodes

Episode 041: When the Iteration Teaches the Model to Skip the Iteration
Episode 033: Echo: The Paper Arguing You Never Needed a KV Cache for Retrieval

Related terms

long-context pretraining zero-shot