greedy decoding · Glossary · AI Papers: A Deep Dive

Definition

Plain language

The simplest way for a language model to generate text — at each step it just picks the single most likely next word.

As stated in the literature

A decoding strategy that selects the argmax token at each step from the model's output distribution; the standard choice for factual QA and the regime in which commitment failures are most visible.
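In symbols, the definition above can be sketched as follows. The notation is an assumption chosen for illustration, not taken from a specific paper: x is the prompt, y_{<t} the tokens generated so far, V the vocabulary, and p_θ the model's output distribution.

```latex
y_t = \arg\max_{v \in \mathcal{V}} \; p_\theta\left(v \mid x, y_{<t}\right), \qquad t = 1, 2, \dots
```

Each y_t is fixed as soon as it is chosen and becomes part of the context for step t + 1, which is why an early choice cannot be revised later.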

Why it matters: It's the cheapest and most reproducible way to generate text, since it needs no sampling and no search, but it commits to each token as soon as it is chosen and can lock in mistakes that more exploratory decoders would avoid.

For example, a model prompted with '2 + 2 = ' under greedy decoding emits whichever single token has the highest probability at each step, even if a lower-ranked token would have led to a more accurate continuation.
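A minimal sketch of that behaviour, assuming a hypothetical two-step toy "model" stored as a lookup table (the tokens and probabilities in `TOY_MODEL` are made up for illustration, and `greedy_decode` and `best_sequence` are illustrative helper names, not a library API). It compares the sequence greedy decoding commits to with the most probable sequence found by exhaustive search.

```python
# Hypothetical toy "model": maps a prefix (tuple of tokens) to a
# next-token probability distribution. All numbers are invented.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.55, "y": 0.45},
    ("B",): {"x": 0.9, "y": 0.1},
}


def greedy_decode(model, steps):
    """Pick the argmax token at every step and never revisit a choice."""
    prefix, prob = (), 1.0
    for _ in range(steps):
        dist = model[prefix]
        token = max(dist, key=dist.get)  # argmax over the next-token distribution
        prob *= dist[token]
        prefix += (token,)
    return prefix, prob


def best_sequence(model, steps):
    """Exhaustively score every sequence (only feasible for toy sizes)."""
    def expand(prefix, prob, remaining):
        if remaining == 0:
            yield prefix, prob
            return
        for token, p in model[prefix].items():
            yield from expand(prefix + (token,), prob * p, remaining - 1)

    return max(expand((), 1.0, steps), key=lambda pair: pair[1])


greedy_seq, greedy_prob = greedy_decode(TOY_MODEL, 2)
best_seq, best_prob = best_sequence(TOY_MODEL, 2)

print(greedy_seq, round(greedy_prob, 2))  # ('A', 'x') 0.33
print(best_seq, round(best_prob, 2))      # ('B', 'x') 0.36
```

In this toy table, greedy decoding commits to "A" because it is the likeliest first token, and so never reaches the sequence starting with "B", whose overall probability is higher. Real models have far larger vocabularies, which is why exhaustive search is replaced in practice by approximations such as beam search or sampling.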

Heard on the show

“On a small Gemma model, success goes from thirty-three percent with greedy decoding up to thirty-nine with progress-advantage selection.”

Episode 173 — The Free Step-Level Grader Hiding in Every RL Training Run

Mentioned in 4 episodes

Related terms

argmax · commitment failure · decoding · token