Definition
A language model internally favors the right answer but still commits to a wrong word.
A class of hallucination in which the model assigns substantial probability mass to the correct concept, summed across its spelling variants, yet greedy decoding emits a token belonging to a wrong concept. It is the dominant failure mode of large instruction-tuned LLMs on short-form QA.
Also called: commitment failures
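
The definition can be illustrated with a toy example. The sketch below is not taken from any model or library: the probabilities are invented, the `normalize` rule (strip whitespace, lowercase) is only one assumed way to merge spelling variants, and the `min_mass` threshold is a hypothetical stand-in for "substantial". It shows how a wrong token can win the greedy argmax even though the correct concept holds more total mass once its variants are summed.

```python
from collections import defaultdict

# Hypothetical first-token distribution for "What is the capital of Australia?"
# These numbers are invented for illustration, not measured from a real model.
next_token_probs = {
    "Canberra": 0.28,
    " Canberra": 0.22,
    "canberra": 0.08,
    "Sydney": 0.34,
    "Melbourne": 0.08,
}


def normalize(token: str) -> str:
    """Map a surface form to a concept key (assumed rule: strip and lowercase)."""
    return token.strip().lower()


def greedy_token(probs: dict[str, float]) -> str:
    """Return the single highest-probability token, as greedy decoding would."""
    return max(probs, key=probs.get)


def concept_mass(probs: dict[str, float]) -> dict[str, float]:
    """Sum token probabilities over all spelling variants of each concept."""
    mass: dict[str, float] = defaultdict(float)
    for token, p in probs.items():
        mass[normalize(token)] += p
    return dict(mass)


def is_commitment_failure(
    probs: dict[str, float], correct_concept: str, min_mass: float = 0.5
) -> bool:
    """True if the correct concept holds at least `min_mass` in total
    (an assumed threshold) but the greedy token belongs to another concept."""
    mass = concept_mass(probs).get(correct_concept, 0.0)
    emitted = normalize(greedy_token(probs))
    return mass >= min_mass and emitted != correct_concept


if __name__ == "__main__":
    print("greedy token:", repr(greedy_token(next_token_probs)))
    print("concept mass:", concept_mass(next_token_probs))
    print("commitment failure:", is_commitment_failure(next_token_probs, "canberra"))
```

In this toy distribution the greedy token is "Sydney" at 0.34, while the three Canberra variants together hold about 0.58, so the check flags the case. Splitting mass across variants is what lets the wrong token win the per-token argmax.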