Definition
A way of generating text that samples each next token only from the most likely candidates, ignoring the long tail of unlikely ones.
A decoding method, also called top-p sampling, that samples the next token from the smallest set of tokens whose cumulative probability reaches or exceeds a threshold p, with the probabilities renormalized over that set.
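
The definition can be made concrete with a minimal sketch, using only the Python standard library. The function names `nucleus_filter` and `sample_top_p` are hypothetical, chosen for illustration. The sketch assumes `probs` is an already-normalized next-token distribution and treats the threshold as inclusive (the set is closed once its cumulative mass is >= p); actual implementations may differ on both points.

```python
import random


def nucleus_filter(probs, p):
    """Return {token_index: renormalized_prob} for the smallest set of
    top-ranked tokens whose cumulative probability is >= p.

    Assumption: probs is a normalized distribution over token indices.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Rank token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, mass = [], 0.0
    for i in order:
        nucleus.append(i)
        mass += probs[i]
        if mass >= p:  # inclusive threshold: an assumption of this sketch
            break
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in nucleus}


def sample_top_p(probs, p, rng=random):
    """Draw one token index from the renormalized nucleus."""
    kept = nucleus_filter(probs, p)
    indices = list(kept)
    weights = [kept[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]  # toy distribution over four tokens
    print(sorted(nucleus_filter(probs, 0.9)))  # indices kept in the nucleus
    print(sample_top_p(probs, 0.9, random.Random(0)))
```

With the toy distribution above and p = 0.9, the first two tokens cover only 0.8 of the mass, so the third is added and the fourth (the tail) is dropped. Sorting the full vocabulary on every step is the simplest approach, not the fastest.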