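Holtzman et al. (2020) showed that always choosing the most likely tokens makes generated text bland and repetitive, while sampling from the full distribution makes it incoherent. Their fix, nucleus (top-p) sampling, restricts each step to the "nucleus": the smallest set of most likely tokens whose cumulative probability reaches a threshold p. The probabilities inside that set are renormalized and the next token is sampled from it. Because the set is recomputed at every step, it shrinks when the model is confident and grows when the distribution is flat.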
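
The selection rule can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation. The function names `nucleus_filter` and `sample_top_p` are hypothetical, and the input is assumed to be a list of next-token probabilities that already sums to 1.

```python
import random

def nucleus_filter(probs, p):
    """Return {token_index: renormalized_prob} for the smallest set of
    most likely tokens whose cumulative probability reaches p.
    (Hypothetical helper; assumes probs sums to roughly 1.)"""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= p:  # the nucleus is complete once it holds mass p
            break
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}

def sample_top_p(probs, p, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, p)
    tokens = list(nucleus)
    weights = [nucleus[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

With `probs = [0.5, 0.3, 0.1, 0.1]` and `p = 0.85`, the first two tokens hold only 0.8 of the mass, so the nucleus extends to a third token and the last one is excluded. With `p = 0.4`, the nucleus is the single most likely token and sampling becomes deterministic.
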
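Top-p is commonly exposed alongside temperature in text-generation APIs to balance variety against coherence; the related top-k method instead keeps a fixed number k of the most likely candidates, regardless of how the probability mass is spread across them.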
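
To make the two companion knobs concrete, here is a small standard-library sketch. It assumes the usual definition of temperature as a divisor applied to the logits before the softmax. The names `softmax_with_temperature` and `top_k_filter` are hypothetical and do not come from any particular API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature.
    Values below 1 sharpen the distribution; values above 1 flatten it."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities.
    Unlike a top-p nucleus, the candidate count is always k (or fewer if
    the vocabulary is smaller)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in ranked)
    return {i: probs[i] / mass for i in ranked}
```

For the same logits, a temperature of 0.5 gives the top token a larger share of the probability than a temperature of 2.0 does. `top_k_filter` with `k = 2` always returns exactly two candidates, whether the distribution is peaked or flat, which is the behavior that distinguishes it from a nucleus whose size adapts to the distribution.
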