Decoding Updated 2026

Nucleus Sampling (Top-p)

A decoding method that samples the next token from the smallest set of most likely tokens whose cumulative probability reaches at least p, avoiding both repetitive and incoherent text.

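Holtzman et al. (2020) showed that always choosing the most likely tokens, as greedy and beam search do, makes generated text bland and repetitive, while sampling from the full distribution makes it incoherent because the long tail of unlikely tokens is picked too often. Their fix, nucleus (top-p) sampling, restricts each choice to the dynamic "nucleus": the smallest set of most likely tokens whose cumulative probability is at least p. The probabilities inside that set are renormalized, and the next token is sampled from it.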
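
The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than any library's implementation: the function names `nucleus` and `sample_top_p` and the toy distribution are hypothetical, and a real decoder would work on the model's full vocabulary distribution instead of a small dict.

```python
import random

def nucleus(probs, p):
    """Return the renormalized top-p nucleus of a token -> probability mapping."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        mass += prob
        if mass >= p:  # smallest prefix whose cumulative mass reaches p
            break
    # Renormalize so the kept probabilities sum to 1.
    return {token: prob / mass for token, prob in kept}

def sample_top_p(probs, p, rng=random):
    """Draw one token from the nucleus (rng is the random module or a random.Random)."""
    nuc = nucleus(probs, p)
    return rng.choices(list(nuc), weights=list(nuc.values()), k=1)[0]

# Toy next-token distribution (made up for illustration).
toy = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
print(nucleus(toy, 0.75))       # only "the" and "a" survive, renormalized
print(sample_top_p(toy, 0.75))  # one token drawn from that nucleus
```

With p = 0.75 the two most likely tokens already cover 0.8 of the mass, so the tail ("cat", "zebra") can never be sampled. With a flatter distribution the same p would keep more tokens, which is what makes the nucleus dynamic.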

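Most chat APIs expose top-p alongside temperature so that the two can be tuned together to balance variety against coherence; the related top-k method instead keeps a fixed number k of the most likely candidates, however the probability is spread among them.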
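
To make the contrast with top-k concrete, the sketch below counts how many candidates each method keeps as temperature changes. It assumes temperature is applied to the logits before the top-p cut, which is a common arrangement but not something every API guarantees; the logits and the helper names `softmax` and `top_p_size` are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_size(probs, p):
    """Number of tokens in the smallest set whose cumulative probability is >= p."""
    mass = 0.0
    for n, prob in enumerate(sorted(probs, reverse=True), start=1):
        mass += prob
        if mass >= p:
            return n
    return len(probs)

# Made-up logits for five candidate tokens.
logits = [4.0, 2.0, 1.0, 0.5, 0.0]
k = 3  # top-k always keeps exactly min(k, vocabulary size) candidates

for temperature in (0.5, 1.0, 2.0):
    probs = softmax(logits, temperature)
    print(temperature, "top-k keeps", min(k, len(probs)),
          "| top-p keeps", top_p_size(probs, 0.9))
```

In this toy case top-k keeps three candidates at every temperature, while the p = 0.9 nucleus grows from one token to four as a higher temperature flattens the distribution.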

References

Primary, peer-reviewed and archival sources for this definition.

The Curious Case of Neural Text Degeneration
Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2020). International Conference on Learning Representations (ICLR 2020).

Cite this entry

MultipleChat. "Nucleus Sampling (Top-p)." MultipleChat AI & LLM Glossary, 2026. https://multiple.chat/ai-glossary/nucleus-sampling
