Training · Alignment · Updated 2026

RLHF (Reinforcement Learning from Human Feedback)

A training method that uses human preference ratings to teach a model which responses people actually prefer, making it more helpful and better aligned.

RLHF fits a reward model to human comparisons of model outputs, then optimises the language model against that reward with reinforcement learning. Christiano et al. (2017) applied learning from human preferences to deep reinforcement learning; Stiennon et al. (2020) applied it to summarisation; and Ouyang et al. (2022) used it to build InstructGPT, a recipe that many later instruction-following assistants built on.
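The first stage, fitting a reward model to pairwise comparisons, can be illustrated with a minimal sketch. It assumes a Bradley-Terry style loss, -log(sigmoid(r_chosen - r_rejected)), and a linear reward over two hand-made features; the feature names, the toy comparisons and the helper functions are all hypothetical and are not taken from the cited papers, which train neural reward models on real annotator data. The reinforcement learning stage is not shown.

```python
import math


def preference_loss(r_chosen, r_rejected):
    """Bradley-Terry negative log-likelihood: -log(sigmoid(r_chosen - r_rejected))."""
    diff = r_chosen - r_rejected
    # Numerically stable form of log(1 + exp(-diff)).
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def reward(weights, features):
    """Hypothetical linear reward model: a weighted sum of response features."""
    return sum(w * x for w, x in zip(weights, features))


def fit_reward_model(comparisons, n_features, lr=0.5, steps=200):
    """Fit the weights by plain gradient descent on the mean preference loss."""
    weights = [0.0] * n_features
    for _ in range(steps):
        grad = [0.0] * n_features
        for chosen, rejected in comparisons:
            diff = reward(weights, chosen) - reward(weights, rejected)
            # Derivative of -log(sigmoid(diff)) with respect to diff is -sigmoid(-diff).
            coeff = -1.0 / (1.0 + math.exp(diff))
            for i in range(n_features):
                grad[i] += coeff * (chosen[i] - rejected[i])
        weights = [w - lr * g / len(comparisons) for w, g in zip(weights, grad)]
    return weights


# Toy data (assumed): each response is (helpfulness, rudeness), and each pair
# is (response the annotator preferred, response the annotator rejected).
comparisons = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((0.8, 0.1), (0.3, 0.6)),
    ((0.9, 0.2), (0.9, 0.9)),
]

weights = fit_reward_model(comparisons, n_features=2)
for chosen, rejected in comparisons:
    print(reward(weights, chosen) > reward(weights, rejected))
```

On this toy data the fitted reward should rank every preferred response above its rejected counterpart. In a full RLHF pipeline the learned reward would then serve as the training signal for the reinforcement learning stage.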

RLHF is much of what separates a raw next-token predictor from a model that feels helpful, honest and safe to talk to.

References

Primary, peer-reviewed and archival sources for this definition.

Deep Reinforcement Learning from Human Preferences
Christiano, P., Leike, J., Brown, T. B., Martic, M., Legg, S., & Amodei, D. (2017). Advances in Neural Information Processing Systems 30 (NeurIPS 2017).
Learning to summarize from human feedback
Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
Training language models to follow instructions with human feedback
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., et al. (2022). Advances in Neural Information Processing Systems 35 (NeurIPS 2022).

Cite this entry

MultipleChat. "RLHF (Reinforcement Learning from Human Feedback)." MultipleChat AI & LLM Glossary, 2026. https://multiple.chat/ai-glossary/rlhf
