Retrieval-Augmented Generation (RAG) combines a retriever that searches an external document collection with a generator language model. Lewis et al. (2020) introduced it, pairing a parametric seq2seq model with a non-parametric dense vector index of Wikipedia and showing that it produced more factual and specific text and set state-of-the-art results on three open-domain question answering tasks.
Most modern RAG systems retrieve with dense embeddings, following Dense Passage Retrieval (Karpukhin et al., 2020); the idea of joining retrieval to a language model was also developed in REALM (Guu et al., 2020).
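The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any of the cited papers: `embed` is a hashed bag-of-words stand-in for a learned dense encoder, the passages are invented, and the generator call is left out, so the sketch stops at the prompt that would be handed to a language model. All function names are hypothetical.

```python
import math
import zlib

DIM = 256  # toy embedding width (assumption; real encoders learn their vectors)


def embed(text: str) -> list[float]:
    """Hashed bag-of-words vector, L2-normalised. Stand-in for a learned encoder."""
    vec = [0.0] * DIM
    for raw in text.lower().split():
        token = raw.strip(".,;:?!()\"'")
        if token:
            # crc32 is deterministic across runs, unlike the built-in hash() on str
            vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages with the highest inner product against the query."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query: str, evidence: list[str]) -> str:
    """Place the retrieved passages ahead of the question for the generator."""
    sources = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(evidence))
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {query}"


# Invented example passages (assumption, for illustration only)
passages = [
    "RAG pairs a retriever with a generator language model.",
    "Dense Passage Retrieval encodes questions and passages as vectors.",
    "The index can be updated without retraining the generator.",
]
question = "How does dense passage retrieval encode questions?"
print(build_prompt(question, retrieve(question, passages, k=2)))
```

Scoring by inner product over normalised vectors follows the general shape of dense retrieval. A real system would likely replace `embed` with a trained encoder and the linear scan with an approximate nearest-neighbour index.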
Why it matters at MultipleChat
Because the knowledge lives in an external, updatable index rather than in the model's frozen weights, RAG is a standard remedy for stale or fabricated facts. MultipleChat grounds each model in the same retrieved sources, then cross-checks the models' answers against that evidence.
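To make the cross-checking idea concrete, here is a deliberately crude sketch. It is not MultipleChat's implementation, which is not described here: it assumes answers arrive as plain strings keyed by hypothetical model names, and it treats an answer as supported when most of its content words also appear in the shared evidence. The stopword list, the `threshold` value, and all function names are illustrative assumptions.

```python
import re

# Small illustrative stopword list (assumption)
STOPWORDS = {"the", "a", "an", "in", "of", "was", "is", "it", "and", "to"}


def content_tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def support_score(answer: str, evidence: list[str]) -> float:
    """Fraction of the answer's content tokens that occur in the evidence."""
    answer_tokens = content_tokens(answer)
    if not answer_tokens:
        return 0.0
    evidence_tokens = set().union(*(content_tokens(p) for p in evidence))
    return len(answer_tokens & evidence_tokens) / len(answer_tokens)


def cross_check(answers: dict[str, str], evidence: list[str],
                threshold: float = 0.6) -> dict[str, bool]:
    """Flag each model's answer as supported (True) or not (False)."""
    return {name: support_score(text, evidence) >= threshold
            for name, text in answers.items()}


# Invented evidence and answers (assumption, for illustration only)
evidence = ["The Eiffel Tower was completed in 1889 in Paris."]
answers = {
    "model_a": "The Eiffel Tower was completed in 1889.",
    "model_b": "The Eiffel Tower opened in 1925 in London.",
}
print(cross_check(answers, evidence))
```

Lexical overlap is only a proxy: it cannot tell a faithful paraphrase from a contradiction that reuses the same words, so a practical checker would more plausibly rely on an entailment model or a judging model. The point of the sketch is the structure, in which every answer is scored against the same retrieved evidence.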