A vector database stores documents alongside their embeddings and answers queries by nearest-neighbour search in vector space. Because exact nearest-neighbour search is expensive at scale, these systems rely on approximate methods such as the HNSW graph index of Malkov & Yashunin (2018), and GPU-accelerated search libraries like FAISS (Johnson et al., 2017).
The result is sub-second semantic retrieval over millions of passages — the storage layer that makes retrieval-augmented generation practical.