retrieval-augmented generation (RAG)

Retrieval‑Augmented Generation (RAG) is a technique in which a system first retrieves relevant external documents at query time and then passes them to a large language model (LLM) as additional context for generating its answer.

A typical RAG pipeline involves these steps:

  • Encode the user query
  • Search a knowledge source, such as a vector index or web/document corpus
  • Rank and select the most relevant passages
  • Assemble those passages into the prompt so the generator can produce an answer grounded in retrieved evidence
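The steps above can be sketched end to end in plain Python. This is a minimal toy example: the bag-of-words embedding and the small in-memory corpus are stand-ins for a real neural encoder and vector index, and the function names (`embed`, `retrieve`, `build_prompt`) are illustrative, not from any particular library.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; real pipelines use a neural encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = [
    "RAG retrieves documents before generating an answer",
    "Transformers use self-attention over token sequences",
    "Vector indexes support fast nearest-neighbor search",
]
index = [(doc, embed(doc)) for doc in corpus]

def retrieve(query, k=2):
    q = embed(query)                                         # 1. encode the query
    scored = [(cosine(q, vec), doc) for doc, vec in index]   # 2. search the knowledge source
    scored.sort(reverse=True)                                # 3. rank by similarity
    return [doc for _, doc in scored[:k]]                    #    and select the top-k passages

def build_prompt(query, passages):
    # 4. assemble retrieved passages into the prompt for the generator
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

query = "How does RAG work?"
print(build_prompt(query, retrieve(query)))
```

In a production system, the final prompt would be sent to an LLM, which generates an answer grounded in the retrieved context rather than in its parameters alone.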

RAG improves on purely parametric models in factuality, timeliness, and ease of updates, because you can refresh the knowledge source instead of retraining the whole model. However, output quality still depends heavily on retrieval coverage, chunking strategy, and ranking/selection quality.
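Chunking, one of the quality factors mentioned above, determines how documents are split into retrievable passages. A minimal sketch of one common strategy, a fixed-size sliding window with overlap, might look like this (the sizes are arbitrary and would be tuned per corpus):

```python
def chunk(text, size=40, overlap=10):
    # Fixed-size sliding-window chunking: consecutive chunks share
    # `overlap` characters so sentences split at a boundary still
    # appear whole in at least one chunk.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Overlapping windows trade a larger index for better recall at chunk boundaries; other strategies split on sentences, paragraphs, or document structure instead of raw character counts.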


By Leodanis Pozo Ramos • Updated Nov. 3, 2025