Explaining RAG

Steven Loyens

Using LlamaIndex for RAG in Python (02:47)

00:00 I know you’re quite eager to get coding, but there are two things to get through before you can start. So in this lesson, you’ll discover what RAG is and how you can use it, and then in the next lesson, you’ll set up your environment.

00:15 RAG stands for Retrieval-Augmented Generation, and it’s a technique where the system, at query time, first retrieves relevant external documents or data and then passes them to the LLM as additional context. So firstly, external data, what does that mean? Well, it’s data that is otherwise unavailable to the LLM.

00:38 So during the intro, I was talking about secret documents that are not in the public domain, for example. And then context, well, the context is considered a source of truth by the model. So that means that the response stays on topic, because the model treats the documents that you have provided as truth.
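
To make the retrieve-then-augment idea concrete, here is a minimal sketch that uses only the Python standard library. It is a toy, not how LlamaIndex does it: the document names and texts are made up for illustration, relevance is assumed to be plain word overlap, and no LLM is called. The sketch only builds the prompt that would be sent to one.

```python
import re

# Hypothetical "external data": stands in for private documents
# that the LLM has never seen.
DOCUMENTS = {
    "policy.txt": "Employees may work remotely up to three days per week.",
    "recipes.txt": "The apple pie needs six apples and a pinch of cinnamon.",
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda name: len(question_words & tokenize(documents[name])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Put the retrieved text in front of the question as context."""
    names = retrieve(question, documents, top_k)
    context = "\n".join(documents[name] for name in names)
    return (
        f"Context:\n{context}\n\n"
        "Answer using only the context above.\n"
        f"Question: {question}"
    )


prompt = build_prompt("How many days can employees work remotely?", DOCUMENTS)
print(prompt)
```

The prompt that comes out contains the policy text and not the recipe, which is the whole point: the model only sees the documents that are relevant to the question.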

00:59 There are five stages to RAG. Firstly, it’s loading. So that is retrieving data from its source, such as my wife’s recipes or the secret company policy that you’ll be working with.

01:11 Secondly, there’s indexing. So that means creating a data structure for querying the data, so that the AI model can actually interpret it. This involves things called nodes and embeddings, which you will come back to later in the course. Then there’s persisting.

01:28 That means storing the index and its metadata to avoid re-indexing. You will cover that as well later in the course. And then there’s querying, of course. That’s the fun bit.

01:39 That is using the indexed data as context and asking all sorts of interesting questions of your AI bot. Finally, there’s evaluation, which is checking the model output.
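
The first four stages can be sketched with the standard library alone. This is an illustration under loud assumptions, not LlamaIndex's implementation: the "embedding" here is just a word count, the index is a list of dictionaries saved as JSON, and every function name and document is hypothetical. Evaluation is left out.

```python
import json
import math
import os
import re
import tempfile
from collections import Counter


def load(raw_docs):
    """Stage 1, loading: the 'source' here is just an in-memory dict."""
    return [{"id": name, "text": text} for name, text in raw_docs.items()]


def embed(text):
    """Toy stand-in for an embedding: word counts, not learned vectors."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def build_index(docs):
    """Stage 2, indexing: keep each text next to its vector."""
    return [
        {"id": d["id"], "text": d["text"], "vector": dict(embed(d["text"]))}
        for d in docs
    ]


def persist(index, path):
    """Stage 3, persisting: save the index so it need not be rebuilt."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(path):
    """Read a previously persisted index back from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def query(index, question):
    """Stage 4, querying: return the entry most similar to the question."""
    q = embed(question)
    return max(index, key=lambda node: cosine(q, node["vector"]))


raw = {
    "policy": "Remote work is allowed three days per week.",
    "recipe": "Bake the apple pie for forty minutes.",
}
path = os.path.join(tempfile.mkdtemp(), "index.json")
persist(build_index(load(raw)), path)

index = load_index(path)  # a later run could start from this line
hit = query(index, "How long do I bake the pie?")
print(hit["id"], "->", hit["text"])
```

In a real RAG app, the text of the best match would then be handed to the LLM as context rather than printed.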

01:51 We won’t be doing stage five in this course, but it is something, of course, that in real life you should do. What’s LlamaIndex then? What’s that got to do with anything?

02:02 Well, it’s a Python framework that enables you to build AI-powered apps capable of performing RAG. It feeds LLMs with your own data, such as secret documents, through indexing and retrieval tools.

02:17 So it has all the tools built in to do RAG. It uses OpenAI’s LLM by default. So this is important. This is why you need the OpenAI API key to code along. But you can change the LLM it uses.
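
As a rough preview, loading, indexing, persisting, and querying might look something like this in LlamaIndex. Treat it as a sketch under assumptions rather than the course's code: it assumes a recent `llama-index` release (with the `llama_index.core` package) is installed, that an `OPENAI_API_KEY` environment variable is set for the default OpenAI model, and that a `data` folder with your documents exists. The function name and the folder names are made up for illustration.

```python
def answer_from_folder(question, data_dir="data", persist_dir="storage"):
    """Answer a question using the documents in data_dir as context."""
    # Imported here so that defining the function does not require
    # llama-index to be installed; calling it does.
    import os

    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if os.path.isdir(persist_dir):
        # Reuse the persisted index instead of re-indexing.
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        index = load_index_from_storage(storage_context)
    else:
        # Loading, then indexing, then persisting for next time.
        documents = SimpleDirectoryReader(data_dir).load_data()
        index = VectorStoreIndex.from_documents(documents)
        index.storage_context.persist(persist_dir=persist_dir)

    # Querying: retrieved text is passed to the LLM as context.
    return str(index.as_query_engine().query(question))
```

Calling `answer_from_folder("What does the policy say about remote work?")` would then send a request to the LLM, so it needs a valid API key and network access.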

02:33 As I said before, you could change it to the Google LLM, which at the time of recording was free to use.

02:41 The next step for you, then, is to set up your environment in the next lesson.
