Getting Structured Output From LLMs
00:00 In this lesson, you’re going to learn how to use Pydantic AI to get structured output out of an LLM. And for that, it’s important to take a step back and realize that Pydantic, without the AI, is a package that was originally developed to let you do runtime type validation using type hints.
00:22 In Pydantic, you define a class, you specify the attributes for that class and their types, and at runtime, Pydantic will make sure that you only get values of the appropriate types. So Pydantic validates your data against your class.
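A minimal sketch of that plain-Pydantic behavior, before any AI enters the picture. The City model here is just an illustrative example:

```python
from pydantic import BaseModel, ValidationError


class City(BaseModel):
    name: str
    population: int


# Pydantic coerces compatible values at runtime ("37000000" -> 37000000)...
city = City(name="Tokyo", population="37000000")
print(type(city.population))  # <class 'int'>

# ...and rejects values that can't become the declared type.
try:
    City(name="Tokyo", population="lots")
except ValidationError:
    print("population must be an integer")
```

Note that validation happens when the instance is created, not when the class is defined.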
00:40
Pydantic AI took this functionality and brought it to the world of LLMs. So you’re going to define a class that’s going to inherit from Pydantic’s BaseModel, and in your class, you define all of the attributes that you want.
00:54 Those attributes represent the information you want from the LLM, and you specify their types. When you make a request to the LLM through Pydantic AI, you just say, hey, I want the response to have this particular type, and you point it to your own model.
01:10 And so Pydantic AI will reach out to the LLM, it will make a request, and when it gets the answer from the LLM, it will try to parse the appropriate values with the appropriate types to build the model you specified.
01:24
So let’s see this in action. Go ahead and create a file called structured_output.py. Now, the first thing you need is to go to Pydantic, the base package, not Pydantic AI, but just Pydantic, and from Pydantic, you want to import BaseModel. So all of your models, when you need structured output, should inherit from BaseModel.
01:48
And from Pydantic AI, you’re going to need your Agent. Suppose you want to ask your LLM about different countries, and you want to get information about those countries. Well, you could define a model called CountryInfo that inherits from BaseModel, and the information you care about, that’s going to be the name of the country, obviously, and the capital, both of which are strings, and then the population, which you annotate as an integer using type hints. And then you just instantiate your agent by passing in a string that identifies the provider and the name of the model you want to use.
02:32
And now, when you interact with the agent through the method run_sync,
02:36
you can ask your question or you can say something like, tell me about Japan, which is a country, and then as a second argument, you want to say that the output type should be CountryInfo.
02:49 And when you’re done, you just want to print the output attribute of the result.
02:55
Now, run your script with the -i flag so that you’re dropped into an interactive prompt after the code runs. You can see that the structured information was printed on the screen: name equals Japan, capital equals Tokyo, population equals 125 million people.
03:14
And if you inspect the output of the result, you can see that it is an instance of the model you specified. So you could do something like result.output.population if you wanted to access the population directly. And this is the core benefit of using Pydantic AI to interact with LLMs: the ability to get structured output that comes neatly packaged in the models that you define.
03:41 Now, this interaction with the LLM and the output type you specify, it’s not deterministic, and it’s not guaranteed to work every time. So in the next lesson, you’re going to understand how to control the reliability of this technique using Pydantic AI.
