Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Using Pydantic to Simplify Python Data Validation
Pydantic is a powerful data validation and settings management library for Python, engineered to enhance the robustness and reliability of your codebase. From basic tasks, such as checking whether a variable is an integer, to more complex tasks, like ensuring highly-nested dictionary keys and values have the correct data types, Pydantic can handle just about any data validation scenario with minimal boilerplate code.
In this tutorial, you’ll learn how to:
- Work with data schemas with Pydantic’s
BaseModel
- Write custom validators for complex use cases
- Validate function arguments with Pydantic’s
@validate_call
- Manage settings and configure applications with
pydantic-settings
Throughout this tutorial, you’ll get hands-on examples of Pydantic’s functionalities, and by the end you’ll have a solid foundation for your own validation use cases. Before starting this tutorial, you’ll benefit from having an intermediate understanding of Python and object-oriented programming.
Get Your Code: Click here to download the free sample code that you’ll use to help you learn how Pydantic can help you simplify data validation in Python.
Take the Quiz: Test your knowledge with our interactive “Pydantic: Simplifying Data Validation in Python” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Pydantic: Simplifying Data Validation in PythonIn this quiz, you'll test your understanding of Pydantic, a powerful data validation library for Python. You'll revisit concepts such as working with data schemas, writing custom validators, validating function arguments, and managing settings with pydantic-settings.
Python’s Pydantic Library
One of Python’s main attractions is that it’s a dynamically typed language. Dynamic typing means that variable types are determined at runtime, unlike statically typed languages where they are explicitly declared at compile time. While dynamic typing is great for rapid development and ease of use, you often need more robust type checking and data validation for real-world applications. This is where Python’s Pydantic library has you covered.
Pydantic has quickly gained popularity, and it’s now the most widely used data validation library for Python. In this first section, you’ll get an overview of Pydantic and a preview of the library’s powerful features. You’ll also learn how to install Pydantic along with the additional dependencies you’ll need for this tutorial.
Getting Familiar With Pydantic
Pydantic is a powerful Python library that leverages type hints to help you easily validate and serialize your data schemas. This makes your code more robust, readable, concise, and easier to debug. Pydantic also integrates well with many popular static typing tools and IDEs, which allows you to catch schema issues before running your code.
Some of Pydantic’s distinguishing features include:
-
Customization: There’s almost no limit to the kinds of data you can validate with Pydantic. From primitive Python types to highly nested data structures, Pydantic lets you validate and serialize nearly any Python object.
-
Flexibility: Pydantic gives you control over how strict or lax you want to be when validating your data. In some cases, you might want to coerce incoming data to the correct type. For example, you could accept data that’s intended to be a float but is received as an integer. In other cases, you might want to strictly enforce the data types you’re receiving. Pydantic enables you to do either.
-
Serialization: You can serialize and deserialize Pydantic objects as dictionaries and JSON strings. This means that you can seamlessly convert your Pydantic objects to and from JSON. This capability has led to self-documenting APIs and integration with just about any tool that supports JSON schemas.
-
Performance: Thanks to its core validation logic written in Rust, Pydantic is exceptionally fast. This performance advantage gives you swift and reliable data processing, especially in high-throughput applications such as REST APIs that need to scale to a large number of requests.
-
Ecosystem and Industry Adoption: Pydantic is a dependency of many popular Python libraries such as FastAPI, LangChain, and Polars. It’s also used by most of the largest tech companies and throughout many other industries. This is a testament to Pydantic’s community support, reliability, and resilience.
These are a few key features that make Pydantic an attractive data validation library, and you’ll get to see these in action throughout this tutorial. Up next, you’ll get an overview of how to install Pydantic along with its various dependencies.
Installing Pydantic
Pydantic is available on PyPI, and you can install it with pip. Open a terminal or command prompt, create a new virtual environment, and then run the following command to install Pydantic:
(venv) $ python -m pip install pydantic
This command will install the latest version of Pydantic from PyPI onto your machine. To verify that the installation was successful, start a Python REPL and import Pydantic:
>>> import pydantic
If the import runs without error, then you’ve successfully installed Pydantic, and you now have the core of Pydantic installed on your system.
Adding Optional Dependencies
You can install optional dependencies with Pydantic as well. For example, you’ll be working with email validation in this tutorial, and you can include these dependencies in your install:
(venv) $ python -m pip install "pydantic[email]"
Pydantic has a separate package for settings management, which you’ll also cover in this tutorial. To install this, run the following command:
(venv) $ python -m pip install pydantic-settings
With that, you’ve installed all the dependencies you’ll need for this tutorial, and you’re ready to start exploring Pydantic. You’ll start by covering models—Pydantic’s primary way of defining data schemas.
Using Models
Pydantic’s primary way of defining data schemas is through models. A Pydantic model is an object, similar to a Python dataclass, that defines and stores data about an entity with annotated fields. Unlike dataclasses, Pydantic’s focus is centered around automatic data parsing, validation, and serialization.
The best way to understand this is to create your own models, and that’s what you’ll do next.
Working With Pydantic BaseModels
Suppose you’re building an application used by a human resources department to manage employee information, and you need a way to verify that new employee information is in the correct form. For example, each employee should have an ID, name, email, date of birth, salary, department, and benefits selection. This is a perfect use case for a Pydantic model!
To define your employee model, you create a class that inherits from Pydantic’s BaseModel
:
pydantic_models.py
from datetime import date
from uuid import UUID, uuid4
from enum import Enum
from pydantic import BaseModel, EmailStr
class Department(Enum):
HR = "HR"
SALES = "SALES"
IT = "IT"
ENGINEERING = "ENGINEERING"
class Employee(BaseModel):
employee_id: UUID = uuid4()
name: str
email: EmailStr
date_of_birth: date
salary: float
department: Department
elected_benefits: bool
First, you import the dependencies you need to define your employee model. You then create an enum to represent the different departments in your company, and you’ll use this to annotate the department
field in your employee model.
Then, you define your Pydantic model, Employee
, which inherits from BaseModel
and defines the names and expected types of your employee fields via annotations. Here’s a breakdown of each field you’ve defined in Employee
and how Pydantic validates it when an Employee
object is instantiated:
employee_id
: This is the UUID for the employee you want to store information for. By using theUUID
annotation, Pydantic ensures this field is always a valid UUID. Each instance ofEmployee
will be assigned a UUID by default, as you specified by callinguuid4()
.name
: The employee’s name, which Pydantic expects to be a string.email
: Pydantic will ensure that each employeeemail
is valid by using Python’semail-validator
library under the hood.date_of_birth
: Each employee’s date of birth must be a valid date, as annotated bydate
from Python’sdatetime
module. If you pass a string intodate_of_birth
, Pydantic will attempt to parse and convert it to adate
object.salary
: This is the employee’s salary, and it’s expected to be a float.department
: Each employee’s department must be one ofHR
,SALES
,IT
, orENGINEERING
, as defined in yourDepartment
enum.elected_benefits
: This field stores whether the employee has elected benefits, and Pydantic expects it to be a Boolean.
The simplest way to create an Employee
object is to instantiate it as you would any other Python object. To do this, open a Python REPL and run the following code:
>>> from pydantic_models import Employee
>>> Employee(
... name="Chris DeTuma",
... email="cdetuma@example.com",
... date_of_birth="1998-04-02",
... salary=123_000.00,
... department="IT",
... elected_benefits=True,
... )
Employee(
employee_id=UUID('73636d47-373b-40cd-a005-4819a84d9ea7'),
name='Chris DeTuma',
email='cdetuma@example.com',
date_of_birth=datetime.date(1998, 4, 2),
salary=123000.0,
department=<Department.IT: 'IT'>,
elected_benefits=True
)
In this block, you import Employee
and create an object with all of the required employee fields. Pydantic successfully validates and coerces the fields you passed in, and it creates a valid Employee
object. Notice how Pydantic automatically converts your date string into a date
object and your IT
string to its respective Department
enum.
Next, look at how Pydantic responds when you try to pass invalid data to an Employee
instance:
>>> Employee(
... employee_id="123",
... name=False,
... email="cdetumaexamplecom",
... date_of_birth="1939804-02",
... salary="high paying",
... department="PRODUCT",
... elected_benefits=300,
... )
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError: 7 validation errors for
Employee
employee_id
Input should be a valid UUID, invalid length: expected length 32 for
simple format, found 3 [type=uuid_parsing, input_value='123',
input_type=str] For further information visit
https://errors.pydantic.dev/2.6/v/uuid_parsing
name
Input should be a valid string [type=string_type, input_value=False,
input_type=bool] For further information visit
https://errors.pydantic.dev/2.6/v/string_type
email
value is not a valid email address: The email address is not valid.
It must have exactly one @-sign. [type=value_error,
input_value='cdetumaexamplecom', input_type=str]
date_of_birth
Input should be a valid date or datetime, invalid date separator,
expected `-` [type=date_from_datetime_parsing,
input_value='1939804-02', input_type=str] For further information
visit https://errors.pydantic.dev/2.6/v/date_from_datetime_parsing
salary
Input should be a valid number, unable to parse string as a number
[type=float_parsing, input_value='high paying', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/float_parsing
department
Input should be 'HR', 'SALES', 'IT' or 'ENGINEERING'
[type=enum, input_value='PRODUCT', input_type=str]
elected_benefits
Input should be a valid boolean, unable to interpret input
[type=bool_parsing, input_value=300, input_type=int]
For further information visit
https://errors.pydantic.dev/2.6/v/bool_parsing
In this example, you created an Employee
object with invalid data fields. Pydantic gives you a detailed error message for each field, telling you what was expected, what was received, and where you can go to learn more about the error.
This detailed validation is powerful because it prevents you from storing invalid data in Employee
. This also gives you confidence that the Employee
objects you instantiate without errors contain the data you’re expecting, and you can trust this data downstream in your code or in other applications.
Pydantic’s BaseModel
is equipped with a suite of methods that make it easy to create models from other objects, such as dictionaries and JSON. For example, if you want to instantiate an Employee
object from a dictionary, you can use the .model_validate()
class method:
>>> new_employee_dict = {
... "name": "Chris DeTuma",
... "email": "cdetuma@example.com",
... "date_of_birth": "1998-04-02",
... "salary": 123_000.00,
... "department": "IT",
... "elected_benefits": True,
... }
>>> Employee.model_validate(new_employee_dict)
Employee(
employee_id=UUID('73636d47-373b-40cd-a005-4819a84d9ea7'),
name='Chris DeTuma',
email='cdetuma@example.com',
date_of_birth=datetime.date(1998, 4, 2),
salary=123000.0,
department=<Department.IT: 'IT'>,
elected_benefits=True
)
Here, you create new_employee_dict
, a dictionary with your employee fields, and pass it into .model_validate()
to create an Employee
instance. Under the hood, Pydantic validates each dictionary entry to ensure it conforms with the data you’re expecting. If any of the data is invalid, Pydantic will throw an error in the same way you saw previously. You’ll also be notified if any fields are missing from the dictionary.
You can do the same thing with JSON objects using .model_validate_json()
:
>>> new_employee_json = """
... {"employee_id":"d2e7b773-926b-49df-939a-5e98cbb9c9eb",
... "name":"Eric Slogrenta",
... "email":"eslogrenta@example.com",
... "date_of_birth":"1990-01-02",
... "salary":125000.0,
... "department":"HR",
... "elected_benefits":false}
... """
>>> new_employee = Employee.model_validate_json(new_employee_json)
>>> new_employee
Employee(
employee_id=UUID('d2e7b773-926b-49df-939a-5e98cbb9c9eb'),
name='Eric Slogrenta',
email='eslogrenta@example.com',
date_of_birth=datetime.date(1990, 1, 2),
salary=125000.0,
department=<Department.HR: 'HR'>,
elected_benefits=False
)
In this example, new_employee_json
is a valid JSON string that stores your employee fields, and you use .model_validate_json()
to validate and create an Employee
object from new_employee_json
. While it may seem subtle, the ability to create and validate Pydantic models from JSON is powerful because JSON is one of the most popular ways to transfer data across the web. This is one of the reasons why FastAPI relies on Pydantic to create REST APIs.
You can also serialize Pydantic models as dictionaries and JSON:
>>> new_employee.model_dump()
{
'employee_id': UUID('d2e7b773-926b-49df-939a-5e98cbb9c9eb'),
'name': 'Eric Slogrenta',
'email': 'eslogrenta@example.com',
'date_of_birth': datetime.date(1990, 1, 2),
'salary': 125000.0,
'department': <Department.HR: 'HR'>,
'elected_benefits': False
}
>>> new_employee.model_dump_json()
'{"employee_id":"d2e7b773-926b-49df-939a-5e98cbb9c9eb",
⮑"name":"Eric Slogrenta",
⮑"email":"eslogrenta@example.com",
⮑"date_of_birth":"1990-01-02",
⮑"salary":125000.0,
⮑"department":"HR",
⮑"elected_benefits":false}'
Here, you use .model_dump()
and .model_dump_json()
to convert your new_employee
model to a dictionary and JSON string, respectively. Notice how .model_dump_json()
returns a JSON object with date_of_birth
and department
stored as strings.
While Pydantic already validated these fields and converted your model to JSON, whoever uses this JSON downstream won’t know that date_of_birth
needs to be a valid date
and department
needs to be a category in your Department
enum. To solve this, you can create a JSON schema from your Employee
model.
JSON schemas tell you what fields are expected and what values are represented in a JSON object. You can think of this as the JSON version of your Employee
class definition. Here’s how you generate a JSON schema for Employee
:
>>> Employee.model_json_schema()
{
'$defs': {
'Department': {
'enum': ['HR', 'SALES', 'IT', 'ENGINEERING'],
'title': 'Department',
'type': 'string'
}
},
'properties': {
'employee_id': {
'default': '73636d47-373b-40cd-a005-4819a84d9ea7',
'format': 'uuid',
'title': 'Employee Id',
'type': 'string'
},
'name': {'title': 'Name', 'type': 'string'},
'email': {
'format': 'email',
'title': 'Email',
'type': 'string'
},
'date_of_birth': {
'format': 'date',
'title': 'Date Of Birth',
'type': 'string'
},
'salary': {'title': 'Salary', 'type': 'number'},
'department': {'$ref': '#/$defs/Department'},
'elected_benefits': {'title': 'Elected Benefits', 'type': 'boolean'}
},
'required': [
'name',
'email',
'date_of_birth',
'salary',
'department',
'elected_benefits'
],
'title': 'Employee',
'type': 'object'
}
When you call .model_json_schema()
, you get a dictionary representing your model’s JSON schema. The first entry you see shows you the values that department
can take on. You also see information about how your fields should be formatted. For instance, according to this JSON schema, employee_id
is expected to be a UUID and date_of_birth
is expected to be a date.
You can convert your JSON schema to a JSON string using json.dumps()
, which enables just about any programming language to validate JSON objects produced by your Employee
model. In other words, not only can Pydantic validate incoming data and serialize it as JSON, but it also provides other programming languages with the information they need to validate your model’s data via JSON schemas.
With that, you now understand how to use Pydantic’s BaseModel
to validate and serialize your data. Up next, you’ll learn how to use fields to further customize your validation.
Using Fields for Customization and Metadata
So far, your Employee
model validates the data type of each field and ensures some of the fields, such as email
, date_of_birth
, and department
, take on valid formats. However, let’s say you also want to ensure that salary
is a positive number, name
isn’t an empty string, and email
contains your company’s domain name. You can use Pydantic’s Field
class to accomplish this.
The Field
class allows you to customize and add metadata to your model’s fields. To see how this works, take a look at this example:
pydantic_models.py
from datetime import date
from uuid import UUID, uuid4
from enum import Enum
from pydantic import BaseModel, EmailStr, Field
class Department(Enum):
HR = "HR"
SALES = "SALES"
IT = "IT"
ENGINEERING = "ENGINEERING"
class Employee(BaseModel):
employee_id: UUID = Field(default_factory=uuid4, frozen=True)
name: str = Field(min_length=1, frozen=True)
email: EmailStr = Field(pattern=r".+@example\.com$")
date_of_birth: date = Field(alias="birth_date", repr=False, frozen=True)
salary: float = Field(alias="compensation", gt=0, repr=False)
department: Department
elected_benefits: bool
Here, you import Field
along with the other dependencies you used previously, and you assign default values to some of the Employee
fields. Here’s a breakdown of the Field
parameters you used to add additional validation and metadata to your fields:
default_factory
: You use this to define a callable that generates default values. In the example above, you setdefault_factory
touuid4
. This callsuuid4()
to generate a random UUID foremployee_id
when needed. You can also use a lambda function for more flexibility.frozen
: This is a Boolean parameter you can set to make your fields immutable. This means, whenfrozen
is set toTrue
, the corresponding field can’t be changed after your model is instantiated. In this example,employee_id
,name
, anddate_of_birth
are made immutable using thefrozen
parameter.min_length
: You can control the length of string fields withmin_length
andmax_length
. In the example above, you ensure thatname
is at least one character long.pattern
: For string fields, you can setpattern
to a regex expression to match whatever pattern you’re expecting for that field. For instance, when you use the regex expression in the example above foremail
, Pydantic will ensure that every email ends with@example.com
.alias
: You can use this parameter when you want to assign an alias to your fields. For example, you can allowdate_of_birth
to be calledbirth_date
orsalary
to be calledcompensation
. You can use these aliases when instantiating or serializing a model.gt
: This parameter, short for “greater than”, is used for numeric fields to set minimum values. In this example, settinggt=0
ensuressalary
is always a positive number. Pydantic also has other numeric constraints, such aslt
which is short for “less than”.repr
: This Boolean parameter determines whether a field is displayed in the model’s field representation. In this example, you won’t seedate_of_birth
orsalary
when you print anEmployee
instance.
To see this extra validation in action, notice what happens when you try to create an Employee
model with incorrect data:
>>> from pydantic_models import Employee
>>> incorrect_employee_data = {
... "name": "",
... "email": "cdetuma@fakedomain.com",
... "birth_date": "1998-04-02",
... "salary": -10,
... "department": "IT",
... "elected_benefits": True,
... }
>>> Employee.model_validate(incorrect_employee_data)
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError: 3 validation errors for
Employee
name
String should have at least 1 character [type=string_too_short,
input_value='', input_type=str] For further information visit
https://errors.pydantic.dev/2.6/v/string_too_short
email
String should match pattern '.+@example\.com$'
[type=string_pattern_mismatch,
input_value='cdetuma@fakedomain.com', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/string_pattern_mismatch
salary
Input should be greater than 0 [type=greater_than, input_value=-10,
input_type=int] For further information visit
https://errors.pydantic.dev/2.6/v/greater_than
Here, you import your updated Employee
model and attempt to validate a dictionary with incorrect data. In response, Pydantic gives you three validation errors saying the name
needs to be at least one character, email
should match your company’s domain name, and salary
should be greater than zero.
Now notice the additional features you get when you validate correct Employee
data:
>>> employee_data = {
... "name": "Clyde Harwell",
... "email": "charwell@example.com",
... "birth_date": "2000-06-12",
... "compensation": 100_000,
... "department": "ENGINEERING",
... "elected_benefits": True,
... }
>>> employee = Employee.model_validate(employee_data)
>>> employee
Employee(
employee_id=UUID('614c6f75-8528-4272-9cfc-365ddfafebd9'),
name='Clyde Harwell',
email='charwell@example.com',
department=<Department.ENGINEERING: 'ENGINEERING'>,
elected_benefits=True)
>>> employee.salary
100000.0
>>> employee.date_of_birth
datetime.date(2000, 6, 12)
In this block, you create a dictionary and an Employee
model with .model_validate()
. In employee_data
, notice how you used birth_date
instead of date_of_birth
and compensation
instead of salary
. Pydantic recognizes these aliases and assigns their values to the correct field name internally.
Because you set repr=False
, you can see that salary
and date_of_birth
aren’t displayed in the Employee
representation. You have to explicitly access them as attributes to see their values. Lastly, notice what happens when you try to change a frozen field:
>>> employee.department = "HR"
>>> employee.name = "Andrew TuGrendele"
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError: 1
validation error for Employee
name
Field is frozen [type=frozen_field, input_value='Andrew TuGrendele',
input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/frozen_field
Here, you first change the value of department
from IT
to HR
. This is perfectly acceptable because department
isn’t a frozen field. However, when you try to change name
, Pydantic gives you an error saying that name
is a frozen field.
You now have a solid grasp of Pydantic’s BaseModel
and Field
classes. With these alone, you can define many different validation rules and metadata on your data schemas, but sometimes this isn’t enough. Up next, you’ll take your field validation even further with Pydantic validators.
Working With Validators
Up to this point, you’ve used Pydantic’s BaseModel
to validate model fields with predefined types, and you incorporated Field
to further customize your validation. While you can get pretty far with BaseModel
and Field
alone, for more complicated validation scenarios that require custom logic, you’ll need to use Pydantic validators.
With validators, you can execute just about any validation logic that you can express in a function. You’ll see how to do this next.
Validating Models and Fields
Continuing with the employee example, suppose your company has a policy that they only hire employees who are at least eighteen years old. Any time you create a new Employee
object, you need to make sure the employee is older than eighteen. To handle this, you could add an age
field and use the Field
class to enforce that the employee is at least eighteen. However, this seems redundant since you’ve already stored the employee’s birth date.
A better solution is to use a Pydantic field validator. Field validators allow you to apply custom validation logic to your BaseModel
fields by adding class methods to your model. To enforce that all employees are at least eighteen, you can add the following Field
validator to your Employee
model:
pydantic_models.py
from datetime import date
from uuid import UUID, uuid4
from enum import Enum
from pydantic import BaseModel, EmailStr, Field, field_validator
class Department(Enum):
HR = "HR"
SALES = "SALES"
IT = "IT"
ENGINEERING = "ENGINEERING"
class Employee(BaseModel):
employee_id: UUID = Field(default_factory=uuid4, frozen=True)
name: str = Field(min_length=1, frozen=True)
email: EmailStr = Field(pattern=r".+@example\.com$")
date_of_birth: date = Field(alias="birth_date", repr=False, frozen=True)
salary: float = Field(alias="compensation", gt=0, repr=False)
department: Department
elected_benefits: bool
@field_validator("date_of_birth")
@classmethod
def check_valid_age(cls, date_of_birth: date) -> date:
today = date.today()
eighteen_years_ago = date(today.year - 18, today.month, today.day)
if date_of_birth > eighteen_years_ago:
raise ValueError("Employees must be at least 18 years old.")
return date_of_birth
In this block, you import field_validator
and use it to decorate a class method in Employee
called .check_valid_age()
. Field validators must be defined a class methods. In .check_valid_age()
, you calculate today’s date but eighteen years ago. If the employee’s date_of_birth
is after that date, an error is raised.
To see how this validator works, check out this example:
>>> from pydantic_models import Employee
>>> from datetime import date, timedelta
>>> young_employee_data = {
... "name": "Jake Bar",
... "email": "jbar@example.com",
... "birth_date": date.today() - timedelta(days=365 * 17),
... "compensation": 90_000,
... "department": "SALES",
... "elected_benefits": True,
... }
>>> Employee.model_validate(young_employee_data)
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError:
1 validation error for Employee
birth_date
Value error, Employees must be at least 18 years old.
[type=value_error, input_value=datetime.date(2007, 4, 10),
input_type=date]
For further information visit
https://errors.pydantic.dev/2.6/v/value_error
In this example, you specify a birth_date
that is seventeen years behind the current date. When you call .model_validate()
to validate young_employee_data
, you get an error saying employees must be at least eighteen years old.
As you can imagine, Pydantic’s field_validator()
enables you to arbitrarily customize field validation. However, field_validator()
won’t work if you want to compare multiple fields to one another or validate your model as a whole. For this, you’ll need to use model validators.
As an example, suppose your company only hires contract workers in the IT department. Because of this, IT workers don’t qualify for benefits and their elected_benefits
field should be False
. You can use Pydantic’s model_validator()
to enforce this constraint:
pydantic_models.py
from typing import Self
from datetime import date
from uuid import UUID, uuid4
from enum import Enum
from pydantic import (
BaseModel,
EmailStr,
Field,
field_validator,
model_validator,
)
class Department(Enum):
HR = "HR"
SALES = "SALES"
IT = "IT"
ENGINEERING = "ENGINEERING"
class Employee(BaseModel):
employee_id: UUID = Field(default_factory=uuid4, frozen=True)
name: str = Field(min_length=1, frozen=True)
email: EmailStr = Field(pattern=r".+@example\.com$")
date_of_birth: date = Field(alias="birth_date", repr=False, frozen=True)
salary: float = Field(alias="compensation", gt=0, repr=False)
department: Department
elected_benefits: bool
@field_validator("date_of_birth")
@classmethod
def check_valid_age(cls, date_of_birth: date) -> date:
today = date.today()
eighteen_years_ago = date(today.year - 18, today.month, today.day)
if date_of_birth > eighteen_years_ago:
raise ValueError("Employees must be at least 18 years old.")
return date_of_birth
@model_validator(mode="after")
def check_it_benefits(self) -> Self:
department = self.department
elected_benefits = self.elected_benefits
if department == Department.IT and elected_benefits:
raise ValueError(
"IT employees are contractors and don't qualify for benefits"
)
return self
Here, you add Python’s Self
type and Pydantic’s model_validator()
to your imports. You then create a method, .check_it_benefits()
, that raises an error if the employee belongs to the IT department and the elected_benefits
field is True
. When you set mode
to after
in @model_validator
, Pydantic waits until after you’ve instantiated your model to run .check_it_benefits()
.
Note: You may have noticed that .check_it_benefits()
is annotated with Python’s Self
type. This is because .check_it_benefits()
returns the Employee
class instance, and the Self
type is the preferred annotation for this. If you’re using a Python version less than 3.11, you’ll have to import the Self
type from typing_extensions
.
To see your new model validator in action, check out this example:
>>> from pydantic_models import Employee
>>> new_employee = {
... "name": "Alexis Tau",
... "email": "ataue@example.com",
... "birth_date": "2001-04-012",
... "compensation": 100_000,
... "department": "IT",
... "elected_benefits": True,
... }
>>> Employee.model_validate(new_employee)
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError: 1 validation error for
Employee
Value error, IT employees are contractors and don't qualify for
benefits.
[type=value_error, input_value={'name': 'Alexis Tau',
...elected_benefits': True},
input_type=dict]
For further information visit
https://errors.pydantic.dev/2.6/v/value_error
In this example, you try to create an Employee
model with an IT
department and elected_benefits
set to True
. When you call .model_validate()
, Pydantic throws an error letting you know that IT employees don’t qualify for benefits because they’re contractors.
With model and field validators, you can implement just about any custom validation you can think of. You should now have a solid foundation to create Pydantic models for your own use cases. Up next, you’ll shift gears and look at how you can use Pydantic to validate arbitrary functions, not just BaseModel
fields.
Using Validation Decorators to Validate Functions
While BaseModel
is Pydantic’s bread and butter class for validating data schemas, you can also use Pydantic to validate function arguments using the @validate_call
decorator. This allows you to create robust functions with informative type errors without having to manually implement validation logic.
To see how this works, suppose you’re writing a function that sends invoices to clients after they’ve made a purchase. Your function takes in the client’s name, email, items purchased, and total billing amount, and it constructs and sends them an email. You need to validate all of these inputs because getting them wrong could result in the email not being sent, being misformatted, or the client being invoiced incorrectly.
To accomplish this, you write the following function:
validate_functions.py
import time
from typing import Annotated
from pydantic import PositiveFloat, Field, EmailStr, validate_call
@validate_call
def send_invoice(
client_name: Annotated[str, Field(min_length=1)],
client_email: EmailStr,
items_purchased: list[str],
amount_owed: PositiveFloat,
) -> str:
email_str = f"""
Dear {client_name}, \n
Thank you for choosing xyz inc! You
owe ${amount_owed:,.2f} for the following items: \n
{items_purchased}
"""
print(f"Sending email to {client_email}...")
time.sleep(2)
return email_str
First, you import the dependencies needed to write and annotate send_invoice()
. You then create send_invoice()
decorated with @validate_call
. Before executing send_invoice()
, @validate_call
ensures that each input conforms to your annotations. In this case, @validate_call
checks whether client_name
has at least one character, client_email
is properly formatted, items_purchased
is a list of strings, and amount_owed
is a positive float.
If one of the inputs doesn’t conform to your annotation, Pydantic will throw an error similar to what you’ve seen already with BaseModel
. If all the inputs are valid, send_invoice()
creates a string and simulates sending it to your client with time.sleep(2)
.
Note: You might have noticed that client_name
is annotated with Python’s Annotated
type. In general, you can use Annotated
when you want to provide metadata about a function argument. Pydantic recommends using Annotated
when you need to validate a function argument that has metadata specified by Field
.
However, if you use default_factory
to assign a default value to your function argument, you should assign the argument directly to a Field
instance. You can see an example of this in Pydantic’s documentation.
To see @validate_call
and send_invoice()
in action, open a new Python REPL and run the following code:
>>> from validate_functions import send_invoice
>>> send_invoice(
... client_name="",
... client_email="ajolawsonfakedomain.com",
... items_purchased=["pie", "cookie", 17],
... amount_owed=0,
... )
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError: 4 validation errors for
send_invoice
client_name
String should have at least 1 character [type=string_too_short,
input_value='', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/string_too_short
client_email
value is not a valid email address: The email address is not valid.
It must have exactly one @-sign. [type=value_error,
input_value='ajolawsonfakedomain.com', input_type=str]
items_purchased.2
Input should be a valid string [type=string_type, input_value=17,
input_type=int]
For further information visit
https://errors.pydantic.dev/2.6/v/string_type
amount_owed
Input should be greater than 0 [type=greater_than, input_value=0,
input_type=int]
For further information visit
https://errors.pydantic.dev/2.6/v/greater_than
In this example, you import send_invoice()
and pass in invalid function arguments. Pydantic’s @validate_call
recognizes this and throws errors telling you that client_name
needs at least one character, client_email
is invalid, items_purchased
should contain strings, and amount_owed
should be greater than zero.
When you pass in valid inputs, send_invoice()
runs as expected:
>>> email_str = send_invoice(
... client_name="Andrew Jolawson",
... client_email="ajolawson@fakedomain.com",
... items_purchased=["pie", "cookie", "cake"],
... amount_owed=20,
... )
Sending email to ajolawson@fakedomain.com...
>>> print(email_str)
Dear Andrew Jolawson,
Thank you for choosing xyz inc! You
owe $20.00 for the following items:
['pie', 'cookie', 'cake']
While @validate_call
isn’t as flexible as BaseModel
, you can still use it to apply powerful validation to your function arguments. This saves you a lot of time and lets you avoid writing boilerplate type checking and validation logic. If you’ve done this before, you know how cumbersome it can be to write assert
statements for each of your function arguments. For many use cases, @validate_call
takes care of this for you.
In the final section of this tutorial, you’ll learn how you can use Pydantic for settings management and configuration.
Managing Settings
One of the most popular ways to configure Python applications is with environment variables. An environment variable is a variable that lives in the operating system, outside of your Python code, but can be read by your code or other programs. Examples of data you’d want to store as environment variables include secret keys, database credentials, API credentials, server addresses, and access tokens.
Environment variables often change between development and production, and many contain sensitive information. Because of this, you need a robust way to parse, validate, and integrate environment variables in your code. This is a perfect use case of pydantic-settings
, and that’s what you’ll explore in this section.
Configuring Applications With BaseSettings
pydantic-settings
is one of the most powerful ways to manage environment variables in Python, and it has been widely adopted and recommended by popular libraries like FastAPI. You can use pydantic-settings
to create models, similar to BaseModel
, that parse and validate environment variables.
The main class in pydantic-settings
is BaseSettings
, and it has all of the same functionalities as BaseModel
. However, if you create a model that inherits from BaseSettings
, the model initializer will try to read any fields not passed as keyword arguments from environment variables.
To see how this works, suppose your application connects to a database and another API service. Your database credentials and API key can change over time and often change depending on which environment you’re deploying in. To handle this, you can create the following BaseSettings
model:
settings_management.py
from pydantic import HttpUrl, Field
from pydantic_settings import BaseSettings
class AppConfig(BaseSettings):
database_host: HttpUrl
database_user: str = Field(min_length=5)
database_password: str = Field(min_length=10)
api_key: str = Field(min_length=20)
In this script, you import the dependencies needed to create your BaseSettings
model. Notice that you import BaseSettings
from pydantic_settings
with an underscore instead of a dash. You then define a model, AppConfig
, that inherits from BaseSettings
and stores fields about your database and API key. In this example, database_host
has to be a valid HTTP URL, and the remaining fields have a minimum length constraint.
Next, open a terminal and add the following environment variables. If you’re on Linux, macOS, or Windows Bash, you can do this with the export
command:
(venv) $ export DATABASE_HOST="http://somedatabaseprovider.us-east-2.com"
(venv) $ export DATABASE_USER="username"
(venv) $ export DATABASE_PASSWORD="asdfjl348ghl@9fhsl4"
(venv) $ export API_KEY="ajfsdla48fsdal49fj94jf93-f9dsal"
You can also set environment variables in Windows PowerShell. Then you can open a new Python REPL and instantiate AppConfig
:
>>> from settings_management import AppConfig
>>> AppConfig()
AppConfig(
database_host=Url('http://somedatabaseprovider.us-east-2.com/'),
database_user='username',
database_password='asdfjl348ghl@9fhsl4',
api_key='ajfsdla48fsdal49fj94jf93-f9dsal'
)
Notice how you don’t specify any field names when you instantiate AppConfig
. Instead, your BaseSettings
model reads fields from the environment variables you set. Also notice that you exported the environment variables in all capital letters, yet AppConfig
parsed and stored them successfully. This is because BaseSettings
is not case-sensitive when matching environment variables to field names.
Next, close your Python REPL and create invalid environment variables:
(venv) $ export DATABASE_HOST="somedatabaseprovider.us-east-2"
(venv) $ export DATABASE_USER="usee"
(venv) $ export DATABASE_PASSWORD="asdf"
(venv) $ export API_KEY="ajf"
Now open another Python REPL and reinstantiate AppConfig
:
>>> from settings_management import AppConfig
>>> AppConfig()
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError: 4 validation errors for
AppConfig
database_host
Input should be a valid URL, relative URL without a base
[type=url_parsing, input_value='somedatabaseprovider.us-east-2',
input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/url_parsing
database_user
String should have at least 5 characters [type=string_too_short,
input_value='usee', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/string_too_short
database_password
String should have at least 10 characters [type=string_too_short,
input_value='asdf', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/string_too_short
api_key
String should have at least 20 characters [type=string_too_short,
input_value='ajf', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/string_too_short
This time, when you try to instantiate AppConfig
, pydantic-settings
throws errors saying that the database_host
is not a valid URL, and the remaining fields don’t meet the minimum length constraint.
While this was a simplified configuration example, you can leverage BaseSettings
to parse and validate just about anything you need from your environment variables. Any validation you can do with BaseModel
you can also do with BaseSettings
, including custom validation with model and field validators.
Lastly, you’ll learn how to further customize the behavior of BaseSettings
with SettingsConfigDict
.
Customizing Settings With SettingsConfigDict
In the previous example, you saw a barebones example of how to create a BaseSettings
model that parses and validates environment variables. However, you might want to further customize the behavior of your BaseSettings
model, and you can accomplish this with SettingsConfigDict
.
Suppose you can’t manually export each of your environment variables, which is often the case, and you need to read them from a .env
file. You’d want to make sure that BaseSettings
is case-sensitive when parsing and that there are no additional environment variables in your .env
file aside from the ones you specify in your model. Here’s how you’d do that with SettingsConfigDict
:
settings_management.py
from pydantic import HttpUrl, Field
from pydantic_settings import BaseSettings, SettingsConfigDict
class AppConfig(BaseSettings):
model_config = SettingsConfigDict(
env_file=".env",
env_file_encoding="utf-8",
case_sensitive=True,
extra="forbid",
)
database_host: HttpUrl
database_user: str = Field(min_length=5)
database_password: str = Field(min_length=10)
api_key: str = Field(min_length=20)
This script is the same as in the previous example, except this time you’ve imported SettingsConfigDict
and initialized it within AppConfig
. Within your SettingsConfigDict
, you specify that environment variables should be read from a .env
file, case-sensitivity should be enforced, and that extra environment variables are forbidden in the .env
file.
Next, create a file named .env
in the same directory as settings_management.py
, and populate it with the following environment variables:
database_host=http://somedatabaseprovider.us-east-2.com/
database_user=username
database_password=asdfjfffffl348ghl@9fhsl4
api_key=ajfsdla48fsdal49fj94jf93-f9dsal
Now, you can open a Python REPL and initialize your AppConfig
model:
>>> from settings_management import AppConfig
>>> AppConfig()
AppConfig(
database_host=Url('http://somedatabaseprovider.us-east-2.com/'),
database_user='username',
database_password='asdfjfffffl348ghl@9fhsl4',
api_key='ajfsdla48fsdal49fj94jf93-f9dsal'
)
As you can see, AppConfig
successfully parsed and validated the environment variables in your .env
file.
Lastly, add some invalid variables to your .env
file:
DATABASE_HOST=http://somedatabaseprovider.us-east-2.com/
database_user=username
database_password=asdfjfffffl348ghl@9fhsl4
api_key=ajfsdla48fsdal49fj94jf93-f9dsal
extra_var=shouldntbehere
Here, you’ve changed database_host
to DATABASE_HOST
, violating the case-sensitivity constraint, and you’ve added extra environment variables that shouldn’t be there. Here’s how your model responds when it tries to validate this:
>>> from settings_management import AppConfig
>>> AppConfig()
Traceback (most recent call last):
pydantic_core._pydantic_core.ValidationError: 3 validation errors for
AppConfig
database_host
Field required [type=missing, input_value={'database_user':
'userna..._var': 'shouldntbehere'}, input_type=dict]
For further information visit
https://errors.pydantic.dev/2.6/v/missing
DATABASE_HOST
Extra inputs are not permitted [type=extra_forbidden,
input_value='http://somedatabaseprovider.us-east-2.com/', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/extra_forbidden
extra_var
Extra inputs are not permitted [type=extra_forbidden,
input_value='shouldntbehere', input_type=str]
For further information visit
https://errors.pydantic.dev/2.6/v/extra_forbidden
You get a nice list of errors saying that database_host
is missing and that you have extra environment variables in your .env
file. Notice that because of the case-sensitivity constraint, your model thinks that DATABASE_HOST
is an extra variable along with extra_var
.
There’s a lot more you can do with SettingsConfigDict
and BaseSettings
more generally, but these examples should give you an idea of how you can use pydantic-settings
to manage your environment variables for your own use case.
Conclusion
Pydantic is an easy-to-use, fast, and widely-trusted data validation library in Python. You’ve gotten a broad overview of Pydantic, and now you have the knowledge and resources necessary to start using Pydantic in your own projects.
In this tutorial, you’ve learned:
- What Pydantic is and why it’s been so widely adopted
- How to install Pydantic
- How to parse, validate, and serialize data schemas with
BaseModel
and validators - How to write custom validation logic for functions using @validate_call
- How to parse and validate environment variables with
pydantic-settings
Pydantic makes your code more robust and trustworthy, and it partially bridges the gap between Python’s ease of use and the built-in data validation of statically typed languages. For just about any data parsing, validation, and serialization use case you might have, Pydantic has an elegant solution.
Get Your Code: Click here to download the free sample code that you’ll use to help you learn how Pydantic can help you simplify data validation in Python.
Take the Quiz: Test your knowledge with our interactive “Pydantic: Simplifying Data Validation in Python” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Pydantic: Simplifying Data Validation in PythonIn this quiz, you'll test your understanding of Pydantic, a powerful data validation library for Python. You'll revisit concepts such as working with data schemas, writing custom validators, validating function arguments, and managing settings with pydantic-settings.
Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Using Pydantic to Simplify Python Data Validation