How to Drop Null Values in pandas

Missing values can derail your analysis. In pandas, you can use the .dropna() method to remove rows or columns containing null values (in other words, missing data) so you can work with clean DataFrames. In this tutorial, you’ll learn how this method’s parameters give you fine-grained control over exactly which data gets removed.

Dealing with null values is essential for keeping datasets clean and avoiding the issues they can cause. Missing entries can lead to misinterpreted column data types, inaccurate conclusions, and errors in calculations. Simply put, nulls can cause havoc if they find their way into your calculations.
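As a minimal sketch of the problems described above, the example below uses two small, made-up Series (not the tutorial’s sales data) to show how a single missing entry changes a column’s data type and how it can affect a calculation. It assumes pandas’ default NumPy-backed dtypes:

```python
import numpy as np
import pandas as pd

# A complete integer column is stored as int64.
complete = pd.Series([10, 20, 30])

# The same column with one missing entry is upcast to float64,
# because the missing value is stored as NaN, a floating-point value.
with_null = pd.Series([10, None, 30])

print(complete.dtype)   # int64
print(with_null.dtype)  # float64

# A plain NumPy sum propagates NaN, so the result is NaN.
total_numpy = np.sum(with_null.to_numpy())

# pandas skips nulls by default, so the missing entry is silently ignored.
total_pandas = with_null.sum()

print(total_numpy)   # nan
print(total_pandas)  # 40.0
```

Neither result is wrong as such, but both can mislead you if you don’t know the null is there.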

By the end of this tutorial, you’ll understand that:

  • You can use .dropna() to remove rows and columns from a pandas DataFrame.
  • You can remove rows and columns based on the content of a subset of your DataFrame.
  • You can remove rows and columns based on the volume of null values within your DataFrame.

To get the most out of this tutorial, it’s recommended that you already have a basic understanding of how to create pandas DataFrames from files.

You’ll use the Python REPL along with a file named sales_data_with_missing_values.csv, which contains several null values you’ll deal with during the exercises. Before you start, extract this file from the tutorial’s downloadable materials.

The sales_data_with_missing_values.csv file is based on a publicly available and complete sales data file from Kaggle. Understanding the file’s content isn’t essential for this tutorial, but you can explore the original dataset on Kaggle for more details if you’d like.

You’ll also need to install both the pandas and PyArrow libraries to make sure all code examples work in your environment:

Windows PowerShell
PS> python -m pip install pandas pyarrow
Shell
$ python -m pip install pandas pyarrow

It’s time to refine your pandas skills by learning how to handle missing data in a variety of ways.

You’ll find all code examples and the sales_data_with_missing_values.csv file in the downloadable materials for this tutorial.

How to Drop Rows Containing Null Values in pandas

Before you start dropping rows, it’s helpful to know what options .dropna() gives you. This method supports six parameters that let you control exactly what’s removed:

  • axis: Specifies whether to remove rows or columns containing null values.
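  • thresh and how: Define how many null values a row or column must contain before it’s removed. how accepts "any" or "all", while thresh sets the minimum number of non-null values a row or column needs in order to be kept.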
  • subset: Limits the removal of null values to specific parts of your DataFrame.
  • inplace: Determines whether the operation modifies the original DataFrame or returns a new copy.
  • ignore_index: Relabels the remaining rows with a fresh index starting at 0, instead of keeping their original index labels.
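To make these parameters more concrete, here’s a minimal sketch that applies most of them to a small, hypothetical DataFrame rather than the tutorial’s sales file. The column names and values are invented for illustration, and the ignore_index parameter assumes pandas 2.0 or later:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the tutorial's sales file.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, np.nan],
        "b": [np.nan, np.nan, 6.0, 8.0],
        "c": [7.0, np.nan, 9.0, 10.0],
    }
)

# Defaults (axis=0, how="any"): drop every row with at least one null.
any_null = df.dropna()

# how="all": drop only rows in which every value is null.
all_null = df.dropna(how="all")

# thresh=2: keep only rows with at least two non-null values.
at_least_two = df.dropna(thresh=2)

# subset: only look for nulls in the listed columns.
a_present = df.dropna(subset=["a"])

# axis=1 with thresh=3: keep only columns with at least three non-null values.
mostly_complete = df.dropna(axis=1, thresh=3)

# ignore_index=True: relabel the surviving rows 0, 1, 2.
renumbered = df.dropna(how="all", ignore_index=True)

print(list(any_null.index))         # [2]
print(list(all_null.index))         # [0, 2, 3]
print(list(a_present.index))        # [0, 2]
print(list(mostly_complete.columns))  # ['c']
print(list(renumbered.index))       # [0, 1, 2]
```

Each call returns a new DataFrame and leaves df unchanged, which is the default behavior when inplace isn’t set. Note that how and thresh are used in separate calls here, since recent pandas versions don’t allow you to pass both at once.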

Don’t worry if any of these parameters don’t make sense to you just yet—you’ll learn why each is used during this tutorial. You’ll also get the chance to practice your skills.

Before using .dropna() to drop rows, you should first find out whether your data contains any null values:

Python
>>> import pandas as pd

>>> pd.set_option("display.max_columns", None)

>>> sales_data = pd.read_csv(
...     "sales_data_with_missing_values.csv",
...     parse_dates=["order_date"],
...     date_format="%d/%m/%Y",
... ).convert_dtypes(dtype_backend="pyarrow")

>>> sales_data
    order_number           order_date       customer_name  \
0           <NA>  2025-02-09 00:00:00      Skipton Fealty
1          70041                 <NA>  Carmine Priestnall
2          70042  2025-02-09 00:00:00                <NA>
3          70043  2025-02-10 00:00:00     Lanni D'Ambrogi
4          70044  2025-02-10 00:00:00         Tann Angear
5          70045  2025-02-10 00:00:00      Skipton Fealty
6          70046  2025-02-11 00:00:00             Far Pow
7          70047  2025-02-11 00:00:00          Hill Group
8          70048  2025-02-11 00:00:00         Devlin Nock
9           <NA>                 <NA>                <NA>
10         70049  2025-02-12 00:00:00           Swift Inc

                product_purchased discount  sale_price
0    Chili Extra Virgin Olive Oil     True       135.0
1                            <NA>     <NA>       150.0
2       Rosemary Olive Oil Candle    False        78.0
3                            <NA>     True        19.5
4    Vanilla and Olive Oil Candle     <NA>       13.98
5    Basil Extra Virgin Olive Oil     True        <NA>
6    Chili Extra Virgin Olive Oil    False       150.0
7    Chili Extra Virgin Olive Oil     True       135.0
8   Lavender and Olive Oil Lotion    False       39.96
9                            <NA>     <NA>        <NA>
10  Garlic Extra Virgin Olive Oil     True       936.0

To make sure all columns appear on your screen, you call pd.set_option("display.max_columns", None). Passing None as the second argument removes the limit on the number of columns displayed.

You read the sales_data_with_missing_values.csv file into a DataFrame using the pandas read_csv() function, then view the data. The order dates are in the "%d/%m/%Y" format in the file, so to make sure the order_date data is read correctly, you use both the parse_dates and date_format parameters. The output reveals there are eleven rows and six columns of data in your file, with several <NA> entries marking the missing values.
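Reading the printed output works for a small file, but you can also count the nulls directly. The sketch below is one possible way to do this with .isna(). It uses a hypothetical three-row CSV held in memory instead of the tutorial’s file, skips the PyArrow conversion to stay self-contained, and assumes pandas 2.0 or later for the date_format parameter:

```python
import io

import pandas as pd

# Hypothetical sample using the same day/month/year date style.
csv_text = io.StringIO(
    "order_number,order_date,sale_price\n"
    "70041,,150.0\n"
    "70042,09/02/2025,78.0\n"
    ",10/02/2025,\n"
)

sample = pd.read_csv(
    csv_text,
    parse_dates=["order_date"],
    date_format="%d/%m/%Y",
)

# Count the nulls in each column.
nulls_per_column = sample.isna().sum()

# Count the rows that contain at least one null.
rows_with_nulls = int(sample.isna().any(axis=1).sum())

print(nulls_per_column)
print(rows_with_nulls)  # 2
```

Here, each column contains one null, and two of the three rows are affected. The date 09/02/2025 is read as February 9, 2025, because date_format tells pandas that the day comes first.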

About Ian Eyre

Ian is an avid Pythonista and Real Python contributor who loves to learn and teach others.
