GeoPandas Basics: Maps, Projections, and Spatial Joins

GeoPandas extends pandas to make working with geospatial data in Python intuitive and powerful. If you’re looking to do geospatial tasks in Python and want a library with a pandas-like API, then GeoPandas is an excellent choice. This tutorial shows you how to accomplish four common geospatial tasks: reading in data, mapping it, applying a projection, and doing a spatial join.

By the end of this tutorial, you’ll understand that:

  • GeoPandas extends pandas with support for spatial data. This data typically lives in a geometry column and supports spatial operations such as projections and spatial joins. Folium, by contrast, focuses on richer interactive web maps once the data has been prepared.
  • You inspect CRS with .crs and reproject data using .to_crs() with an authority code like EPSG:4326 or ESRI:54009.
  • A geographic CRS stores longitude and latitude in degrees, while a projected CRS uses linear units like meters or feet for area and distance calculations.
  • Spatial joins use .sjoin() with predicates like "within" or "intersects", and both inputs must share the same CRS or the relationships will be computed incorrectly.
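
The CRS and spatial join points above can be sketched with a few in-memory geometries. This is a minimal illustration rather than the tutorial's own example: the zones, points, and column names are hypothetical, and EPSG:32618 (UTM zone 18N, in meters) is just one reasonable projected CRS for the New York area.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Two hypothetical zones defined in longitude/latitude (a geographic CRS).
zones = gpd.GeoDataFrame(
    {"zone": ["west", "east"]},
    geometry=[box(-74.1, 40.6, -74.0, 40.8), box(-74.0, 40.6, -73.9, 40.8)],
    crs="EPSG:4326",
)

# Three hypothetical points in the same CRS; point "c" lies outside both zones.
points = gpd.GeoDataFrame(
    {"name": ["a", "b", "c"]},
    geometry=[Point(-74.05, 40.7), Point(-73.95, 40.7), Point(-73.5, 40.7)],
    crs="EPSG:4326",
)

# Inspect the CRS: EPSG:4326 stores coordinates in degrees.
print(zones.crs)
print(zones.crs.is_geographic)

# Reproject to a projected CRS with meter units before measuring area.
zones_utm = zones.to_crs("EPSG:32618")
area_km2 = zones_utm.area / 1e6
print(area_km2)

# Check that both inputs share a CRS, then join each point to its zone.
assert points.crs == zones.crs
joined = points.sjoin(zones, predicate="within", how="left")
print(joined[["name", "zone"]])
```

With how="left", every point is kept, and a point that falls in no zone gets a missing value in the zone column.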

Here’s how GeoPandas compares with alternative libraries:

Use Case                     Pick pandas   Pick Folium   Pick GeoPandas
Tabular data analysis        Yes           -             Yes
Mapping                      -             Yes           Yes
Projections, spatial joins   -             -             Yes

GeoPandas builds on pandas by adding support for geospatial data and operations like projections and spatial joins. It also includes tools for creating maps. Folium complements this by focusing on interactive, web-based maps that you can customize more deeply.

Getting Started With GeoPandas

You’ll first prepare your environment and load a small dataset that you’ll use throughout the tutorial. In the next two subsections, you’ll install the necessary packages and read in a sample dataset of New York City borough boundaries. This gives you a concrete GeoDataFrame to explore as you learn the core concepts.

Installing GeoPandas

This tutorial uses two packages: geopandas for working with geographic data and geodatasets for loading sample data. It’s a good idea to install these packages inside a virtual environment so your project stays isolated from the rest of your system and you can manage its dependencies cleanly.

Once your virtual environment is active, you can install both packages with pip:

Shell
$ python -m pip install "geopandas[all]" geodatasets

Using the [all] option ensures you have everything needed for reading data, transforming coordinate systems, and creating plots. For most readers, this will work out of the box.
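
If you want to confirm what ended up in your environment, one option is to check whether the relevant packages are importable. The sketch below uses only the standard library. The package names are assumptions chosen for illustration, not the official contents of the [all] extra, so adjust the list to what you actually rely on.

```python
from importlib.util import find_spec

# Package names assumed for illustration; edit this list to suit your project.
packages = ["geopandas", "geodatasets", "pyproj", "shapely", "matplotlib", "folium"]

# find_spec() returns None when a top-level package can't be located.
status = {name: find_spec(name) is not None for name in packages}

for name, installed in status.items():
    print(f"{name:12} {'ok' if installed else 'missing'}")
```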

If you do run into installation issues, the project’s maintainers provide alternative installation options on the official installation page.

Reading in Data

Geospatial data is commonly distributed in GeoJSON or shapefile format. GeoPandas’ read_file() function can read both, and it accepts either a local file path or a URL.

In the example below, you’ll use read_file() to load the New York City Borough Boundaries (NYBB) dataset. The geodatasets package provides a convenient path to this dataset, so you don’t need to download anything manually. You’ll also drop unnecessary columns:

Python
>>> import geopandas as gpd
>>> import matplotlib.pyplot as plt
>>> from geodatasets import get_path
>>> path_to_data = get_path("nybb")
>>> nybb = gpd.read_file(path_to_data)
>>> nybb = nybb[["BoroName", "Shape_Area", "geometry"]]
>>> nybb
    BoroName        Shape_Area      geometry
0   Staten Island   1.623820e+09    MULTIPOLYGON (((970217.022 145643.332, ....
1   Queens          3.045213e+09    MULTIPOLYGON (((1029606.077 156073.814, ...
2   Brooklyn        1.937479e+09    MULTIPOLYGON (((1021176.479 151374.797, ...
3   Manhattan       6.364715e+08    MULTIPOLYGON (((981219.056 188655.316, ....
4   Bronx           1.186925e+09    MULTIPOLYGON (((1012821.806 229228.265, ...
>>> type(nybb)
<class 'geopandas.geodataframe.GeoDataFrame'>
>>> type(nybb["geometry"])
<class 'geopandas.geoseries.GeoSeries'>

nybb is a GeoDataFrame. A GeoDataFrame has rows, columns, and all the methods of a pandas DataFrame. The difference is that it typically includes a special geometry column, which stores geographic shapes instead of plain numbers or text.

The geometry column is a GeoSeries. It behaves like a normal pandas Series, but its values are spatial objects that you can map and run spatial queries against. In the nybb dataset, each borough’s geometry is a MultiPolygon, which is a shape made of several polygons, because each borough includes more than one separate landmass, such as offshore islands. Soon you’ll use these geometries to make maps and run spatial operations, such as finding which borough a point falls inside.
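
The relationship between a GeoDataFrame, its GeoSeries, and MultiPolygon shapes can be sketched with made-up data. The regions below are hypothetical squares rather than real boundaries, and no CRS is set, so the areas are in arbitrary units.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Point, box

# Hypothetical regions: one has a single part, the other has two separate parts.
mainland = MultiPolygon([box(0, 0, 2, 2)])
islands = MultiPolygon([box(5, 0, 6, 1), box(7, 0, 8, 1)])

regions = gpd.GeoDataFrame(
    {"name": ["mainland", "islands"]},
    geometry=[mainland, islands],
)

# Ordinary pandas operations still work on a GeoDataFrame.
regions["area"] = regions.geometry.area
largest = regions.sort_values("area", ascending=False).iloc[0]
print(largest["name"])

# The geometry column is a GeoSeries with spatial attributes.
print(type(regions.geometry).__name__)
print(regions.geom_type.tolist())
parts_per_region = [len(geom.geoms) for geom in regions.geometry]
print(parts_per_region)

# Find which region contains a given point.
probe = Point(7.5, 0.5)
match = regions[regions.contains(probe)]
print(match["name"].tolist())
```

The same pattern of a boolean mask from a spatial predicate followed by ordinary pandas indexing carries over to real boundary data.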

Mapping Data

Once you’ve loaded a GeoDataFrame, one of the quickest ways to understand your data is to visualize it. In this section, you’ll learn how to create both static and interactive maps. This allows you to inspect shapes, spot patterns, and confirm that your geometries look the way you expect.

Creating Static Maps

As explained in the guide to plotting with pandas, pandas DataFrames have a .plot() method for creating quick visualizations like scatter plots or bar charts. The GeoDataFrame class overrides .plot() so that, when your data contains geometry, the result is a map instead of a chart.
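
As a rough sketch of that behavior, the example below calls .plot() on a small GeoDataFrame of hypothetical squares instead of the NYBB data, so it runs offline. The column names, title, and output file name are made up for illustration, and the Agg backend is selected only so the script works without a display.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Hypothetical squares standing in for real boundaries.
squares = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"], "value": [1, 2, 3]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

fig, ax = plt.subplots(figsize=(6, 3))

# Passing column= colors each shape by that column's values.
squares.plot(ax=ax, column="value", legend=True, edgecolor="black")
ax.set_title("Squares colored by value")
ax.set_axis_off()

# Save the map to a temporary location.
output_path = Path(tempfile.gettempdir()) / "squares.png"
fig.savefig(output_path)
```

In an interactive session you'd typically call plt.show() instead of saving the figure to a file.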

About Ari Lamstein

Ari is an avid Pythonista and Real Python contributor.


