Episode 224: Narwhals: Expanding DataFrame Compatibility Between Libraries
The Real Python Podcast
Oct 18, 2024 1h
How does a Python tool support all types of DataFrames and their various features? Could a lightweight library be used to add compatibility for newer formats like Polars or PyArrow? This week on the show, we speak with Marco Gorelli about his project, Narwhals.
Narwhals is a project aimed at library maintainers rather than end users. We discuss how the added compatibility benefits users by supporting modern features like lazy evaluation. We cover several projects Marco has been working with to implement Narwhals, including Altair, scikit-lego, and Ibis.
We also discuss how Marco started contributing to open-source projects. Marco has contributed to both pandas and Polars, which helps explain his interest in growing compatibility between libraries. He also offers advice on making your first contribution.
This episode is sponsored by CodeRabbit.
Course Spotlight: Differences Between Python’s Mutable and Immutable Types
In this video course, you’ll learn how Python’s mutable and immutable data types work internally and how you can take advantage of mutability or immutability to power your code.
Topics:
- 00:00:00 – Introduction
- 00:02:02 – EuroSciPy 2024 and sprints
- 00:04:04 – How did you get involved in open source?
- 00:07:18 – Finding a good issue to get started
- 00:09:25 – Discord and open-source projects
- 00:11:12 – How would you describe Narwhals?
- 00:16:47 – Working on Polars
- 00:19:17 – Apache Arrow and a data interchange protocol
- 00:22:55 – Sponsor: CodeRabbit
- 00:23:55 – Digging into eager vs lazy
- 00:27:04 – Ibis DataFrame library
- 00:28:57 – What do libraries need from Narwhals?
- 00:34:57 – The scikit-lego library
- 00:37:15 – Video Course Spotlight
- 00:38:45 – Other libraries interested in Narwhals
- 00:41:56 – Compatibility policy
- 00:45:18 – What should an end user expect?
- 00:46:32 – Have other projects attempted this?
- 00:47:54 – Keeping the project light and pure Python
- 00:49:32 – Contributors and how to get involved
- 00:54:42 – What are you excited about in the world of Python?
- 00:57:18 – What do you want to learn next?
- 00:59:05 – How can people follow your work online?
- 00:59:27 – Thanks and goodbye
Show Links:
- Narwhals
- EuroSciPy
- narwhals: Lightweight and Extensible Compatibility Layer Between DataFrame Libraries! - GitHub
- DataFrame Interoperability - What’s Been Achieved, and What Comes Next? - PyCon Lithuania - YouTube
- How Narwhals Has Many End Users … That Never Use It Directly - YouTube
- Polars Has a New Lightweight Plotting Backend - Altair
- pandas - Python Data Analysis Library
- Polars — DataFrames for the new era
- great-tables - PyPI
- Episode #214: Build Captivating Display Tables in Python With Great Tables
- Ibis
- Episode #201: Decoupling Systems to Get Closer to the Data
- Great Tables is Now BYODF (Bring Your Own DataFrame)
- How Narwhals and scikit-lego Came Together to Achieve DataFrame-Agnosticism
- Explore Using Narwhals in Plotly Express · Issue #4749 - GitHub
- Fairlearn
- Perfect Backwards Compatibility Policy - Narwhals
- uv: Unified Python packaging
- pixi - Powerful Development Environments
- Narwhals - Discord
- marcogorelli (@marcogorelli@fosstodon.org) - Fosstodon
- Marco Gorelli - Quansight - LinkedIn