Planning Your Package Structure
In this course, you’ll be working with the Real Python Reader app. It’s available from PyPI as a package or as source. But for this course, it’s best to grab the source code from GitHub:
Get Source Code: Click here to get access to the source code for the Real Python Feed Reader that you’ll use to publish an open-source package to PyPI.
You’ll also want to create an account on PyPI. To play around, you’ll need to get set up on TestPyPI too.
00:00 In the previous lesson, I covered a brief history of how the Python world got to its current packaging state, how that relates to virtual environments, and made bad movie music with my piehole. In this lesson, I’ll show you what is actually in a Python package and promise not to sing.
00:17 There’s an old Carl Sagan quote: To make an apple pie, first you need a universe. I won’t go that far out, but you are going to need a package to be able to play with it. In that light, may I present the Real Python Reader, which I’ll be using to demonstrate packaging and packaging tools throughout this course. The project is used in multiple Real Python courses and articles, so you may have seen versions of it before.
00:41 What it does is read Real Python’s RSS feed and shows a summary of the articles on the site. It’s available as a PyPI package in both installable and source format, but if you’re going to play with it, you’re probably better off grabbing the source from the GitHub repo.
00:58 The second link is a quick download of the contents in ZIP format. Clickable links are in the description below.
01:08 The Reader program connects to the Internet over HTTPS. This is the part in the course where I humbly admit I’m not that great at following instructions, and I’ve learned that I’m not alone in that case.
01:19 If you are on a Mac, and you used python.org to install your Python, you may have done the same thing as me, which is to miss the final, separate step of installation.
01:29 That would mean your Python is absent some important certificates and won’t be able to connect via HTTPS, and you can’t use this program. If you’re in the same situation as I am, I’ll walk you through it.
01:41 If you’re not, you can skip to this timestamp (2:58) and miss all the fun.
01:48 This is a screenshot of the Python installer on macOS. Note the print at the bottom here that says there’s a final step after the install needed to get the appropriate SSL certificates on your machine.
02:00 And this is the final page of the GUI installer. It also tells you where to find the script that installs the SSL certificates. Unfortunately, it’s after the big, bold Congratulations! which is about where I stopped reading.
02:13 What this page is talking about is a script inside of the Python folder in your applications directory. If you go off to Finder, you’ll see this, and that’s the script you need to run.
02:26 Double-click it, and it will update the certificates inside of your Python instance, and you’ll be good to go. Skip this step like I did, and you’ll get a crash or a warning error depending on what version of the reader you’ve downloaded. This isn’t specific to this course.
02:43 This is making sure you’ve got your certificates working in Python generally, so anytime you want to use SSL, you would need to do this step. Hopefully, you follow instructions better than I do. That out of the way, let’s go on to play with the reader.
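If you’re not sure whether your Python can see any certificate authorities, a quick check with the standard library’s ssl module can help. This is only a minimal sketch and not part of the reader: the describe_cert_setup() name is made up for illustration, and a False result merely hints at the missing-certificates situation described above. On Windows, Python reads certificates from the system store instead, so the result isn’t meaningful there.

```python
import ssl


def describe_cert_setup() -> dict:
    """Report where this Python looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        # Path to a CA bundle file, or None if that file doesn't exist.
        "cafile": paths.cafile,
        # Path to a directory of CA certificates, or None if it doesn't exist.
        "capath": paths.capath,
        # False suggests HTTPS verification may fail on this install.
        "has_ca_location": bool(paths.cafile or paths.capath),
    }


if __name__ == "__main__":
    for key, value in describe_cert_setup().items():
        print(f"{key}: {value}")
```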
02:59 For those of you who skipped ahead, wow, was that a party, and did you miss it! Anyway, I’m here in my terminal, and I’m going to use the -m argument to run the reader module.
03:11 This is what the reader is supposed to do without any additional command-line arguments. Who doesn’t like a good argument? reader shows a listing of Real Python articles from the site’s RSS feed.
03:23 I can take any of those numbers on the left-hand side and pass it in as an argument,
03:31 and the result is the article description or, in this case, the podcast description. Let’s look at the content of the package that I’ve just run.
03:44 tree is a third-party tool. It’s not installed by default. It just shows the directory listing in a pretty way. Here you can see the complete package listing.
03:53 The recommended way of storing your code in a package now is to use the src/ directory. I don’t actually like this personally. I’ve never once written a package that has more than a single module inside of it, so this just means an extra layer. But using src/ seems to be the way everyone’s doing it now, and if you don’t, you sometimes need extra configuration to get the tools working. So, editorializing aside, I just suck it up and have gone with the src/ directory so that I don’t have to fight things.
04:21 Inside that src/ directory is the actual code module, and because it is a module, it has an __init__.py file, and because I want to be able to run it using -m, it’s got a __main__.py file as well.
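To see how those two files cooperate with -m, here’s a small sketch that builds a throwaway src/ layout in a temporary directory and runs it. The demo_reader package name and its contents are hypothetical stand-ins, not the reader’s actual code, and the sketch points PYTHONPATH at src/ where a real project would be installed instead.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_and_run_demo() -> str:
    """Create a tiny src/-layout package and run it with `python -m`."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        package = src / "demo_reader"  # hypothetical package name
        package.mkdir(parents=True)

        # __init__.py marks the directory as an importable package.
        (package / "__init__.py").write_text('__version__ = "0.0.1"\n')

        # __main__.py is what `python -m demo_reader` executes.
        (package / "__main__.py").write_text(
            "from demo_reader import __version__\n"
            'print(f"demo_reader {__version__}")\n'
        )

        # Without an install step, Python has to be told where src/ is.
        env = dict(os.environ, PYTHONPATH=str(src))
        result = subprocess.run(
            [sys.executable, "-m", "demo_reader"],
            capture_output=True,
            text=True,
            env=env,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run_demo())
```

Needing PYTHONPATH here is a taste of the extra configuration the src/ layout can demand when the package isn’t installed.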
04:35 The rest of this is just the code that does the actual work and a config file that the reader uses. Okay, that’s the module. How about the packaging stuff?
04:44 Depending on what version of reader you’ve downloaded, you may or may not see a setup.py file. The original packaging mechanism used this script to do installation, and some situations still require it. When the reader was written, this file was needed to do an editable install. As of this recording, that’s not necessary anymore. If all that sounded like gibberish, don’t worry.
05:06 I’ll come back and explain it later. Short version, this file isn’t needed anymore and may or may not be in your package. As setup.py started to go out of fashion, it was replaced with setup.cfg. Again, depending on what version of reader you’ve got and where you got it, this file may not be there.
05:23 The intent behind setup.cfg was to replace setup.py in most situations, meaning you could do an install without requiring an executable script to run, which is much safer. And finally, this is where the course is going to be focused.
05:38 The most recent incarnation of the Python packaging world is pyproject.toml instead of, or in addition to, setup.cfg. This file describes the metainformation about the package, and all the modern packaging tools use it to know how to build and publish the package.
05:55 The rest of the stuff is more packaging info, like some documentation and the license. I’ll cover the rest of these files in detail later in the course.
06:08 Okay, so that’s what a package looks like. By default, when you pip install something, it’s going to try to fetch that something from PyPI. If you want to publish something yourself, that means you need a PyPI account.
06:20 You can get one by going to the registration URL and filling in the account creation form. You will also have to promise I’m not a robot, but I’m okay if you are. I don’t judge. With an account in place, you’re going to need to pick a name for your package.
06:36 A package name and module don’t have to match, but quite frankly, it’s better if they do. It’s less confusing to your users. The problem is package names have to be unique across PyPI, so you may find it challenging to come up with a name that hasn’t been used before.
06:52 The common solution to this is to apply a prefix to the module name, like sticking realpython in front of reader, for example.
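When you’re hunting for a unique name, it helps to know that package indexes compare names in a normalized form: case is ignored, and runs of hyphens, underscores, and periods all count as the same separator. The sketch below applies that normalization rule. The normalize_name() helper is an illustrative name of my own, not something the reader or PyPI ships.

```python
import re


def normalize_name(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into one '-' and lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    # All three spellings refer to the same project name.
    for candidate in ("realpython-reader", "RealPython_Reader", "realpython.reader"):
        print(candidate, "->", normalize_name(candidate))
```

So swapping a hyphen for an underscore or changing the capitalization won’t get you a new name. You need a genuinely different one.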
07:01 PyPI has a sister site called TestPyPI, which runs the same code but isn’t used to install packages. This means you can publish and play around in there to your heart’s content without corrupting the main listing. TestPyPI needs a separate account, but the registration process is the same.
07:19 When you’re using the various publishing tools, you can specify on the command line which repo you’re publishing to, allowing you to create as many versions as you like in TestPyPI and then publishing to the real place when you’re ready.
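As a rough illustration of choosing the repo on the command line, here’s a sketch that assembles an upload command for twine, one common publishing tool. Treat the choice of twine as an assumption, since this lesson hasn’t named a tool yet. The upload_command() helper and the dist/ folder name are hypothetical, and the function only builds the argument list without running anything.

```python
import sys
from pathlib import Path


def upload_command(repository: str, dist_dir: str = "dist") -> list[str]:
    """Build (but don't run) a twine upload command for every file in dist_dir."""
    files = sorted(str(path) for path in Path(dist_dir).glob("*"))
    return [
        sys.executable, "-m", "twine", "upload",
        # "testpypi" targets the practice index, "pypi" the real one.
        "--repository", repository,
        *files,
    ]


if __name__ == "__main__":
    print(" ".join(upload_command("testpypi")))
```

Switching from the practice site to the real place is then just a matter of passing "pypi" instead of "testpypi".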
07:31 realpython-reader is actually on PyPI, so the name is taken, which means if you want to follow along completely through to publishing, you’re going to need to go to TestPyPI, and you’re also going to need to pick a different package name. When you get there, I’ll warn you just before that step.
07:51 You’ve seen the development directory of the reader package, but what actually gets uploaded? Well, it’s based on that src/ directory, and it typically contains two things: a wheel and a source distribution.
08:03 The wheel is named after a cheese wheel, heaven’s perfect gift. Yep. Those folks at the package authority picked a theme, and they stuck with it. No worries, it’s all gouda. Hey, I promised no singing.
08:17 I didn’t promise to avoid bad cheese puns. If you’ve been around Python for a bit, you might remember the egg file that came out of setuptools. That’s been replaced by the wheel.
08:26 The wheel format includes the ability to package precompiled platform-specific binaries. The details of that are beyond the scope of this course, but the short version is the wheel is actually a superior package.
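You can see the platform-specific part right in a wheel’s filename, which follows the pattern name-version-python tag-ABI tag-platform tag, with an optional build tag after the version. Here’s a minimal sketch that splits a filename into those pieces. The parse_wheel_filename() helper and both example filenames are illustrative assumptions.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return WheelTags(*parts)


if __name__ == "__main__":
    # A pure-Python wheel: any platform, no ABI requirement.
    print(parse_wheel_filename("realpython_reader-1.0.0-py3-none-any.whl"))
    # A wheel carrying compiled code for one interpreter and platform.
    print(parse_wheel_filename("example_pkg-2.1-cp312-cp312-manylinux_2_17_x86_64.whl"))
```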
08:38 Wheel and source distribution packages get created by a build system. These days, that’s mostly done through setuptools, but even that is abstracted.
08:46 You can also build using other build systems. In fact, one of the bits of metadata you put in pyproject.toml is just what build system you want to use. Most of this course concentrates on setuptools, but the lesson before the last will briefly introduce you to a couple of others.
09:05 I mentioned the setup.py file when I was showing you the contents of the package. It’s the older way of installing a package and would be run after the package was downloaded. It’s actually a Python script, which means installing a package requires running code. This is problematic.
09:22 What if you want to scan the package before running it? Say, for safety. This is why the Python packaging world is moving away from setup.py.
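To make the scanning problem concrete, here’s a sketch that tries to read metadata out of a setup.py without executing it, using the standard library’s ast module. The SETUP_PY text and the static_setup_kwargs() helper are invented for illustration. Notice that anything computed at run time can’t be recovered this way, which is exactly the weakness of keeping metadata in a script.

```python
import ast

# A made-up setup.py, held as text so that nothing in it gets executed.
SETUP_PY = '''
from setuptools import setup

setup(
    name="example-package",
    version="0.1.0",
    install_requires=["feedparser"],
    description=open("README.md").read(),
)
'''


def static_setup_kwargs(source: str) -> dict:
    """Collect literal keyword arguments passed to setup() without running it."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "setup":
            found = {}
            for keyword in node.keywords:
                if keyword.arg is None:  # skip **kwargs expansions
                    continue
                try:
                    found[keyword.arg] = ast.literal_eval(keyword.value)
                except ValueError:
                    # Not a plain literal, so its value needs real execution.
                    found[keyword.arg] = "<computed at run time>"
            return found
    return {}


if __name__ == "__main__":
    for key, value in static_setup_kwargs(SETUP_PY).items():
        print(f"{key}: {value}")
```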
09:31 But it is still found in some situations and is needed for more complex packages, especially those that have C extensions and a compile step. Way back in the overview, I got persnickety about the version of pip you’d use.
09:46 The ability to do an editable package install with only a pyproject.toml file is a relatively recent addition. Not that long ago, you needed a setup.py shim to do this.
09:58 Use the latest version of pip, and it isn’t a problem.
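If you’d like to check rather than guess, here’s a small sketch that compares a pip version string against a minimum. The 21.3 threshold is my assumption about when pip gained editable installs driven purely by pyproject.toml, and your build backend has to support them as well, so treat the result as a hint. The helper name is made up.

```python
import re

# Assumption: pip 21.3 is the first release supporting pyproject.toml-only
# editable installs. The build backend must support them too.
MIN_PIP = (21, 3)


def pip_supports_pyproject_editable(version: str) -> bool:
    """Return True if the given pip version string meets the assumed minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized pip version: {version!r}")
    return (int(match[1]), int(match[2])) >= MIN_PIP


if __name__ == "__main__":
    for version in ("20.0.2", "21.3", "24.0"):
        print(version, pip_supports_pyproject_editable(version))
```

You can find your own version by running python -m pip --version in the terminal.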
10:02 A big part of the original setup.py script was a dictionary containing all of the package’s metadata information. The setup.cfg file is a text-based version of that same dictionary content, and the pyproject.toml file is a TOML equivalent.
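Because the setuptools configuration file, setup.cfg, is plain INI-style text, any tool can read it without running code. The sketch below pulls a few fields out with the standard library’s configparser. The field values are invented for illustration and aren’t the reader’s real metadata.

```python
import configparser

# An illustrative setup.cfg, not the reader's actual file.
SETUP_CFG = """
[metadata]
name = example-package
version = 0.1.0
description = An illustrative package

[options]
install_requires =
    feedparser
    html2text
"""


def read_setup_cfg(text: str) -> dict:
    """Return the metadata as a dictionary, like the one setup() used to receive."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    metadata = dict(parser["metadata"])
    # Multi-line values arrive as one string, so split them into a list.
    metadata["install_requires"] = parser["options"]["install_requires"].split()
    return metadata


if __name__ == "__main__":
    print(read_setup_cfg(SETUP_CFG))
```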
10:19 Most of the build and test tools out there are now using the pyproject.toml file. If you’re not doing anything too complex, you will be able to stick with just that.
10:28 Have I mentioned that this stuff is complicated? Yep. Almost makes me think you need another cheese pun to relieve the tension. No, I won’t do that. It’d be Muensterous.
10:41 And this is what you’ve been waiting for: the pyproject.toml file for the reader package. Square brackets ([]) like this in TOML denote a section.
10:50 The build-system section indicates what build system to use. The requires parameter inside the section specifies the version of setuptools I’ll be using, and the build-backend parameter gives the name of the build system back end.
11:04 The project section has the metadata for the package, starting with what the package is named and then the current version number. This is a short description of what the package does, and the readme attribute points at the README file inside of the package.
11:19 The authors dictionary contains the name and email address of the package authors, and the license attribute also points to a file, this time the "LICENSE" file. Note the use of a dictionary here.
11:30 There are a couple of different formats for this attribute. This style is the one that points to the file in the package.
11:38 Next is the array of classifiers. Classifiers are descriptors for the package. A complete list of them is available at the packaging authority site. The classifiers here state that reader is MIT licensed and based on Python 3. Some kinds of classifiers indicate the purpose of the package as well.
11:57 Things like web framework give your users an idea of what is there. These identifiers must be from the standard list. It’s done this way so the content can be understood and categorized by machines.
12:10 The keywords attribute is kind of like the classifiers, but this is free-form. You put whatever words here make sense, and these become terms that are included in the search index for your package.
12:22 The dependencies array is used to tell the packaging system what other packages your package depends on. I’ll dive deeper on this later. For now, just get that it’s a list of things reader needs and can include version information about those dependencies as well. When you pip install realpython-reader, pip will also install the dependencies: the feedparser, html2text, and tomli libraries.
12:46 And this last bit that I’m going to talk about is the requires-python attribute that indicates a minimum version of Python needed for the package.
12:54 All of this is actually only about half the file. I’ll cover the rest a little later. The bits here that I’ve shown you are more than enough to get you publishing though.
13:07 Once you’ve published your package, all that metadata shows up on the package’s homepage on PyPI. Here’s the full name, the description (this entire chunk is from the README file),
13:21 the author info, the license,
13:25 the Python version requirement, and the classifiers. A little bit of configuration information in your package goes a long way. PyPI takes all this information and displays it for your users and also uses it to make your package show up in the site searches.
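The same metadata travels with the package once it’s installed, so you can peek at it locally too. Here’s a hedged sketch using the standard library’s importlib.metadata. The installed_summary() helper is a made-up name, and it returns None when the distribution isn’t installed in your environment.

```python
from importlib import metadata


def installed_summary(dist_name: str):
    """Return a few metadata fields for an installed distribution, or None."""
    try:
        info = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "name": info.get("Name"),
        "version": info.get("Version"),
        "summary": info.get("Summary"),
        "requires_python": info.get("Requires-Python"),
        # Classifier appears once per entry, so collect every occurrence.
        "classifiers": info.get_all("Classifier") or [],
    }


if __name__ == "__main__":
    # Prints None unless realpython-reader is installed in this environment.
    print(installed_summary("realpython-reader"))
```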
13:45 Now that you’ve got a package, you can publish it. I’ll cover that next.