Structuring a Python Application (Overview)

Structuring a Python Application Christopher Trudeau 12:25

Python, though opinionated on syntax and style, is surprisingly flexible when it comes to structuring your applications. On the one hand, this flexibility is great: it allows different use cases to use structures that are necessary for those use cases. On the other hand, though, it can be very confusing to the new developer. The Internet isn’t a lot of help either. There are as many opinions as there are Python blogs!

In this course, you’ll:

Walk through a dependable Python application layout reference guide that you can refer to for most use cases
See examples of common Python application structures, including command-line applications (CLI apps), one-off scripts, installable packages, and web application layouts with popular frameworks like Flask and Django


00:00 Welcome to Python Application Layouts. My name is Chris and I will be your guide. This series of videos will show you how to organize your Python code and the files that go with it. Each lesson in the series will cover a different kind of configuration: simple one-off scripts, installable single packages, larger applications, a Django web application, and finally, a Flask application. First off, I’m going to assume that you have some familiarity with Python modules and packages. If you need a refresher, there’s a course on Real Python that can help you with the details.

00:32 So, let’s get started. Python is very flexible when it comes to how you structure your applications. You can more or less get away with almost anything you want.

00:39 That’s very powerful, but it can also be very daunting if you’re just getting going. Unfortunately, the internet is no help. If you start googling for how to do this, there’s lots and lots of opinions on how it should be done. That being said, the content in this video is based on a Real Python article from Kyle.

00:56 You can see the original article here. This means that it in itself is also opinionated. Now, I happen to agree with Kyle’s opinions and he’s outlining things that are best practices, but you will be able to find contradictory information out there.

01:10 The intent of this course is to give you good places to start. Once you’re comfortable, you can adjust things to your own style. Accompanying this video is a selection of sample code.

01:19 It’s actually executable. It doesn’t just show the file layout; it actually works. This means you can package it, run the tests, and see how things change from a small application to a large application. Hopefully, this is helpful as you’re following along.

01:35 I’ll start with the simplest situation: a program consisting of a single Python file. This kind of situation works if you don’t have any dependencies and everything’s just pure Python or if you’re using a virtual env with tools like pip or Pipenv to manage external dependencies. Of course, single-file programs often aren’t just single files. You’re writing automated tests, right?

01:57 You should be. Even if you’re not, there’s also things that go along with it—for example, configuration for your git setup, setup.py for organizing how your system is packaged if you are trying to do packaging, requirements.txt that specifies your dependencies, and licensing and README information is always useful if you’re trying to hand this script off to somebody else.

02:19 This course comes with sample code that actually executes. I’m going to give you a quick tour of what it does. It’s a little Hello World so that you can see as I build on it in the different lessons how the pieces fit together, and what changes as things become more complicated.

02:33 So, here it is: helloworld.py. Normally, a Hello World example is nothing but print("Hello, World!"). In order to show you how to interact with external libraries and how these pieces fit together, my Hello World is a little bit more complicated, but not by much.

02:48 All it’s doing in line 12 is using the very popular requests library. If you haven’t seen it before, requests wraps HTTP code.

02:57 It’s the easiest way of grabbing the contents of a web page. Line 8 shows the URL that I’m grabbing. The URL I’m grabbing is a Wikipedia article on Hello World programs—how very meta. And then on line 13, I’m printing it—finding the <title> tag inside of it and printing out just the result of that.

03:17 So, as I said, pretty simple—essentially, it just prints "Hello, World!" but uses the requests library to do it.

03:25 The other thing to notice here is I’ve embedded a version number.

03:29 There are multiple ways of specifying version numbers; this is my preferred mechanism because you can actually get at it programmatically. Later on, when this goes from a single file to multiple files, I’ll move this out of the script and into the package’s __init__.py file.

03:43 And here it is in action.

03:46 That’s the <title> tag from the Wikipedia page on Hello World programs, as expected.

03:54 So, this is some test code to go with my Hello World program. Interestingly enough, there’s actually more code in the tests than there is in the program itself.

04:01 That’s because I’m avoiding the call to the internet inside of requests. The meat of the test is lines 15 through 19. Line 18 calls the actual code. requests is mocked out so that the internet call isn’t actually made. I generally try to distinguish between my unit tests, which are self-contained and don’t require the dependencies, and my integration tests, which would include going off to the world.

04:28 This allows me to write much faster unit tests, and I’m not dependent on the environment I’m running them in. If you haven’t seen unit tests or mocks before, there’s plenty of good content on Real Python to help you out with that. For now, just understand that this is the code that tests it, and I’ll show you quickly how that works.

04:46 The Python unittest module is executable. On the command line, I can specify -m to execute unittest, passing in my tests file. Python’s unittest library will then execute the tests in my tests.py file.

05:01 Here it goes. You see the results of the program: "Hello, World!" program - Wikipedia, because do_hello() prints something to the screen, and then you see the output from the unittest module specifying that it ran one test successfully and everything’s okay.
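The idea of mocking out requests can be sketched as follows. This is not the course's tests.py: the stand-in `do_hello()`, the `HelloWorldTest` class name, and the canned HTML are hypothetical, and the sketch swaps a fake module into `sys.modules` with `mock.patch.dict` so that it stays self-contained, which may differ from how the course's tests patch requests. It also captures the printed output instead of letting it reach the screen.

```python
import contextlib
import io
import sys
import unittest
from unittest import mock


def do_hello():
    """Stand-in for helloworld.do_hello(): fetch a page and print its <title>."""
    import requests  # resolved at call time, so the test can swap it out

    html = requests.get("https://example.com/").text
    start = html.index("<title>") + len("<title>")
    print(html[start:html.index("</title>")])


class HelloWorldTest(unittest.TestCase):
    def test_do_hello_prints_title(self):
        # A fake requests module whose get() returns canned HTML
        fake_requests = mock.MagicMock()
        fake_requests.get.return_value.text = "<html><title>Hello</title></html>"

        output = io.StringIO()
        with mock.patch.dict(sys.modules, {"requests": fake_requests}):
            with contextlib.redirect_stdout(output):
                do_hello()

        # The fake was used, so no real network call was made
        fake_requests.get.assert_called_once()
        self.assertEqual(output.getvalue().strip(), "Hello")
```

Saved as tests.py, a file like this can be run the way the video describes, with `python -m unittest tests.py`.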

05:17 So, you’ve seen the code and the tests that go with it. Now, I want to show you the accompanying files that would probably also be in the directory, particularly if you’re going to package this up for other people. For starters, .gitignore. This is a configuration file for git to tell it to ignore certain file types.

05:32 It stops git from trying to check in things like backup files and swap files that you’re not interested in. Here’s a really, really simple example. If you’re writing code on a Mac, a .DS_Store file shows up everywhere.

05:44 It’s where meta information is stored. You probably don’t want that checked in. If you’re writing Python 3, __pycache__/ will hold the compiled Python files. Any Python c, Python o, or Python d files (that’s .pyc, .pyo, or .pyd) are compiled artifacts and don’t need to be in your repository.
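Put together, the entries just described would look something like the fragment below. It is a sketch limited to the patterns named in the video; the comments and grouping are illustrative.

```text
# macOS folder metadata
.DS_Store

# Compiled Python files
__pycache__/
*.pyc
*.pyo
*.pyd
```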

06:05 These are examples of things you probably don’t want to store inside of git. A complete version of this file is much longer than this. You can see sample .gitignore files off at GitHub itself.

06:15 It has good .gitignore files for almost every programming language you can think of. My Hello World program uses the requests library.

06:23 That means I need to be able to pip install it to a virtual env. The requirements.txt file shows this dependency as well as anything else that I might use when I’m developing. A sample file includes requests 2.22, which is the library I was using, as well as coverage and pudb—coverage being a program to help me see what my tests are covering and aren’t covering, and pudb is my favorite command-line debugger. requirements.txt generally stores everything you need to run your tests and develop.
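A requirements.txt along these lines might contain the three entries below. The video only says "requests 2.22", so the exact pin format shown here is an assumption, and leaving coverage and pudb unpinned is an illustrative choice.

```text
requests==2.22.0
coverage
pudb
```

Installing everything into the active virtual environment is then a matter of `pip install -r requirements.txt`.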

06:57 This is often more than what you would include in a shipped application. So, setup.py won’t have the coverage and pudb items in it—it’ll only have requests. Just the strict dependencies. Speaking of setup.py—what is it? It’s a way of specifying what is packaged in your application.

07:16 Unfortunately, Python isn’t really strong when it comes to packaging. Or an alternative way of looking at that is there’s lots and lots of choice; it isn’t really baked into the language and there’s tons of third-party ways of doing it. The Python community is trying to tackle this, but it’s changed a lot over the years and so you’re going to bump into all sorts of ways of doing things.

07:37 setup.py isn’t the only way, but it is one of the more common popular ways, and a lot of tools interact with it. For example, if you’ve heard of tox, it’s a test harness for testing your program under multiple versions of Python.

07:49 twine is a tool for uploading a package to PyPI so you can share it with other people. The packages that twine uploads are the result of setup.py being executed and creating bundles. For more information on setup.py and how to use it with Pipenv, go to the Real Python article on this topic.

08:07 Now, I’ll show you an example. This is the bottom chunk of a setup.py file. It’s kind of funny; not only were the tests twice as long as the code, now the configuration file is four times as long.

08:20 The heart of setup.py is down at the bottom here in these final four lines: calling the setup() function and passing in a dictionary that describes your package. I’ll move up a little bit now. Lines 16 through 38 are the dictionary specifying information. Really, this is just attributes about the program helloworld.

08:40 It includes things like name, the version, description, a longer description, where to find it, who the author is, the license you’ve picked, what kind of program it is, what modules it uses, and what it requires.

08:54 35 through 37 is important—this is what makes sure dependencies get installed when your package is installed by someone else. Remember how this is different from requirements.txt: requirements.txt includes everything that you would put in your development virtual environment—and that includes things like your debugging tools—whereas install_requires is only what someone using your program would need.

09:18 So, install_requires has only the requests library; requirements.txt has requests as well as things like coverage and pudb. I’m going to scroll up to the top now.

09:30 Lines 1 through 13 aren’t required for setup.py to work; as long as you fill in your dictionary args, you’re good to go. These 13 lines are things that I often do inside of my setup.py so that I’m reusing information from the outside world. For example, I have a regular expression looking for "__version__" inside of the program file. It reads it and puts it inside of the dictionary.

09:53 This means I don’t have to remember to update the version number in two places: the code and the setup.py file. Similarly, because I have a README file, I’m reading that in and putting it inside of the long_description. You can of course just set these things up inside of the dictionary—or in some cases, not even include them—but I wanted to show you what I often include inside of my own application configuration. Now, you’ve seen requirements.txt and setup.py.
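The pattern described here, reading the version and README from disk and handing a dictionary to `setup()`, can be sketched roughly as below. It is not the course's setup.py. The helper names `read_version()`, `build_setup_args()`, and `main()`, the README.md file name, and the description text are assumptions for illustration; the keyword names themselves are standard setuptools arguments.

```python
import re
from pathlib import Path

# Matches a line such as: __version__ = "1.0.0"
VERSION_PATTERN = re.compile(
    r"""^__version__\s*=\s*['"]([^'"]+)['"]""", re.MULTILINE
)


def read_version(source_file):
    """Return the __version__ string found in a Python source file."""
    text = Path(source_file).read_text(encoding="utf-8")
    match = VERSION_PATTERN.search(text)
    if match is None:
        raise RuntimeError(f"no __version__ found in {source_file}")
    return match.group(1)


def build_setup_args(project_dir="."):
    """Assemble the keyword arguments that describe the package."""
    project_dir = Path(project_dir)
    return {
        "name": "helloworld",
        # Single source of truth: the version lives in the code itself
        "version": read_version(project_dir / "helloworld.py"),
        "description": "Prints the title of a web page",  # placeholder text
        "long_description": (project_dir / "README.md").read_text(encoding="utf-8"),
        "long_description_content_type": "text/markdown",
        "license": "MIT",
        "py_modules": ["helloworld"],
        # Only what a user of the program needs; development tools such as
        # coverage and pudb stay in requirements.txt.
        "install_requires": ["requests"],
    }


def main():
    # Imported here so the helpers above can be exercised without setuptools
    from setuptools import setup

    setup(**build_setup_args(Path(__file__).parent))
```

In an actual setup.py the `setup()` call would run when the file is executed; it is wrapped in `main()` here only so the helper functions can be tried out on their own. Reading the version with a regular expression, rather than importing the module, avoids needing the module's own dependencies installed just to build the package.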

10:18 Just a couple of housekeeping files left. First off, the LICENSE. There’s a lot of misinformation out there about copyrights, particularly when it comes to software.

10:28 The thing you have to know is if you do not include a license file, the code is fully copyrighted by default, in most jurisdictions. What that means is no one has any right to use it or copy it in any fashion.

10:42 The open-source world would not exist if everyone did this. Licenses are what grant other people the ability to use your software, and you’re probably using an awful lot of open-source products right now, so in my view, it’s only fair to share some of what you’re using yourself.

10:59 There’s plenty of variations on licenses. You can pick which one you like. If you’re using GitHub, when you create a new repository it will even give you a dropdown to choose which license to include in your repository. That massive text there is the MIT License.

11:14 It’s one of the simpler licenses and it’s very, very popular in the Python world. This is the one I tend to go to myself unless I’m doing something that’s interacting with other projects that don’t allow it.

11:26 There’s lots of good information off at ChooseALicense.com if you want to see what the differences are and help you pick a license. Just remember: if you don’t include a license you’re not actually sharing your code, regardless of whatever your intent was.

11:40 Finally, a README file is also important. If you’re sharing your code, you need to be able to tell people what it is. This should be a short description of your project.

11:48 Two very common formats for this are Markdown and reStructuredText. If you’re using GitHub, README.md or README.rst will automatically show as the project’s homepage.

12:00 Writing these things sometimes can be a bit of a challenge. Us programmers—not so goodly in English. If you’re having a little trouble and you want to figure out how to do it, Dan’s written a great article on how to describe your project and how to tell others about it. So, that’s a simple single-file program.

12:16 Next up, I’ll make things a little more complicated and show you how to use a single package which has multiple files in it.

drewmullen on April 29, 2020

What do you gain from including the version inside your helloworld.py file? ive only been maintaining in my setup.py and it has seemed to work OK for me.

Denis Roy on April 30, 2020

Here are clickable links of the references made in the video:

Modules & packages refresher:

realpython.com/python-modules-packages

Article from Kyle Stratis:

realpython.com/python-application-layouts

Lots of .gitignore files:

github.com/github/gitignore

RealPython article on Pipenv:

realpython.com/pipenv-guide

Choose a license:

choosealicense.com

Hints on writing a good README:

dbader.org/blog/write-a-great-readme-for-your-github-project

blackray on May 2, 2020

hi drewmullen, i think Chris is trying to convey that if you have the code itself manage the version, and your setup.py file reads it from there, you don’t have to worry about aligning them. (09:55)

plus, he has the readme file pointing to the same variable. So when his version got updated, readme file get updated too.

Hope that helps.

andersgs on May 3, 2020

hi drewmullen, in addition to blackray’s comment, I would add that it is considered good practice to keep your version string with the code and import it into your setup.py because then you can access the version string programmatically (e.g., import pandas; print(pandas.__version__)).

That said, I am not sure you need to use the regex expression. I would just have added an import line:

from helloworld import __version__ as version

print(version)

drewmullen on May 4, 2020

great point @andersgs. now that you mention it, i have actually ran into problems with apps where the version isnt available. it can be helpful if you use the code downstream and want to say “if version < 0.x behave one way or another”
