For a deeper dive into directory listing, check out How to Get a List of All Files in a Directory With Python.
Getting a Directory Listing
00:11 And this is really important because in pretty much any task where you’re working with files, in Python or any other programming language, you’re going to want to look through all of the files in your current directory and do something with them.
00:52 It will be more of a question of whether you prefer a more traditional, Bash-style syntax for working with your filesystem, where you work with bare strings and things like that, or whether you prefer to work with a more object-oriented library.
In the minds of most people that I know, a more object-oriented framework like the pathlib module is better Python practice, but if you are more comfortable with the os module then, by all means, use it.
So I’ll show you both and I’ll try to elaborate on the distinctions between these two. With all that said, the first function here is os.listdir(), which just takes in a directory name and then returns a list of the names of all the files and subdirectories in that directory, as strings.
The slightly more sophisticated alternative is os.scandir(), which returns an iterator instead of a list. That iterator is not of strings, but rather of DirEntry objects that have some properties to them, which make them a little easier to work with when you actually want to treat them as files.
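To make that difference concrete, here is a minimal sketch that calls both functions on a throwaway directory. The names "sub_dir_a" and "notes.txt" are hypothetical and only exist for this illustration.

```python
import os
import tempfile

# Build a temporary directory so the example does not depend on real files.
# "sub_dir_a" and "notes.txt" are made-up names for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub_dir_a"))
    with open(os.path.join(tmp, "notes.txt"), "w") as f:
        f.write("hello")

    # os.listdir() returns a plain list of names (strings).
    names = os.listdir(tmp)

    # os.scandir() returns an iterator of os.DirEntry objects.
    # Using it as a context manager closes the iterator when done.
    with os.scandir(tmp) as scan_iterator:
        entries = list(scan_iterator)

listing_types = {type(name).__name__ for name in names}
entry_types = {type(entry).__name__ for entry in entries}

print(sorted(names))
print(listing_types, entry_types)
```

The order of names from either function is not guaranteed, which is why the listing is sorted before printing.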
And then the pathlib module has a very similar paradigm to scandir(), except it works on a Path object, and I’ll show you how to initialize a Path object and work with it over in the Python REPL in just a second.
02:03 My sample directory for this lesson is relatively simple. It just has a couple of subdirectories and several files in it, and each subdirectory has a couple of different files, too, just so that you can see the differences between the directory listing outputs for all of these directories.
Okay, let’s get into the code a little bit here. The first thing to do is get some imports going: import os, and then I’m also going to say from pathlib import Path, just so I have all my imports up here in one place.
So the first thing that you can do, if you want to get a directory listing, is you can just say os.listdir(). And then, as you can see from my handy REPL, it takes in an optional path argument. If you don’t pass in a path it will just use the current directory. And as you can see, if you remember from my slide with the sample directory information, it has the subdirectories, including 'sub_dir_b', and then three different files, all of different types.
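As a rough sketch of that default behavior, the snippet below checks that calling os.listdir() with no argument gives the same names as passing "." explicitly. It switches into a temporary directory first, and the names "sub_dir_x" and "report.txt" are hypothetical, not the lesson's actual sample files.

```python
import os
import tempfile


def names_in_cwd():
    """Return the sorted names in the current working directory."""
    # With no argument, os.listdir() lists the current directory.
    return sorted(os.listdir())


original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical contents, created only for this demonstration.
    os.mkdir(os.path.join(tmp, "sub_dir_x"))
    open(os.path.join(tmp, "report.txt"), "w").close()

    os.chdir(tmp)
    try:
        default_listing = names_in_cwd()
        explicit_listing = sorted(os.listdir("."))
    finally:
        # Step back out before the temporary directory is deleted.
        os.chdir(original_cwd)

print(default_listing)
print(default_listing == explicit_listing)
```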
You can say something like, maybe, os.path.isfile() and then you could call that on one of those names, or whatever, but you have to do all of this circumlocution to figure out some information about this file.
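Here is one way that extra work might look in practice: a small helper that takes the strings from os.listdir() and asks os.path about each one. The helper name split_files_and_dirs and the sample names are assumptions made for this sketch.

```python
import os
import tempfile


def split_files_and_dirs(directory):
    """Separate the names in `directory` into files and subdirectories."""
    files, dirs = [], []
    for name in os.listdir(directory):
        # listdir() gives bare names, so join them back onto the directory
        # before asking the filesystem anything about them.
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):
            files.append(name)
        elif os.path.isdir(full_path):
            dirs.append(name)
    return sorted(files), sorted(dirs)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample contents.
    os.mkdir(os.path.join(tmp, "sub_dir_x"))
    open(os.path.join(tmp, "script.py"), "w").close()
    open(os.path.join(tmp, "table.csv"), "w").close()

    found_files, found_dirs = split_files_and_dirs(tmp)

print(found_files)
print(found_dirs)
```

Forgetting the os.path.join() step is a common slip: os.path.isfile("script.py") checks the current working directory, not the directory that was listed.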
DirEntry objects have this information within them, whereas with the strings you have to do a little bit more work as a programmer and learn the necessary commands to actually get this information about these things.
So, something to notice there is that os.scandir() is a little bit more object-oriented, whereas os.listdir() works a little bit more like a traditional Linux filesystem, where pretty much everything is a string, and you just have to do the work on your own to figure out the information about it.
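As a sketch of that object-oriented flavor, the helper below reads the name, kind, and size straight off each DirEntry. The function name describe_entries and the sample files are hypothetical.

```python
import os
import tempfile


def describe_entries(directory):
    """Return sorted (name, kind, size) tuples for each entry in `directory`."""
    rows = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Each DirEntry can answer questions about itself directly.
            if entry.is_file():
                rows.append((entry.name, "file", entry.stat().st_size))
            elif entry.is_dir():
                rows.append((entry.name, "dir", None))
            else:
                rows.append((entry.name, "other", None))
    return sorted(rows)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample contents: one subdirectory and one 3-byte file.
    os.mkdir(os.path.join(tmp, "sub_dir_y"))
    with open(os.path.join(tmp, "data.csv"), "w") as f:
        f.write("abc")

    rows = describe_entries(tmp)

for name, kind, size in rows:
    print(name, kind, size)
```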
pathlib takes a similar philosophy, except you have to create the Path as an explicit object. So dir_path = Path(), and then I’ll just pass in the current directory, "./", and then you can do exactly the same stuff as you can do with os.scandir(), it’s just that you call iterdir() directly on the Path object.
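A minimal sketch of the pathlib version might look like the following. It points the Path at a temporary directory rather than "./" so it can run anywhere, and the helper name list_with_pathlib and the sample names are assumptions.

```python
import tempfile
from pathlib import Path


def list_with_pathlib(directory):
    """Return sorted (name, is_file) pairs for each entry in `directory`."""
    dir_path = Path(directory)
    # iterdir() yields Path objects, which know their own name and type.
    return sorted((entry.name, entry.is_file()) for entry in dir_path.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample contents.
    tmp_path = Path(tmp)
    (tmp_path / "sub_dir_z").mkdir()
    (tmp_path / "readme.txt").write_text("hello")

    pairs = list_with_pathlib(tmp)

for name, is_file in pairs:
    print(name, "file" if is_file else "directory")
```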
os.listdir() is really simple and easy, whereas os.scandir() and pathlib.Path() have a little bit more nuance, and you can do a few more things with them, but you don’t necessarily have the same kind of ease that you do with os.listdir().
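To put the three side by side, here is a hedged sketch that picks out only the regular files in a directory with each approach. The function names and sample files are hypothetical, and all three should agree on the result.

```python
import os
import tempfile
from pathlib import Path


def files_via_listdir(directory):
    # Strings only: each name has to be joined and checked by hand.
    return sorted(
        name
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


def files_via_scandir(directory):
    # DirEntry objects: each entry can report whether it is a file.
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())


def files_via_pathlib(directory):
    # Path objects: iterdir() is called on the Path itself.
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample contents.
    os.mkdir(os.path.join(tmp, "sub_dir_q"))
    open(os.path.join(tmp, "a.txt"), "w").close()
    open(os.path.join(tmp, "b.py"), "w").close()

    results = [
        files_via_listdir(tmp),
        files_via_scandir(tmp),
        files_via_pathlib(tmp),
    ]

print(results[0])
print(results[0] == results[1] == results[2])
```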