Getting a Directory Listing
For a deeper dive into directory listing, check out How to Get a List of All Files in a Directory With Python.
00:00 In this lesson, I’m going to cover how to get a directory listing: simply put, a list of all of the things that exist in your current directory or any directory you might choose.
00:11 And this is really important because pretty much any task that works with files, in Python or any other programming language, is going to involve looking through all of the files in a directory and doing something with them.
00:25 So this pattern comes up often when you work with files, which is why it's important to cover.
00:32 Here are some of the basic directory listing functions. Throughout this series, I'm often going to show you a couple of different ways of doing each task.
00:43 Often that will be a division between the os module and the pathlib module. There really won't be too much of a difference between how these work.
00:52 It will be more of a question of whether you prefer a more traditional, kind of, Bash-style syntax for working with your filesystem, whether you prefer to work more with bare strings and things like that, or whether you prefer to work with a more object-oriented library.
01:07 In the minds of most people that I know, a more object-oriented framework like the pathlib module is better Python practice, but if you are more comfortable with the os module then, by all means, use it.
01:18 So I'll show you both, and I'll try to elaborate on the distinctions between the two. With all that said, the first function here is os.listdir(), which just takes in a directory name and then returns a list of the names of all the files and subdirectories in that directory, as strings.
01:33 The slightly more sophisticated alternative is os.scandir(), which returns an iterator instead of a list. That iterator is not of strings, but rather of DirEntry objects that have some attributes and methods on them, which make them a little easier to work with when you actually want to treat them as files.
01:50 And then the pathlib module has a very similar paradigm to scandir(), except it works on a Path object, and I'll show you how to initialize a Path object and work with it over in the Python REPL in just a second.
02:03 My sample directory for this lesson is relatively simple. It just has a couple of subdirectories and several files in it, and each subdirectory has a couple of different files, too, just so that you can see the differences between the directory listing outputs for all of these directories.
02:20 Okay, let's get into the code a little bit here. The first thing to do is get some imports going: import os, and then I'm also going to say from pathlib import Path, just so I have all my imports up here in one place.
02:33 So the first thing that you can do, if you want to get a directory listing, is you can just say os.listdir(). And then, as you can see from my handy REPL, it takes in an optional path parameter.
02:44 If you don't pass in a path it will just use the current directory. And as you can see, if you remember from my slide with the sample directory information, it has 'sub_dir', 'sub_dir_b', and then three different files, all of different types.
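Here's a minimal sketch of that call, assuming a throwaway directory shaped like the sample one. The three file names (notes.txt, data.csv, script.py) are hypothetical stand-ins, since the lesson doesn't spell them out, and the listing is sorted only because os.listdir() returns names in arbitrary order.

```python
import os
import tempfile


def make_sample_dir(root):
    """Build a hypothetical layout: two subdirectories and three files."""
    os.mkdir(os.path.join(root, "sub_dir"))
    os.mkdir(os.path.join(root, "sub_dir_b"))
    # These file names are made up for illustration.
    for name in ("notes.txt", "data.csv", "script.py"):
        with open(os.path.join(root, name), "w") as f:
            f.write("")


with tempfile.TemporaryDirectory() as root:
    make_sample_dir(root)
    entries = os.listdir(root)  # a list of plain strings, in arbitrary order
    listing = sorted(entries)   # sorted only to make the output predictable
    print(listing)
```

Calling os.listdir() with no argument does the same thing for the current working directory.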
02:58 But as you can see here, all of these are just bare strings, and so this is a list of strings. So this isn't super flexible when you want to actually get information about these entries.
03:08 You can say something like, maybe, os.path.isfile() and then you could call that on a file, or whatever, but you have to do all of this circumlocution to figure out some information about this file.
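To make that extra work concrete, here's a rough sketch that keeps only the files from an os.listdir() result. The directory and file names are hypothetical. The thing to notice is that os.listdir() gives back bare names, so each one has to be joined onto the directory path before os.path.isfile() can check it reliably.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical contents: one subdirectory and one file.
    os.mkdir(os.path.join(root, "sub_dir"))
    open(os.path.join(root, "notes.txt"), "w").close()

    files_only = []
    for name in os.listdir(root):
        # name is just a string like "notes.txt", not a full path,
        # so build the full path before asking the filesystem about it.
        full_path = os.path.join(root, name)
        if os.path.isfile(full_path):
            files_only.append(name)

    print(files_only)
```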
03:23 This isn't the case with something like scandir(). You can say something like for name in os.scandir(), and then that will also take in an optional path parameter.
03:36 You can say if name.is_file(). I'm not sure exactly why os.path.isfile() has no underscore and is_file() does have an underscore. I wish I could tell you more about why that is, but who knows?
03:49 You can just get that information directly from the object because, as you can see, these are DirEntry objects rather than just plain strings.
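As a small sketch of the same idea with os.scandir(), again on a hypothetical throwaway directory: each item is an os.DirEntry, so the name and the file-or-directory check come straight off the object. Using scandir() in a with block is one reasonable way to close the iterator promptly, though a plain for loop works as well.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical contents: one subdirectory and one file.
    os.mkdir(os.path.join(root, "sub_dir"))
    open(os.path.join(root, "notes.txt"), "w").close()

    with os.scandir(root) as entries:
        # Each entry is an os.DirEntry with .name, .path, .is_file(), .is_dir().
        scanned = sorted(
            (entry.name, entry.is_file(), entry.is_dir()) for entry in entries
        )

    print(scanned)
```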
03:59 So the DirEntry objects have information within them, whereas with the strings you have to do a little bit more work as a programmer to learn the necessary commands to actually get this information about these things.
04:12 So, that's something to notice there, is that os.scandir() is a little bit more object-oriented, whereas os.listdir() works a little bit more like a traditional Linux filesystem, where pretty much everything is a string, and you just have to do the work on your own to figure out the information about it.
04:29 So, pathlib takes a similar philosophy, except you have to create the Path as an explicit object. So dir_path = Path(), and then I'll just pass in the current directory, "./", and then you can do exactly the same stuff as you can do with os.scandir(), it's just that you call iterdir() directly on the Path.
04:51 Then it's exactly the same thing here: if name.is_file(): print(name).
04:56 As you can see, this name prints a little bit differently, but this name is still an object, which has a little bit more information than just a bare string.
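Here's a short sketch of the pathlib version. It assumes a temporary directory in place of "./" so it can run anywhere, and the file names are hypothetical. Each item from iterdir() is itself a Path, so is_file() and name are available directly on it.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as root:
    dir_path = Path(root)  # stands in for Path("./") from the lesson

    # Hypothetical contents: one subdirectory and two files.
    (dir_path / "sub_dir").mkdir()
    (dir_path / "notes.txt").touch()
    (dir_path / "script.py").touch()

    # iterdir() yields Path objects; keep only the regular files.
    file_names = sorted(p.name for p in dir_path.iterdir() if p.is_file())
    print(file_names)
```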
05:06 So, that's relatively useful, and you can use all of these on any path that you like, not just on the current directory. So I could say os.listdir("./sub_dir"),
05:19 and I could get the two Python files that are in the subdirectory. I could say for name in os.scandir("sub_dir_b") and I could do the same thing, and just get the one file in sub_dir_b/.
05:34 Then you could do the same thing with Path as well, you'll just have to create a Path object for that subdirectory, or whatever other directory you want.
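One way that might look, as a sketch: build a Path for the subdirectory with the / operator and call iterdir() on it. The two Python file names here are hypothetical placeholders for the ones in sub_dir.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create a Path for the subdirectory by joining onto the parent Path.
    sub_dir = Path(root) / "sub_dir"
    sub_dir.mkdir()

    # Hypothetical file names standing in for the two Python files.
    for name in ("first.py", "second.py"):
        (sub_dir / name).touch()

    sub_listing = sorted(p.name for p in sub_dir.iterdir())
    print(sub_listing)
```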
05:42 So, that's three different ways to get a directory listing with the os module and the pathlib module. They all have their advantages and disadvantages.
05:51 os.listdir() is really simple and easy, os.scandir() and pathlib.Path() have a little bit more nuance, and you can do a few more things with them, but you don't necessarily have the same kind of ease that you do with os.listdir().
06:06 So, all things to consider. In the next lesson, I’m going to cover how to actually get attributes of these file objects.
Charlie3 on Sept. 9, 2020
Video > transcript > links is an awesome idea and concept. I love it.
B S K Karthik on Aug. 8, 2020
Thank you. Nice content. Can you please let me know what theme is used in the VS Code editor?