Searching for Files Using Glob Methods

00:00 In this lesson, you’ll see how you can search for specific files in a directory. Say you don’t want to iterate over all the contents of a directory like you did before using .iterdir(). Then you can use a method called .glob().

00:14 And this method returns an iterable of Path objects again, like .iterdir() does, but it only returns those that match a certain pattern.

00:23 Let’s take a look at this in IDLE.

00:26 So once again, here you have access to your notes directory, and you can iterate over the iterable that you receive when calling the .glob() method on it.

00:37 Let’s take a look at how that works. for path in notes_dir.glob(): and you need to pass it a pattern in here. So, this is not like .iterdir(), where you can just call it without any arguments.

00:50 Let’s give it a try. What happens if you call it without a pattern? You can see that .glob() is missing one required positional argument called 'pattern'.
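Here is a minimal sketch of that error outside of IDLE. It assumes notes_dir is a Path pointing to a folder called notes, which is a placeholder name and doesn't need to exist for the error to appear. The exact wording of the message can differ slightly between Python versions.

```python
from pathlib import Path

# "notes" is a placeholder path; the call fails before the file system is touched.
notes_dir = Path("notes")

try:
    notes_dir.glob()  # No pattern passed, so Python raises a TypeError.
except TypeError as error:
    message = str(error)
    print(message)
```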

00:59 So you need to pass in a pattern as an argument to .glob() in order to use this method. for path in notes_dir.glob(). So what is a pattern? You can think of it like a filter.

01:13 So, you want to only search for specific files or folders, and you search for them according to a specific pattern. So I can use a wildcard character, and we’ll start off with using the * (star).

01:26 And this * wildcard character can stand in for any sequence of other characters. So if I say "*.md"

01:35 and I need to pass this pattern as a string, then .glob() is going to filter the contents of the directory for only files that end with .md, so only the ones that have the file extension .md are going to be shown. Now, if you remember the lesson before, then there should be one of those files in there, and the two folders that are also contained inside notes_dir should be filtered out. So if you now say print(path),

02:02 then you can see that it only displays README.md. And to confirm that these are again Path objects, I’ll just pass the iterator to list() here

02:14 so that you can also see that you can do things with the objects that get returned from calling this method.

02:24 So, as you can see, if I pass the iterator that gets returned from .glob() to the list() function, you’ll see that it contains one item that is a Path object that points to README.md.
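If you want to try this without the course materials, the following sketch rebuilds a similar notes directory in a temporary location. The two folder names are made up for illustration. Only README.md and the fact that there are two folders come from the lesson.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Recreate a notes directory with one Markdown file and two folders.
    notes_dir = Path(tmp) / "notes"
    notes_dir.mkdir()
    (notes_dir / "README.md").touch()
    (notes_dir / "folder_one").mkdir()  # Hypothetical folder name
    (notes_dir / "folder_two").mkdir()  # Hypothetical folder name

    # Loop over only the entries whose names match the pattern.
    for path in notes_dir.glob("*.md"):
        print(path)

    # Passing the iterator to list() shows the Path objects themselves.
    matches = list(notes_dir.glob("*.md"))
    names = [match.name for match in matches]
    print(matches)
```

The printed paths start with the temporary directory’s location, so they will look different on your machine, but the only match should be the one pointing to README.md.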

02:37 Now, you might wonder why this method is called .glob(). It sounds a little like dot-blob, and it’s somewhat of a weird name.

02:45 And this has historical reasons because there was a program in Unix that did something similar, and if you want to know more, then it’s just one Internet search away.

02:56 But to recap this first view on .glob(), it returns an iterable of Path objects that match a pattern. And then you got to know the * character, which is a wildcard character that matches any number of other characters.

03:09 So anything that comes before .md in the file name gets matched. And if you say, for example, "*.md", then it will match all files ending with .md inside of whatever directory you’re searching in.
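To make the recap concrete, here is a small sketch with a few made-up file names in a temporary directory. It suggests that * can stand for a short or a long run of characters, that the name has to end with .md to match, and that "*.md" only looks directly inside the directory you call .glob() on.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # All of these file names are invented for this example.
    for name in ["README.md", "a.md", "notes.txt", "todo.md.bak"]:
        (root / name).touch()

    # A Markdown file one level down, inside a subfolder.
    (root / "sub").mkdir()
    (root / "sub" / "nested.md").touch()

    # .glob() makes no promise about order, so sort the names for display.
    md_names = sorted(path.name for path in root.glob("*.md"))
    print(md_names)  # ['README.md', 'a.md']
```

Neither notes.txt nor todo.md.bak ends with .md, and nested.md sits in a subfolder, so none of them show up.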

03:22 Now, there are other wildcard characters, and let’s look at those in the next lesson.
