Get All Matches as Match Objects
00:00
In this lesson, you’ll take a look at how you can find all occurrences of a substring using regular expressions, and also preserve a lot of information about them by yielding Match objects. To do this, you can use re.finditer(), and then again pass it your pattern and the string that you’re searching.
00:21
So the pattern that you used before was a capturing group around "secret", and you wanted to look for the two occurrences of "secret" that are followed by either a dot or a comma. Okay, that’s your pattern, and then you want to search inside of text_lower.
00:44
So this actually collects all of them in an iterator. If you run this, then you can see that you get back a callable iterator, which doesn’t help a lot on its own if you don’t loop over it or save it anywhere. So let’s go ahead and do that.
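As a rough sketch of that call: the transcript doesn't show the actual contents of text_lower or the exact pattern, so both the sample text and the pattern below are assumptions made for illustration.

```python
import re

# Assumed sample text; the lesson's real text_lower isn't shown in the transcript.
text_lower = "a secret secret. the secret, told secretly."

# Assumed pattern: a capturing group around "secret", followed by a dot or a comma.
pattern = r"(secret)[.,]"

matches = re.finditer(pattern, text_lower)
print(matches)  # Shows an iterator object, not the matches themselves

# Consuming the iterator, here with list(), produces the Match objects.
match_list = list(matches)
print(len(match_list))  # 2
```

Note that the iterator is used up after one pass, so if you need the matches more than once, it can help to store them in a list like this.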
00:59
I’m going to say for match in, and then this iterator.
01:07
And then let’s just start by printing out match.
01:13
Okay. You can see, again, you’re getting these useful Match objects that hold a ton of information, like, for example, where the substrings start and where they end.
01:23
And then also the actual match that you got. So that’s pretty neat. And again, you can use methods on these Match objects, such as .group() or .span(), that you used before to work more with the information about the substrings that you identified. I’m going to give you an example.
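The loop described so far could look something like this. It's a minimal sketch that assumes the same sample text and pattern as stand-ins for the ones used in the lesson.

```python
import re

# Assumed sample text and pattern, since the transcript doesn't show them.
text_lower = "a secret secret. the secret, told secretly."
pattern = r"(secret)[.,]"

for match in re.finditer(pattern, text_lower):
    # Each item is a Match object whose repr shows the span and the matched text.
    print(match)
```

With this sample text, the loop prints two Match objects, one for "secret." and one for "secret,". The bare "secret" followed by a space and the one inside "secretly" are skipped because no dot or comma follows them directly.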
01:41
Instead of just printing the match, let’s say .group(1), and that just means you want to get the first capturing group, which in this case is "secret". You don’t have to pass 1 in here, though.
01:54
You could just use .group() without an argument, like you did in a previous lesson. If you don’t pass an argument, then you get the same result as when calling .group() and passing zero.
02:03
It gives you back the complete match string. In this case, that would be "secret." with the dot and "secret," with the comma. But if you’ve used capturing groups in a regular expression pattern, then you can also access these capturing groups by index, and here you’re fetching the first group.
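To make the difference between these calls concrete, here's a small sketch. The sample text and pattern are again assumptions, not the lesson's actual values.

```python
import re

# Assumed sample text and pattern, since the transcript doesn't show them.
text_lower = "a secret secret. the secret, told secretly."
pattern = r"(secret)[.,]"

# Take just the first match to compare the different .group() calls.
first = next(re.finditer(pattern, text_lower))

print(first.group())   # Whole match, punctuation included: secret.
print(first.group(0))  # Same as .group() without an argument: secret.
print(first.group(1))  # Only the first capturing group: secret
```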
02:22
And let’s also print match.span() to repeat the same methods that you tried out before. And you’ll see... or actually, let’s put them on one line.
02:44
So you’ll have the information of the captured "secret" and also its location, and like before, you could also create two capturing groups.
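Printing the captured group and its location on one line might look like the sketch below, using the same assumed sample text and pattern. One detail worth keeping in mind is that .span() without an argument covers the whole match, including the punctuation, while .span(1) covers only the first capturing group.

```python
import re

# Assumed sample text and pattern, since the transcript doesn't show them.
text_lower = "a secret secret. the secret, told secretly."
pattern = r"(secret)[.,]"

results = []
for match in re.finditer(pattern, text_lower):
    # span() is the location of the full match; span(1) is the location of group 1.
    print(match.group(1), match.span(), match.span(1))
    results.append((match.group(1), match.span(), match.span(1)))
```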
02:56
And then you can say that match group 1, a "secret", is at this location. To mix it up a bit, I’m going to put match group 2 over here, and then you’ll see that it also prints out which punctuation mark followed it.
03:10
So this was the "secret" with the dot after it, and the "secret" at that location is the one with the comma after it. And of course, yeah, you can do much more with those Match objects because they preserve a lot of information about the substring that you matched in the string.
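A version with two capturing groups could be sketched as follows. The two-group pattern and the sample text are assumptions, since the transcript doesn't show the exact code typed in the video.

```python
import re

# Assumed sample text; the lesson's real text_lower isn't shown in the transcript.
text_lower = "a secret secret. the secret, told secretly."

# Assumed pattern: group 1 captures "secret", group 2 captures the dot or comma.
pattern_two_groups = r"(secret)([.,])"

report = []
for match in re.finditer(pattern_two_groups, text_lower):
    line = f"{match.group(1)!r} followed by {match.group(2)!r} at {match.span()}"
    print(line)
    report.append((match.group(1), match.group(2), match.span()))
```

Capturing the punctuation in its own group is what lets each line say whether it was the "secret" with the dot or the one with the comma.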