Accessing Groups

00:00 In the previous lesson, I showed you the re module of Python, the module that exposes the power of regular expressions inside of Python code.

00:09 In this lesson, I’ll show you how to get at the groups defined in your regex. You’ve already seen some of the attributes and methods on the re.Match object. There are others, though, that deal with grouping. The .group() method takes one or more group numbers as arguments.

00:26 Given a single number, it returns that group as a string. Given several, it returns a tuple with one matched group for each number you give it. So if you call .group(1, 3), it’ll return the first and third groups in a tuple. .groups(), plural, returns all of the matched groups.

00:42 You’ve seen the .start() and .end() methods showing the beginning and end of matches in the string. The .expand() method takes a string template and substitutes any backreferences inside of the template with their actual results.

00:57 This is kind of like the idea of an f-string, but with the backreferences from your regex. Finally, there’s also the .span() method, which you’ve seen before as well, which returns the start and end values as a tuple.

01:12 I’ll start out by importing the library. Let’s do a search with a group match.

01:21 This regex is looking for a group consisting of one or more word characters (the \w meta-character), followed by a comma, followed by another group with one or more word characters.

01:33 The .groups() function shows the matching groups. The first group is the word 'one', and the second group is the word 'two'.
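A minimal sketch of the search described above. The transcript doesn't show the code on screen, so the pattern `(\w+),(\w+)` and the input string `"one,two,three"` are assumptions reconstructed from the narration:

```python
import re

# Assumed pattern and input, reconstructed from the narration.
text = "one,two,three"
match = re.search(r"(\w+),(\w+)", text)

# .groups() returns every captured group as a tuple of strings.
all_groups = match.groups()
print(all_groups)  # ('one', 'two')
```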

01:46 You can see an individual group by passing a number to the .group() function.

01:53 You can also pass in multiple.

01:58 If you pass in more than one, you get back a tuple in the order that you passed them in. You may notice that the groups are numbered starting from one. This is to be consistent with the concept of a backreference.
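To illustrate the single-number and multiple-number forms of .group(), here is a short sketch. It reuses the same assumed pattern and input string as the narration suggests, which are not confirmed by the transcript:

```python
import re

# Assumed pattern and input, reconstructed from the narration.
match = re.search(r"(\w+),(\w+)", "one,two,three")

first = match.group(1)       # one number: returns a string
both = match.group(1, 2)     # several numbers: returns a tuple
swapped = match.group(2, 1)  # the tuple follows the argument order

print(first)    # one
print(both)     # ('one', 'two')
print(swapped)  # ('two', 'one')
```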

02:11 .group(0) is the entire match. Because this regular expression is looking for some number of word characters, a comma, and then some number of word characters, the 'one' and the 'two' match, but the 'three' does not, because the pattern describes only a single pair and .search() returns only the first match.
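A small sketch of what .group(0) holds, again assuming the pattern and input string inferred from the narration. The slice at the end shows the part of the string that the match never consumed:

```python
import re

# Assumed pattern and input, reconstructed from the narration.
text = "one,two,three"
match = re.search(r"(\w+),(\w+)", text)

whole = match.group(0)           # the full text matched by the pattern
leftover = text[match.end():]    # everything after the match

print(whole)     # one,two
print(leftover)  # ,three
```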

02:26 You can also use indexes on the Match object to get the same content. Once again, index 1 corresponds to the backreference \1, and 0 is the entire match, just like .group(0).
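The indexing form can be sketched like this. Subscripting a Match object has been available since Python 3.6; the pattern and input string are the same assumptions as before:

```python
import re

# Assumed pattern and input, reconstructed from the narration.
match = re.search(r"(\w+),(\w+)", "one,two,three")

# match[n] is equivalent to match.group(n).
entire = match[0]
group_one = match[1]
group_two = match[2]

print(entire)     # one,two
print(group_one)  # one
print(group_two)  # two
```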

02:41 .expand() takes a template string and expands the backreferences in it. It takes the results of the match and substitutes each backreference in the template, such as \1 or \2, with the text of the corresponding group.

02:56 This is kind of like an f-string for your regular expression matches. You’ve seen the .start(), .end(), and .span() functions before, but for completeness’s sake, here they are again.
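Here is a hedged sketch of .expand(). The template text is made up for illustration, and the pattern and input string are assumed from the narration. Both the `\1` and the `\g<1>` backreference styles are accepted in the template:

```python
import re

# Assumed pattern and input, reconstructed from the narration.
match = re.search(r"(\w+),(\w+)", "one,two,three")

# Illustrative templates: backreferences are replaced by group text.
sentence = match.expand(r"\2 comes after \1")
dashed = match.expand(r"\g<2>-\g<1>")

print(sentence)  # two comes after one
print(dashed)    # two-one
```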

03:11 Here’s a plain-text .search(), and here’s the resulting Match: a starting value of 4, an ending value of 7, and a span of 4 to 7. Just as a reminder, these are zero-indexed and the end value is exclusive, the same as a slice of a list or a string.
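As a sketch of those positions, searching for the literal text `two` in `"one,two,three"` happens to produce the 4 and 7 mentioned in the narration. The transcript doesn't show the actual search, so both strings are assumptions:

```python
import re

# Assumed input and search text; they reproduce the 4..7 span
# mentioned in the narration but are not confirmed by it.
text = "one,two,three"
match = re.search("two", text)

start, end = match.start(), match.end()
span = match.span()

print(start, end)       # 4 7
print(span)             # (4, 7)
print(text[start:end])  # two
```

Slicing the original string with the start and end values returns the matched text, which is why the end value is exclusive.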

03:35 Next up, I’ll show you how to name groups so that you don’t have to just use numbers.

raulfz on Jan. 28, 2021

Thank you for this tutorial,

I would like to know which method would return both matches, that is, (one, two) and (two, three). I mean recursive matching.

I haven’t read the docs yet, but is there a flag or meta-character in Python to switch between greedy and non-greedy matches?

Thank you.

Bartosz Zaczyński RP Team on Jan. 29, 2021

@raulfz Apparently, the built-in re module in Python standard library doesn’t support recursive matching. You can try out a third-party module such as regex, for example.

By default, quantifiers are greedy, but you can make them lazy by appending the question mark (?) meta-character.
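A minimal sketch of the greedy versus lazy difference, using a made-up input string for illustration:

```python
import re

# Illustrative input, not taken from the lesson.
text = "<a><b>"

greedy = re.search(r"<.*>", text).group(0)  # .* grabs as much as it can
lazy = re.search(r"<.*?>", text).group(0)   # .*? stops at the first '>'

print(greedy)  # <a><b>
print(lazy)    # <a>
```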

raulfz on Jan. 29, 2021

Thank you @Bartosz Zaczyński, keep it up with these excellent tutorials!

Best, R.
