00:00 In the previous lesson, I showed you how to use anchors to change where the matches happen inside of a string. In this lesson, I’m going to be talking about quantifiers: how to do repetition inside of your pattern matches.
Now to the other end: dollar sign ($) matches the end. Like the ^ in multiline mode, $ matches the end of a line. \Z is the equivalent of \A, matching the end of the string even in multiline mode.
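Here is a minimal sketch of the difference between $ and \Z, using Python's re module. The sample text and the variable names are made up for illustration and are not from the lesson.

```python
import re

text = "first line\nsecond line"  # hypothetical sample text

# Without MULTILINE, $ matches at the end of the string
# (or just before a trailing newline, which this text does not have).
plain = re.findall(r"line$", text)
print(plain)      # ['line']

# With MULTILINE, $ also matches at the end of every line.
multi = re.findall(r"line$", text, re.MULTILINE)
print(multi)      # ['line', 'line']

# \Z only matches at the very end of the string, even with MULTILINE.
end_only = re.findall(r"line\Z", text, re.MULTILINE)
print(end_only)   # ['line']
```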
In addition to the beginning and end of strings, you can match word boundaries. \b is the word boundary anchor. Literal 'car' but not ... 'car' ends on a word boundary.
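As a rough illustration of the word boundary anchor, here is a small Python sketch. The sample words 'scar' and 'carpet' are assumptions chosen for the example, not the text used in the video.

```python
import re

text = "car scar carpet"  # hypothetical sample text

# \b on both sides: 'car' must be a whole word.
whole = re.findall(r"\bcar\b", text)
print(whole)    # ['car']

# \b only at the end: words that end in 'car'.
ends = re.findall(r"\w*car\b", text)
print(ends)     # ['car', 'scar']

# \b only at the start: words that begin with 'car'.
starts = re.findall(r"\bcar\w*", text)
print(starts)   # ['car', 'carpet']
```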
'3990'. Change the + to a *, and the number of matches changes. This says 9 zero or more times, and then the number 0. Change the * to a ?, and now you're looking for zero or one occurrences of 9.
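To see the three quantifiers side by side, here is a short Python sketch. The patterns 9+0, 9*0, and 9?0 are inferred from the narration, and the sample text is an assumption apart from '3990'.

```python
import re

text = "3990 30 1900"  # hypothetical sample text containing '3990'

# + : one or more 9s, then a 0
one_or_more = re.findall(r"9+0", text)
print(one_or_more)    # ['990', '90']

# * : zero or more 9s, then a 0, so a bare 0 now matches too
zero_or_more = re.findall(r"9*0", text)
print(zero_or_more)   # ['990', '0', '90', '0']

# ? : zero or one 9, then a 0
zero_or_one = re.findall(r"9?0", text)
print(zero_or_one)    # ['90', '0', '90', '0']
```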
Quantifiers can also apply to meta-characters. This regular expression is the literal S/N: followed by the digit meta-character (\d), one or more times, then a hyphen (-), and then a word character (\w).
When dealing with quantifiers, you need to understand the concept of greediness. This is <, some character zero or more times, and then >. This is creating one match, starting with <, some number of characters, and then >. The two > characters buried inside of it are eaten by the quantifier because it's in greedy mode.
It will consume as many characters as it can inside of the regular expression. You can modify the quantifier by changing it to non-greedy mode. Adding a ? after the quantifier changes it to be non-greedy.
Now you're seeing three different matches, each starting and ending with angle brackets. It's unfortunate that they chose ? as the way of turning something from greedy to non-greedy. ? on its own is also a quantifier.
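A minimal Python sketch of greedy versus non-greedy matching with angle brackets might look like this. The sample text is an assumption. It just needs three bracketed items so that two > characters sit inside the greedy match.

```python
import re

text = "<a> <b> <c>"  # hypothetical sample text

# Greedy: .* grabs as much as it can, so there is one long match.
greedy = re.findall(r"<.*>", text)
print(greedy)   # ['<a> <b> <c>']

# Non-greedy: .*? stops at the first > it can, giving three matches.
lazy = re.findall(r"<.*?>", text)
print(lazy)     # ['<a>', '<b>', '<c>']
```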
Because it's in greedy mode, it absorbs all of the letters. Add the ?, and it takes the minimum number to make this expression valid. Because the + means one or more, the minimum is one, so now just capital 'A' and small ... ? to non-greedy mode, and notice the difference in 'Aaaaaaaaaah'. Now only the capital 'A' is matching. The least greedy version of zero or one characters is zero characters, so only the capital 'A' is highlighted.
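Here is a sketch of those four cases in Python against 'Aaaaaaaaaah'. The exact patterns are not visible in the transcript, so Aa+, Aa+?, Aa?, and Aa?? are assumptions based on the narration.

```python
import re

word = "Aaaaaaaaaah"

# Greedy +: 'A' plus every 'a' that follows it.
greedy_plus = re.search(r"Aa+", word).group()

# Non-greedy +: the minimum for "one or more" is one 'a'.
lazy_plus = re.search(r"Aa+?", word).group()
print(lazy_plus)    # 'Aa'

# Greedy ?: zero or one 'a', and greedy takes the one.
greedy_opt = re.search(r"Aa?", word).group()
print(greedy_opt)   # 'Aa'

# Non-greedy ?: the minimum for "zero or one" is zero.
lazy_opt = re.search(r"Aa??", word).group()
print(lazy_opt)     # 'A'
```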
The other kind of quantifier indicates the number of repetitions. This is done with curly brackets. This is looking for the whitespace meta-character (\s), a digit (\d), and the digit being repeated ...
This regular expression now looks for a space followed by anywhere from ... 7 digits. This matches ' 1949', as well as the serial number before. A range like this is inclusive: the 7 isn't an exclusive upper limit, but the largest number of repetitions allowed.
The first value also doesn't have to be specified. This says “From zero up to and including 5 matches.” The whitespace all the way through the text is being highlighted because whitespace followed by zero digits is matching this expression.
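The curly bracket forms can be sketched in Python like this. The sample text and the specific counts (4, and 4 to 7) are assumptions for illustration, since the transcript does not show the exact numbers used on screen.

```python
import re

text = "S/N: 1234567 built 1949 rev 12"  # hypothetical sample text

# {4}: whitespace followed by exactly four digits.
exact = re.findall(r"\s\d{4}", text)
print(exact)     # [' 1234', ' 1949']

# {4,7}: from four up to and including seven digits.
ranged = re.findall(r"\s\d{4,7}", text)
print(ranged)    # [' 1234567', ' 1949']

# {,5}: from zero up to and including five digits,
# so whitespace on its own matches as well.
up_to = re.findall(r"\s\d{,5}", text)
print(up_to)     # [' 12345', ' ', ' 1949', ' ', ' 12']
```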
07:59 If you’re looking to match actual curly brackets, don’t put in a number. This will match literal curly brackets. Because there’s no number and no comma inside of it, it treats them as if they are normal characters.
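As a small sketch of that last point in Python: empty curly brackets are treated as literal characters, while brackets with a number act as a quantifier. The sample text is made up, and the escaped form is shown only as an explicit alternative.

```python
import re

text = "foo {} bar"  # hypothetical sample text

# No number inside, so {} is matched as two literal characters.
literal = re.findall(r"{}", text)
print(literal)      # ['{}']

# Escaping the brackets makes the intent explicit and gives the same result.
escaped = re.findall(r"\{\}", text)
print(escaped)      # ['{}']

# With a number inside, the brackets are a quantifier again.
quantified = re.findall(r"o{2}", text)
print(quantified)   # ['oo']
```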