A plain string match is simply the text you’re looking for. In the sentence 'I put the thing in the place', 'thing' is highlighted because the regex is thing. Do note that it’s only those letters, not the surrounding whitespace.
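As a rough sketch of the plain match just described, here is how it might look with Python's built-in re module. The variable names are only illustrative and are not from the lesson.

```python
import re

sentence = "I put the thing in the place"

# A plain pattern matches exactly those characters and nothing around them.
match = re.search(r"thing", sentence)

matched_text = match.group()   # the matched letters only
matched_span = match.span()    # start and end index of the match

print(repr(matched_text), matched_span)
```

The span covers only the five letters, so the spaces on either side are not part of the match.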
Similarly, for the pattern spam, the text "Well, there's spam egg sausage and spam, that's not got much spam in it" has three matches: each of the instances of 'spam'.
Character ranges, or classes, allow you to specify a match over a range of characters. A common use for this is to look for numbers. The square brackets indicate a range; in this case, two ranges, both 0 through 9. That matches the number '42' in the text on the right-hand side. You can match with letters as well.
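The repeated plain match and the two-digit range can be tried out in Python along these lines. This is a minimal sketch: the text containing '42' is a made-up sample, since the on-screen text is not part of the transcript.

```python
import re

menu = "Well, there's spam egg sausage and spam, that's not got much spam in it"

# A plain pattern can match more than once in the same text.
spam_matches = re.findall(r"spam", menu)

# Hypothetical sample text (an assumption, not the lesson's own text).
text = "The answer is 42, not 7."

# Two character classes in a row: a digit followed by another digit.
two_digits = re.findall(r"[0-9][0-9]", text)

# Ranges work with letters as well.
letters = re.findall(r"[a-c]", "abcdef")

print(len(spam_matches), two_digits, letters)
```

The lone 7 is skipped because the pattern asks for two digits in a row.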
Regexes support something called a meta-character: for example, digits, whitespace, words, et cetera. Oftentimes, these are shortcuts which you would otherwise have to specify with a lengthy square bracket class.
So now, with a character class of vowels, a period, and another character class of vowels, you will see matches where there is a vowel, some other character, and then another vowel. If you look at the matches, this includes things like 'ase', et cetera. Because period (.) is a special character, if you actually want to match one, you have to escape it with a backslash.
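Here is a small Python sketch of the vowel, any character, vowel idea and of escaping the period. I am assuming the vowel class is written [aeiou], and the sample strings are invented for illustration.

```python
import re

# Assumed spelling of the vowel class: [aeiou]
vowel_any_vowel = re.findall(r"[aeiou].[aeiou]", "phrase")

version = "v1.2.3"

# An unescaped period matches any character (other than a newline).
any_char = re.findall(r".", version)

# An escaped period matches only a literal period.
literal_dots = re.findall(r"\.", version)

print(vowel_any_vowel, len(any_char), literal_dots)
```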
Here I’m matching a quote (") followed by any character. If you look carefully, one of them is missing. This quote isn’t highlighted, and that’s because the . does not actually match a newline, and as the only character after this quote is a newline, it won’t match.
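The missing quote can be reproduced with a short Python sketch. The sample text is made up. The re.DOTALL flag at the end is not something the lesson covers; it is shown only to confirm that the newline is what blocks the match.

```python
import re

# Hypothetical text: the second quote is followed directly by a newline.
text = 'a "b"\nc'

# By default, the period matches any character except a newline.
default_matches = re.findall(r'".', text)

# With re.DOTALL (not covered in the lesson), the period matches a newline too.
dotall_matches = re.findall(r'".', text, flags=re.DOTALL)

print(default_matches, dotall_matches)
```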
A common pattern that you would need to match in programming text is the small letters, capital letters, digits, and an underscore (_). In most programming languages, these are the valid characters that can be inside of a variable name. Here, you can see the plain text match for the capital letter E followed by that range. In the case of this email message, that matches in 'Enclosed'.
\w is a meta-character, short for word. And in this context, word means a programming variable name. This meta-character does the exact same thing as the previous expression, still matching 'Enclosed', but is far less to type. A common pattern with most of the meta-characters is for the capital version of them to be the inverse. Changing small w to capital W matches everything that isn’t a word character. In this case, we have capital 'E' followed by whitespace: '-' and whitespace are not alphabetic letters or digits or an underscore, so \W matches them. Speaking of digits, you’ll remember this pattern from the previous lesson.
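The word and digit shortcuts can be compared side by side in Python. This is only a sketch with an invented sample sentence. One caveat worth knowing: for Python 3 text strings, \w also matches non-English letters by default, so it equals the explicit class exactly only when the re.ASCII flag is used.

```python
import re

# Hypothetical sample text, not the email from the lesson.
text = "Enclosed is file E-12 for plan E now"

# The long form: E followed by a letter, digit, or underscore.
explicit = re.findall(r"E[a-zA-Z0-9_]", text)

# The shortcut: \w stands for a word character.
shortcut = re.findall(r"E\w", text, flags=re.ASCII)

# The capital version is the inverse: anything that is not a word character.
inverse = re.findall(r"E\W", text)

# \d is the shortcut for a digit, like [0-9].
digits = re.findall(r"\d", text)

print(explicit, shortcut, inverse, digits)
```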
\s, short for space, matches space characters or whitespace inside of your text. These are things like space, tab, and newline. This particular tool doesn’t actually highlight the newline characters, which makes it a little hard to see, but if I look for a colon followed by whitespace, you’ll see what I mean.
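Since the highlighting tool hides newlines, printing the matches in Python makes them visible. A minimal sketch with a made-up message header:

```python
import re

# Hypothetical text with a space, a tab, and a newline after the colons.
text = "To: you\nFrom:\tme\nSubject:\nhello"

# A colon followed by any single whitespace character.
colon_space = re.findall(r":\s", text)

# repr() shows the tab and newline as \t and \n instead of hiding them.
print([repr(m) for m in colon_space])
```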
Like before, capital is the inverse: everything but a whitespace. So one way of finding the four-letter words is whitespace (\s), followed by four instances of non-whitespace (\S), followed by whitespace (\s).
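The four-letter-word pattern can be sketched in Python like this, using an invented sentence:

```python
import re

# Hypothetical sample sentence.
text = "I have some good news for everyone"

# Whitespace, four non-whitespace characters, whitespace.
four_letter = re.findall(r"\s\S\S\S\S\s", text)

print(four_letter)
```

Note that each match uses up the whitespace on both sides, so a four-letter word that comes right after another match (like 'some' after 'have' here) is not found. That fits with this being one way of doing it rather than the only way.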
You can use meta-characters inside of a character class. In \d[-\s], the \d matches a digit, and the square brackets give the option of matching either a hyphen or a \s. Remember, if the hyphen is first inside of the character class, it’s a literal match, not a range.
So this is looking for a digit followed by either a hyphen or a whitespace. This matches things like the '0' and the '9' at the end of this line, because the newline after them counts as whitespace, as well as the '3' followed by the hyphen, and the '3' followed by the whitespace, the newline.
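A digit followed by a hyphen or whitespace can be tried in Python as below. The sample text is hypothetical, and the pattern \d[-\s] is my reading of the description above.

```python
import re

# Hypothetical text with digits before a hyphen, a space, and newlines.
text = "call 555-1230\nroom 3 9\n"

# \d is a digit; [-\s] is a literal hyphen (it comes first) or any whitespace.
digit_then_sep = re.findall(r"\d[-\s]", text)

print([repr(m) for m in digit_then_sep])
```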
Because meta-characters begin with a backslash, to look for a backslash you must escape it with another backslash. This is looking for a literal backslash followed by a digit between 0 and 9. It matches the '\5' at the end of the model number.
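Backslash escaping can be sketched in Python as follows. The model number is invented for illustration. Raw strings (the r prefix) are used so that Python itself leaves the backslashes alone and passes them straight to the regex engine.

```python
import re

# Hypothetical model number; the raw string keeps the backslash literal.
model = r"XJ-900\5"

# An escaped backslash followed by a digit range.
with_class = re.findall(r"\\[0-9]", model)

# The same idea with the digit meta-character: \\ then \d.
with_meta = re.findall(r"\\\d", model)

print(with_class, with_meta)
```

Python prints the match as '\\5' because that is how it displays a single backslash inside a string; the match itself is two characters long.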
So the first two backslashes are an escaped backslash, and the third backslash is part of the meta-character \d. This is looking for a literal backslash followed by a digit, and this once again matches the '\5' at the end of the model number in the text.
Next up, I’ll be talking about anchoring expressions: ways of making sure that the thing you’re finding is at the beginning or the end of a string.