Use Regular Expressions With pandas

00:00 Using .str.contains() on a pandas Series, you filtered for the companies that have the word "secret" in their slogan. But the filter is very broad.

00:09 The company with ID 656 in your DataFrame has the word "secretly" in its slogan, and it also shows up. So one useful thing about the .str.contains() method in pandas is that you can pass it a regular expression pattern as an argument.
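To make the broad match concrete, here is a minimal sketch. The `companies` DataFrame, its `slogan` column, and the slogans themselves are invented stand-ins for the course data; only the ID 656 comes from the lesson.

```python
import pandas as pd

# Hypothetical stand-in for the course's DataFrame; the slogans are invented.
companies = pd.DataFrame(
    {
        "slogan": [
            "our secret sauce",
            "we secretly love data",
            "open to everyone",
        ]
    },
    index=[101, 656, 900],
)

# Plain substring search: "secret" is also found inside "secretly",
# so the row with ID 656 is included in the result.
broad = companies[companies.slogan.str.contains("secret")]
print(broad)
```

Both the slogan with the standalone word and the one containing "secretly" pass the filter, which is the breadth the lesson points out.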

00:29 So if you wanted to find just this one company that has the word "secretly" in there,

00:39 you could put a regular expression in here and say "secret", followed by a word character (\w) and a plus quantifier (+), so that it matches one or more word characters after "secret". And if you press Enter here, you can see that the filter returns only the single row that actually contains the word "secretly". Now, because it’s a regular expression pattern, this could also match other words.

01:03 It could match "secrets", for example, with a plural, or anything that starts with "secret" and then has one or more word characters following it.
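The pattern described above can be sketched as follows. Again, the `companies` DataFrame and its slogans are hypothetical sample data, with an extra "secrets" row added to show that the pattern is not limited to "secretly".

```python
import pandas as pd

# Hypothetical sample data; only the ID 656 is taken from the lesson.
companies = pd.DataFrame(
    {
        "slogan": [
            "our secret sauce",
            "we secretly love data",
            "trade secrets kept safe",
            "open to everyone",
        ]
    },
    index=[101, 656, 702, 900],
)

# "secret" followed by one or more word characters.
# The standalone word "secret" no longer matches, because the
# character after it is a space, which \w does not accept.
pattern = r"secret\w+"
matches = companies[companies.slogan.str.contains(pattern)]
print(matches.index.tolist())  # [656, 702]
```

Using a raw string (`r"..."`) keeps Python from interpreting the backslash in `\w` before the regular expression engine sees it.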

01:15 So using .str.contains() in pandas is an effective and quick way that you can filter your DataFrame for only those rows that contain a certain substring in the values of the series that you’re checking on. And also keep in mind that it allows regular expressions, which makes the search even more powerful.

01:38 One last note. The way that I showed accessing a Series object from a DataFrame using the dot operator isn’t the most efficient way of doing this in pandas.

01:48 This is a quick way of showing it and useful for filtering, but if performance is what you’re looking for, then take a look at the optimized data access methods in pandas.
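The lesson doesn’t name the access methods it has in mind. As one possible reading, the sketch below compares dot access with bracket indexing and `.loc`, two standard pandas ways of selecting data. The DataFrame and its slogans are hypothetical, and no performance measurement is implied.

```python
import pandas as pd

# Hypothetical sample data, not the course's dataset.
companies = pd.DataFrame(
    {"slogan": ["our secret sauce", "we secretly love data", "open to everyone"]},
    index=[101, 656, 900],
)

# Dot access and bracket indexing return the same Series here.
by_attribute = companies.slogan
by_bracket = companies["slogan"]
print(by_attribute.equals(by_bracket))  # True

# .loc selects rows with a Boolean mask and a column by label in one step.
mask = companies["slogan"].str.contains(r"secret\w+")
slogans = companies.loc[mask, "slogan"]
print(slogans.tolist())  # ['we secretly love data']
```

Bracket indexing also works for column names that aren’t valid Python identifiers or that collide with DataFrame attributes, where dot access does not.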

02:02 So if you want to learn more about pandas, then check out the resources that we have on the site. We have a lot of resources, ranging from tutorials to video courses.

02:11 If you want to get more familiar with pandas, then it’s a good idea to go check these out. That’s it for this course. In the next lesson, we’re going to do a quick recap of everything that you’ve learned.