Set Up and Inspect the Data

00:00 You’ve learned a lot about substring identification using Python so far, and in this last part of the course, we’re going to address another pretty common use case: finding a substring inside a pandas DataFrame column. If you want to follow along with the examples I’m going to show here, make sure that you download the source materials, which include a CSV file with the data that we’re working with.

00:29 You can find the material in the dropdown called Supporting Material underneath the video. And then you download the Sample Code ZIP file. You are also going to need to create and activate a virtual environment and install pandas. Once you’re set up with these, then you can start a Python interpreter and import pandas as pd, and you’ll work with a method on a pandas Series object that allows you to perform this substring check.

01:07 Here I’m in a Python environment where I have pandas installed, so I can go ahead and say import pandas as pd.

01:16 And I also have access to the companies.csv file. I have it in the same directory as I started this interpreter from, so I’m going to load it here by saying companies = pd.read_csv(), and then I give it the name, which is just "companies.csv".
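
As a minimal sketch of that loading step, the snippet below reads CSV data with pd.read_csv(). The real companies.csv comes from the course materials, so a small in-memory stand-in is used here to keep the example runnable on its own. Its rows and the column names (company, slogan) are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for companies.csv. The column names and rows
# are assumptions for illustration, not the course data.
sample_csv = io.StringIO(
    "company,slogan\n"
    "Acme Ltd,Secret tools for every job\n"
    "Globex,Reaching further every day\n"
)

# With the real file in the directory you started Python from,
# you would pass the file name instead: pd.read_csv("companies.csv")
companies = pd.read_csv(sample_csv)
```

pd.read_csv() accepts either a path or a file-like object, which is why the stand-in can be swapped for the file name without other changes.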

01:37 And like I said before, you can get this file if you download the materials for this course. Okay, let’s take a quick look. companies.shape, so it’s got a thousand rows and two columns.

01:52 And let’s take a look at it as well.

01:57 So you’ve got one column that has a company name, and then another column that is a slogan for the company. And you want to do some search on this slogan column.
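
The inspection steps described here might look roughly like the following sketch. It builds a tiny DataFrame by hand instead of loading the course file, so the data and the column names (company, slogan) are hypothetical, and the shape will differ from the thousand rows mentioned in the lesson.

```python
import pandas as pd

# Hypothetical data standing in for the course's companies.csv.
companies = pd.DataFrame(
    {
        "company": ["Acme Ltd", "Globex", "Initech"],
        "slogan": [
            "Secret tools for every job",
            "Reaching further every day",
            "Paperwork made simple",
        ],
    }
)

# .shape is a tuple of (number of rows, number of columns).
n_rows, n_columns = companies.shape
print(n_rows, n_columns)

# .head() shows the first five rows by default.
print(companies.head())

# Selecting one column gives you a pandas Series to search in.
slogans = companies["slogan"]
print(type(slogans))
```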

mindconnect dot cc on April 5, 2023

Where can I find this “companies.csv”?

Bartosz Zaczyński RP Team on April 5, 2023

@mindconnect dot cc You’ll find it in the supporting materials of the associated written tutorial. To download them, navigate to the mentioned tutorial and find the link labeled “Click here to download sample code.”

Martin Breuss RP Team on April 5, 2023

@mindconnect dot cc thanks for the heads-up! In addition to what @bartosz mentioned, you can now also get it directly here in the course.

I’ve updated the code in the Supporting Material dropdown. When you download the Sample Code (ZIP) again, then you’ll also get the companies.csv file.

mindconnect dot cc on April 6, 2023

Wow, lightning fast, thanks so much!

Martin Breuss RP Team on April 6, 2023

:) You’re welcome! Thanks for pointing this out!

ajackson54 on Dec. 20, 2024

I’m having trouble reading companies.csv. I downloaded the sample code into my venv folder. I’ve been using the Command Prompt to create my virtual environment, and I’m using it to run my Python code. I had no problem importing pandas. I checked my folder (‘pandas_substring’), and pandas is in the site-packages folder. With the sample code, however, I received a ‘file not found’ error. I also have a minor problem: when I was taking the pip course and creating a virtual environment, I was using Windows PowerShell. I had no success with it, so I tried Command Prompt instead. That worked, but I’m curious why I didn’t succeed in PowerShell.

Martin Breuss RP Team on Jan. 7, 2025

@ajackson54 let’s see, you’re mentioning a couple of different issues here. I’ll try to address what I can.

Location of CSV file

You shouldn’t place the CSV file in your venv/ folder. In fact, you generally shouldn’t touch the venv/ folder at all. You just create it, activate the environment, then forget about it. Python will use it to install packages, but you shouldn’t put anything in there manually.

Instead, you should place any file your script needs to interact with at the same level as your script. That’s the simplest way to allow Python to find it, because then you can just write the name of the file instead of having to deal with paths in a more complex way.
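
If you still get a ‘file not found’ error, it can help to check where Python is actually looking. Here’s a small sketch of one way to do that with pathlib. The helper name resolve_data_file is made up for this example and isn’t part of the course code.

```python
from pathlib import Path


def resolve_data_file(file_name: str, base_dir: Path) -> Path:
    """Return the path to file_name inside base_dir, or raise a clear error."""
    path = base_dir / file_name
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected {file_name!r} in {base_dir}, but it is not there."
        )
    return path


# A bare file name is resolved against the current working directory,
# which is the folder you started Python from.
print("Working directory:", Path.cwd())
print("CSV present here:", (Path.cwd() / "companies.csv").is_file())
```

In a script, you could pass Path(__file__).parent as base_dir, so the file is found next to the script no matter which folder you run it from.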

Using PowerShell to create a venv

It should work to create your virtual environment with PowerShell using the same command as on Command Prompt:

PS> py -m venv venv\

This assumes that you installed Python using the official installer, which also installs the py shortcut. Otherwise, you may have to use python instead of py.

Take a look at the following resources for more guidance and context:

Hope this helps, otherwise please let me know about the problems you ran into in more detail.
