Advanced CSV Reader Parameters
In this lesson, you’ll see different ways to handle non-standard CSV files and how to deal with delimeters appearing in your data.
You learned how to:
- Change reader parameters
- Use different delimeters
- Quote fields containing special characters using
quotechar
- Escape the delimiter using
escapechar
00:00 Now you can open standard CSV files with Python, but what about the nonstandard cases? The default delimiter is a comma to separate items in a row, but what if you wanted to store an address that contained a comma?
00:12
Let’s take a look at an example. Here, I’ve got a CSV file that has a name
, address
, and date joined
for a number of users.
00:20
If you notice, the address has a comma in here, so if you were to pass this into the csv.reader
class, you would end up with this being split four times, because it would split at this comma also. Fortunately, we’ve got a couple of ways around this.
00:34 The first thing you can do is use a different delimiter in your CSV files. So looking at this example, I’ve just replaced all the commas with the pipe operator. This should work, so let’s go back to the CSV Python script that we’ve got and make the edits that are needed.
00:50
So now, it’s going to open 'different_delim.csv'
as a CSV file, and then the main change here is just change this delimiter
character from a comma to that guy. Alrighty, and just to make sure this makes sense, let’s say the "name"
lives at—instead of "department"
, now this will be "address"
—
01:20
I think this’ll just be "date joined"
. All right, let’s save this, open up a terminal. There we go! john smith lives at
—this address.
01:32 It kept the comma in there, and everything lines up as it should. Awesome! Another thing you can do is wrap the data in quotes. So in this example, you can see that the delimiter is still the comma, but now this address is inside quotes.
01:47
You can then set up the CSV reader to use these quotes and ignore anything inside them. So, let’s go back, change the delimiter here back to a comma, and now add this quotechar
, which I put an apostrophe—or, a quote—instead of the comma.
02:09
And set this quotechar
equal to a single double quote, because this is wrapped in double quotes. All right, save this, try to run it. Ah, what did I do here?
02:24 And of course, the other thing you want to change is the name of the file. Since I’m using a different file here,
02:30 it didn’t see any of those quotes or the delimiter. All right. Let’s try to save that, rerun it. And look—there we go. Everything works. Finally, sometimes you’re not able to change the delimiter or wrap your data in different quote characters and you need to escape a delimiter character, so let’s take a look at this final example here where you can see the delimiter is still commas, there’s no quotes, and I just put a pipe operator in front of the comma.
02:57
You can then tell the CSV reader to ignore any delimiter character that appears after one of these escape characters. So, going back, get rid of this quotechar
, and change it now to escapechar
, and just put that pipe operator in there. This time, I’m going to actually change the filename to the right one.
03:22 Alrighty. Try to run this, and everything works! Awesome! So now you know three different ways to handle nonstandard CSV files and try to deal with your delimiters appearing in your data. Depending on how complex your data is, you can use all three of these techniques at the same time so that you’re able to store just about anything you need.
03:42 That’s enough for reading CSV files—now it’s time to start writing them with Python. I’ll see you in the next video.
Joe Tatusko RP Team on July 30, 2019
Are you trying to read or write the file? You can use the escapechar parameter to put something before the comma, but this changes the data like you said.
Do you have an example row or cell that you’re trying to read in?
Eric P on Aug. 16, 2019
quotechar
and escapechar
are both very useful. Thanks
rcruzalegui on March 10, 2020
How would I use the same script but load it to a dictionary to it can be used for later?
shiv on July 3, 2020
i was wondering which IDE is best to code Python for Datascience coding?any experts here?
sweir12525 on Aug. 20, 2020
It would be very helpful if you let us beginners know what interpreter you are using. I don’t know how to bring files into Thorny or PYCharms in the same manner as you are doing in the demos. I have to make all my changes in my c:drive, save them to different file names and then change the coding. It is far more tedious than what you are using.
naomifos on March 30, 2021
What would you do in a case where your parameter is just a CSV reader object itself, and you want to read a CSV file? For the problem I have, it doesn’t want me to use the with...as
statement, so I am kind of confused about this. It wants us to use the parameter to read this file.
acamposla on Dec. 16, 2021
I can’t find different_delim.csv in the “Supporting material”
Martin Breuss RP Team on Dec. 17, 2021
@acamposla thanks for flagging this, I’ve added to file to the Supporting Materials download. If you go ahead and download the ZIP file again, you’ll have different_delim.csv
included as well.
wyper on April 28, 2023
I’m having a file with two characters “^;” as delimiter. How can we handle such a case with the csv module? Is it possible?
Become a Member to join the conversation.
avermaisi11 on July 25, 2019
What if i’m not allowed to change the data inside the file. how can i escape the comma char in such cases.