Advanced CSV Reader Parameters
In this lesson, you’ll see different ways to handle non-standard CSV files and how to deal with delimiters appearing in your data.
You learned how to:
- Change reader parameters
- Use different delimiters
- Quote fields containing special characters using quotechar
- Escape the delimiter using escapechar
00:00 Now you can open standard CSV files with Python, but what about the nonstandard cases? The default delimiter is a comma to separate items in a row, but what if you wanted to store an address that contained a comma?
If you notice, the address has a comma in here, so if you were to pass this into the csv.reader() function, you would end up with this being split four times, because it would split at this comma also. Fortunately, we’ve got a couple of ways around this.
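To make the problem concrete, here is a minimal sketch. The sample row is hypothetical (the lesson's actual file contents are not shown here), and io.StringIO stands in for an open file, since csv.reader() accepts any iterable of lines.

```python
import csv
import io

# Hypothetical sample data: the address in the data row contains a comma.
raw = (
    "name,address,date joined\n"
    "john smith,1132 Anywhere Lane Hoboken NJ, 07030,Jan 4\n"
)

# Default settings: every comma is treated as a field separator.
header, record = list(csv.reader(io.StringIO(raw)))

print(len(header))  # three column names
print(len(record))  # four fields, because the address was split in two
print(record)
```

The header has three columns but the data row comes back with four fields, so the values no longer line up with their column names.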
00:34 The first thing you can do is use a different delimiter in your CSV files. So looking at this example, I’ve just replaced all the commas with the pipe character. This should work, so let’s go back to the CSV Python script that we’ve got and make the edits that are needed.
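A minimal sketch of reading pipe-delimited data might look like the following. The sample rows are an assumption made for illustration, and io.StringIO again stands in for the open file.

```python
import csv
import io

# Hypothetical pipe-delimited data; the comma in the address is now plain text.
raw = (
    "name|address|date joined\n"
    "john smith|1132 Anywhere Lane Hoboken NJ, 07030|Jan 4\n"
)

# delimiter tells the reader which single character separates fields.
reader = csv.reader(io.StringIO(raw), delimiter="|")
rows = list(reader)

for row in rows:
    print(row)
```

When reading from a real file instead, the csv documentation recommends opening it with newline="" and passing the file object to csv.reader() in the same way.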
So now, it’s going to open 'different_delim.csv' as a CSV file, and then the main change here is just to change this delimiter character from a comma to the pipe character. Alrighty, and just to make sure this makes sense, let’s say the
"name" lives at—instead of
"department", now this will be
01:32 It kept the comma in there, and everything lines up as it should. Awesome! Another thing you can do is wrap the data in quotes. So in this example, you can see that the delimiter is still the comma, but now this address is inside quotes.
You can then set up the CSV reader to use these quotes and treat any delimiter inside them as part of the data. So, let’s go back, change the delimiter here back to a comma, and now add this quotechar parameter, where I put a quote character.
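As a rough sketch of the quoting approach, assuming the address is wrapped in double quotes (the sample data is hypothetical): note that '"' is already the default quotechar in Python's csv module, so passing it explicitly here is only for clarity.

```python
import csv
import io

# Hypothetical data: the comma-containing address is wrapped in double quotes.
raw = (
    "name,address,date joined\n"
    'john smith,"1132 Anywhere Lane Hoboken NJ, 07030",Jan 4\n'
)

# Delimiters that appear between quotechar characters are kept as data.
reader = csv.reader(io.StringIO(raw), delimiter=",", quotechar='"')
rows = list(reader)

# The quotes themselves are stripped from the parsed field.
print(rows[1])
```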
02:30 it didn’t see any of those quotes or the delimiter. All right. Let’s try to save that, rerun it. And look, there we go. Everything works. Finally, sometimes you’re not able to change the delimiter or wrap your data in quote characters, and you need to escape the delimiter character instead. Let’s take a look at this final example here, where you can see the delimiter is still a comma, there are no quotes, and I just put a pipe character in front of the comma.
You can then tell the CSV reader to treat any delimiter character that appears right after one of these escape characters as ordinary data. So, going back, get rid of this quotechar, and change it now to escapechar, and just put that pipe character in there. This time, I’m going to actually change the filename to the right one.
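The escaping approach could be sketched like this, again with hypothetical sample data in which a pipe sits directly in front of the comma that belongs to the address.

```python
import csv
import io

# Hypothetical data: "|," marks a comma that is part of the address.
raw = (
    "name,address,date joined\n"
    "john smith,1132 Anywhere Lane Hoboken NJ|, 07030,Jan 4\n"
)

# escapechar makes the reader take the next character literally
# and drops the escape character itself from the result.
reader = csv.reader(io.StringIO(raw), delimiter=",", escapechar="|")
rows = list(reader)

print(rows[1])
```

The pipe does not show up in the parsed address; only the comma it protected does.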
03:22 Alrighty. Try to run this, and everything works! Awesome! So now you know three different ways to handle nonstandard CSV files and try to deal with your delimiters appearing in your data. Depending on how complex your data is, you can use all three of these techniques at the same time so that you’re able to store just about anything you need.
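Here is one way the three parameters might be combined in a single reader. The data and the particular choice of characters (a pipe delimiter, a single-quote quotechar, and a backslash escapechar) are assumptions made for illustration, not the lesson's own file.

```python
import csv
import io

# Hypothetical row: the address is quoted because it contains the delimiter,
# and the note escapes a literal pipe with a backslash.
raw = (
    "name|address|note\n"
    "john smith|'Unit 4 | 1132 Anywhere Lane, Hoboken NJ'|"
    "joined Jan 4\\|renewed Jan 5\n"
)

reader = csv.reader(
    io.StringIO(raw),
    delimiter="|",    # fields are separated by pipes
    quotechar="'",    # pipes inside single quotes are data
    escapechar="\\",  # a backslash makes the next character literal
)
rows = list(reader)

print(rows[1])
```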