Reading and Writing CSV Files With pandas
00:00 Reading and writing CSV files. A comma-separated values file is a plain text file with a .csv extension that holds tabular data, and it’s one of the most popular file formats for storing large amounts of data.
00:16 Each row of the CSV file represents a single table row. By default, the values in the same row are separated with commas, but you could change the separator to a semicolon, tab, space, or any other character.
00:31 You can save your pandas DataFrame as a CSV file using the .to_csv() method, as seen onscreen.
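The onscreen code isn't reproduced in this transcript, so here is a minimal sketch of the call. It assumes a small made-up DataFrame (not the lesson's data) and writes to a temporary directory instead of the current working directory, so running it leaves nothing behind:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the lesson's actual DataFrame is not shown here.
df = pd.DataFrame(
    {"COUNTRY": ["China", "India"], "CONT": ["Asia", "Asia"]},
    index=["CHN", "IND"],
)

# A temporary directory keeps this sketch out of your working directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "data.csv"
    df.to_csv(csv_path)  # the row labels are written as the first column
    lines = csv_path.read_text().splitlines()

for line in lines:
    print(line)
```

The header line starts with an empty field because the index in this sketch has no name.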
00:43 That’s it! You’ve created the file data.csv in your current working directory, and you can see its contents onscreen now. This text file contains the data separated with commas.
00:56 The first column contains the row labels. In some cases, you’ll find them irrelevant. If you don’t want to keep them, then you can pass the argument index=False to .to_csv(), as seen onscreen.
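To see the effect of index=False without creating a file, the sketch below relies on the fact that .to_csv() returns the CSV text as a string when no path is given. The DataFrame is again a made-up example, not the one from the lesson:

```python
import pandas as pd

# Hypothetical sample data; the lesson's actual DataFrame is not shown here.
df = pd.DataFrame(
    {"COUNTRY": ["China", "India"], "CONT": ["Asia", "Asia"]},
    index=["CHN", "IND"],
)

# With no path argument, .to_csv() returns the CSV text instead of writing a file.
with_labels = df.to_csv()
without_labels = df.to_csv(index=False)

print(with_labels)     # each data row starts with its row label
print(without_labels)  # the row labels are left out entirely
```

Dropping the labels is a one-way choice: a file written with index=False no longer carries them, so they can't be recovered when you read it back.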
01:15 Once your data is saved in a CSV file, you’ll likely want to load and use it from time to time. You can do that using read_csv(), as seen onscreen.
01:28 In this case, the pandas read_csv() function returns a new DataFrame with the data and labels from the file data.csv, which was specified in the first argument.
01:40 This string can be any valid path, including URLs.
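A round trip can be sketched as follows, assuming the same kind of made-up DataFrame and a temporary directory in place of the lesson's working directory. The index_col=0 argument tells read_csv() that the first column holds the row labels:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data; the lesson's actual DataFrame is not shown here.
df = pd.DataFrame(
    {"COUNTRY": ["China", "India"], "CONT": ["Asia", "Asia"]},
    index=["CHN", "IND"],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "data.csv"
    df.to_csv(csv_path)

    # The first argument is the path to read, passed here as a plain string.
    loaded = pd.read_csv(str(csv_path), index_col=0)

print(loaded)
```

The file is read completely inside the with block, so the returned DataFrame stays usable after the temporary directory is removed.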
01:45 The parameter index_col specifies the column from the CSV file that contains the row labels and is set using a zero-based column index. You should set the value of index_col when the CSV file contains the row labels to avoid loading them as data.
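The difference is easiest to see side by side. This sketch assumes a short made-up CSV text and reads it from an in-memory buffer, which read_csv() accepts in place of a path:

```python
import io

import pandas as pd

# Hypothetical CSV text whose first column holds the row labels.
csv_text = ",COUNTRY,CONT\nCHN,China,Asia\nIND,India,Asia\n"

# Without index_col, the label column is loaded as ordinary data
# and pandas generates a default integer index.
as_data = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column becomes the row labels.
as_labels = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(as_data.columns))
print(list(as_labels.columns))
print(list(as_labels.index))
```

Without index_col, the label column shows up as a regular column that pandas names Unnamed: 0, because its header field is empty.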
02:01 You’ll learn more about using pandas with CSV files later on in the course, but you can also check out Reading and Writing CSV Files in Python to see how to handle CSV files with csv, a module from Python’s standard library, as well.
02:16 Next up, you’ll see how to read and write Excel files.
CR05BY on Aug. 22, 2021
I did some reading and found that I could use:
no_labels.index.name = 'COUNTRY_CODE'
and found success. I am guessing this is because the index of the Pandas dataframe is a series? Thanks very much for any help or corrections!
Michael Russo on May 21, 2022
What REPL are you using?
Michael Russo on May 21, 2022
Answering my own question: bpython-interpreter.org/downloads.html
alexanderwu on Aug. 1, 2024
I find it being called “row labels” confusing. They are column labels on a single row.
alexanderwu on Aug. 1, 2024
Ah, nevermind previous comment.
CR05BY on Aug. 22, 2021
I’m trying to rename the unnamed index of a dataframe I’ve called “no_labels” using:
no_labels.rename(index={'Unnamed: 0':'new column name'}, inplace=True)
This produces no error, but does nothing, and the first column is in fact the (base) index, and remains unnamed. Am I doing something wrong?