Configure Options and Settings at Interpreter Startup
In this lesson you’ll learn how to configure pandas options and settings at the interpreter startup.
00:00 In this video on Pandas tricks, you’re going to learn how to configure Pandas options and settings at the interpreter startup. Pandas has a number of options and settings that you can change that will affect how DataFrames are printed and what errors and warnings you see.
00:15 If you find yourself changing options and settings often, it can be very helpful to set up a startup file that applies all of these options every time you start your interpreter. In a new file, go ahead and import pandas, and create a new function called start().
00:34 Pandas options use dot notation, so you can easily use a nested dictionary to set the correct values. First, let’s make a dictionary called options, and in here, add an item for 'display', which will be another dictionary.
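The link between a nested dictionary and dotted option names can be sketched in plain Python. The helper name dotted_names is made up for this illustration and the values are placeholders; the sketch only shows how each category and option name join into a single key.

```python
def dotted_names(options):
    """Flatten {'category': {'option': value}} into {'category.option': value}."""
    flat = {}
    for category, settings in options.items():
        for name, value in settings.items():
            flat[f"{category}.{name}"] = value
    return flat


# Placeholder values, only here to show the shape of the dictionary.
options = {"display": {"max_rows": 14, "max_columns": None}}
flat = dotted_names(options)
print(flat)
```

Each key in the flattened result has the same form as the string you would pass to pandas when setting a single option.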
00:49 Go ahead and let’s set the 'max_columns' to None,
01:01 to 25. The 'expand_frame_repr' (expand frame representation) option just affects how DataFrames wrap from page to page if they’re particularly large, so in this case, set that equal to False. And set 'max_rows' to 14.
01:20 And you can limit how many items of a sequence are printed in each cell by setting 'max_seq_items' (max sequence items) to something like 50. One thing that I find myself always trying to change after the fact is the 'precision', which sets how floats and decimals will round.
01:36 And finally, just because you can, get rid of the dimensions that you see at the bottom of each DataFrame telling you how many rows and columns are in there. All right.
01:48 Now make another item called 'mode', and this’ll just contain 'chained_assignments', and set that equal to None. All this controls is the warnings you see if you try to change the value of a copy of a DataFrame, as opposed to the DataFrame itself. All right, that’s it for our options.
02:11 Now loop through those dictionaries, so for category, option in options.items(): and then inside those nested dictionaries, for op, value in option.items():.
02:32 Now you actually go ahead and set those options, so call pd.set_option(), and use some f-string formatting to build the key, f'{category}.{op}'.
02:47 That’ll just be set to the value that’s contained. Cool! So, if __name__ == '__main__': start(). And you can get rid of that function name to clean up your namespace by just deleting start like that.
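Put together, the startup file described so far might look like the sketch below. It rests on a few assumptions: pandas is imported as pd; 'precision' is set to 4 to match the four decimal places seen later in the video; 'show_dimensions' is assumed to be the unnamed option that hides the row and column count; the option that was set to 25 is not named in the transcript, so it is left out; and the 'mode' key uses the singular spelling that the video corrects to later on. The try/except is an addition of this sketch, not something the video does.

```python
import pandas as pd


def start():
    options = {
        "display": {
            "max_columns": None,         # show every column
            "expand_frame_repr": False,  # don't wrap wide DataFrames
            "max_rows": 14,
            "max_seq_items": 50,         # cap items printed per sequence
            "precision": 4,              # assumed from the four decimals shown later
            "show_dimensions": False,    # assumed name for hiding the size footer
        },
        "mode": {
            "chained_assignment": None,  # singular spelling, per the later correction
        },
    }

    for category, option in options.items():
        for op, value in option.items():
            try:
                pd.set_option(f"{category}.{op}", value)
            except KeyError as err:
                # Not in the video: report a key this pandas version rejects
                # and keep going, instead of stopping at the first bad name.
                print(f"Skipping {category}.{op}: {err}")


if __name__ == "__main__":
    start()
    del start  # keep the interpreter namespace clean
```

Without the try/except, a misspelled key would stop the startup file at that option, which can be the better choice if you would rather notice typos right away.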
03:10 All right! So, save this. And if all goes well, all of these options should be set in Pandas every time we open the interpreter. So let’s go ahead and open up a terminal.
03:24 I’m just going to bring this up. And you’ll have to do this specifically for your OS. On a Mac, you can just say export PYTHONSTARTUP to set the environment variable.
03:37 And I named my script pandas_tricks and it’s in this directory, so I can just put this in like that.
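PYTHONSTARTUP is an environment variable that Python itself reads: when it names a readable file, the interactive interpreter runs that file before showing the first prompt. If you're not sure whether the variable is set in your current session, a small check like the one below can help. The function name startup_file is made up for this sketch.

```python
import os


def startup_file():
    """Return the path stored in PYTHONSTARTUP, or None if unset or empty."""
    return os.environ.get("PYTHONSTARTUP") or None


path = startup_file()
if path is None:
    print("PYTHONSTARTUP is not set, so no startup file will run.")
elif os.path.isfile(path):
    print(f"The interactive interpreter will run {path} at startup.")
else:
    print(f"PYTHONSTARTUP points to {path}, but that file does not exist.")
```

Run this as a script from the same terminal session where you set the variable, since the value is only visible to processes started from that session.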
03:48 Now, open up the Python interpreter, and that’s not right, so let’s see. No such key(s) […] .chained_assignments. This s should not be here, so let me close that out, go up, and let’s just change that to 'chained_assignment', which is the correct key.
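The error here comes from pandas refusing an option name it doesn't recognize. One way to check a name before putting it in a startup file is sketched below. The helper option_exists is hypothetical; it relies on the error pandas raises for unknown option names being a subclass of KeyError.

```python
import pandas as pd


def option_exists(key):
    """Return True if pandas recognizes `key` as an option name."""
    try:
        pd.get_option(key)
    except KeyError:  # pandas' OptionError is a KeyError subclass
        return False
    return True


print(option_exists("display.max_rows"))          # a known option
print(option_exists("mode.chained_assignments"))  # the misspelled key from the video
```

The first call prints True and the second prints False, because of the extra s.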
04:10 And let’s try that again. Now, a nice thing when you’re setting these on a Mac—the environment variables don’t persist between sessions, so I’ll have to set that again.
04:30 This can be helpful if you have a number of different startup files that you use for different projects. And now opening the interpreter, everything works! So import pandas as pd, and just to prove that it worked, let’s do pd.get_option('display.max_rows'), which you set to 14.
04:53 So, there you go. And if you want to see some actual data, I’m just going to set url =, and let me just copy the pieces of this URL in so you don’t have to watch me type it all.
05:12 And we’ll set cols = ['sex', 'length', 'diam', 'height', 'weight', 'rings'].
05:32 And because this is the abalone data set, we’ll call that abalone and just do pd.read_csv(), pass in the url, usecols, and we grab the [0, 1, 2, 3, 4, 8].
05:55 And set names equal to cols.
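The read_csv() call can be tried without the URL, which isn't shown in the transcript. The sketch below reads from an in-memory string instead. It assumes the file has no header row and at least nine columns, which is what the index 8 suggests. The two rows are invented for illustration and are not real measurements.

```python
import io

import pandas as pd

# Two invented rows in a nine-column, header-less layout; not real data.
sample = io.StringIO(
    "M,0.40,0.30,0.10,0.50,0.20,0.10,0.15,10\n"
    "F,0.50,0.40,0.15,0.70,0.30,0.15,0.20,12\n"
)

cols = ["sex", "length", "diam", "height", "weight", "rings"]

# usecols picks columns by position; names labels the six that are kept.
abalone = pd.read_csv(sample, usecols=[0, 1, 2, 3, 4, 8], names=cols)
print(abalone)
```

Because names is given, pandas treats the first line as data rather than as a header, and the six names line up with the six positions listed in usecols.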
06:00 Now that that’s read, let’s just take a look at this DataFrame.
06:06 Cool! You can see there’s no dimensions that printed out here. Let’s see if I can get this a little larger. Now instead of having 50 rows, you can see there’s seven, fourteen printed out.
06:19 Everything’s been rounded just to four decimal places. And you’ve been able to make some clear changes to the formatting of how this DataFrame prints out.
06:27 And just to make sure that nothing persists, let’s close out of the terminal, open up a new one, jump right into the Python interpreter, import pandas as pd, and let’s do pd.get_option('display.max_rows'), and you can see it’s back to 60.
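Options set with pd.set_option() only live in the running interpreter, which is why a fresh session reports the default again. If you want the default back without restarting, pd.reset_option() is one way to get it. A minimal sketch:

```python
import pandas as pd

pd.set_option("display.max_rows", 14)
print(pd.get_option("display.max_rows"))  # the value just set

# Put this one option back to its built-in default.
pd.reset_option("display.max_rows")
print(pd.get_option("display.max_rows"))  # 60, the same default seen in the video
```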
06:53 So, there you go! Now you know how to set up a startup file for Pandas that’ll run every time you start your interpreter. These are just a couple of examples of the options that you can set, so you should look at the Pandas documentation if you’re interested in seeing what else you can change. Thanks for watching.
udayguntupalli on Nov. 8, 2019
@Joe: I loved all of these. Really handy. Could you kindly clarify or point to any resources on how to customize interpreter on Windows for Pycharm.
Ranit Pradhan on April 9, 2020
Sir, which url you have copied here ?
farlesh1000 on April 13, 2020
Could you kindly clarify or point to any resources on how to customize interpreter on Windows for Spyder.
Mike Allan Nillo on June 17, 2020
Is it also possible to apply this configuration options to Jupyter Notebook or JupyterLab? It seems like super handy! :)
Pygator on Sept. 3, 2019
Interesting, what is the purpose of export and the other all caps variable set equal to the config file we made? How do we have it set for the package every time, as the default?