Setting Up Your Environment
00:12 You’ll also need a working Python environment including pandas. If you don’t have one yet, then you have several options. If you have more ambitious plans, then download the Anaconda distribution.
00:25 It’s huge, at about 500 megabytes, but you will be equipped for most data science work. If you prefer a minimalist setup, then check out the section on installing Miniconda in this Real Python course, Setting Up Python for Machine Learning on Windows.
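Whichever route you take, a quick way to confirm that the environment is ready is to ask Python whether it can find the packages you need. The snippet below is a minimal sketch, not part of the course itself: the helper name missing_packages is hypothetical, and it assumes that Jupyter Notebook is importable under the package name notebook.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that Python cannot locate for import."""
    # find_spec() returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: "notebook" is the import name of the Jupyter Notebook package.
    missing = missing_packages(["pandas", "notebook"])
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("pandas and Jupyter Notebook are both available.")
```

If anything is reported as missing, install it with the package manager of the distribution you chose before moving on.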
01:16 Once your environment is set up, you’re ready to download a dataset. In this video course, you’re going to analyze data on college majors, originally sourced from the American Community Survey 2010-2012 Public Use Microdata Sample.
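If you'd rather fetch the file from Python than through the browser, something like the following would work. Treat it as a sketch under assumptions: the transcript doesn't state where the file is hosted, so the URL below (the FiveThirtyEight data repository on GitHub) and the helper name download are assumptions rather than something this course confirms.

```python
from pathlib import Path
from urllib.request import urlopen

# Assumption: the college majors data is published as recent-grads.csv in
# the FiveThirtyEight data repository. Check the course materials for the
# location that the course actually uses.
DATASET_URL = (
    "https://raw.githubusercontent.com/fivethirtyeight/data/"
    "master/college-majors/recent-grads.csv"
)


def download(url, destination):
    """Fetch `url` and write its bytes to `destination`; return the path."""
    destination = Path(destination)
    with urlopen(url) as response:
        destination.write_bytes(response.read())
    return destination
```

Calling download(DATASET_URL, "recent-grads.csv") would then save the file next to your Notebook, ready to be loaded with pandas.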
Once everything is installed, you can run a Notebook by typing jupyter notebook in the terminal. Note that this will start up a web server and open your browser to allow you to see the Notebooks in the current folder.
02:03 You can create a new one by going to New and picking Python 3. Rename it by clicking the title, which starts out as Untitled, and entering the name that you want the Notebook to have. It’s important to save your work, which you can do with File > Save and Checkpoint or with the keyboard shortcut for your operating system, shown in the menu onscreen.
02:26 Help is available by pressing H, and as you can see, there are lots of shortcuts, so it’s useful to learn as many of these as you can. The important concept with Jupyter is that there are two modes: command mode and edit mode.
02:54 or markup, which allows you to enter richly-formatted text in cells which aren’t run by Python. With a cell active, such as this one with the flashing cursor here, you can change into command mode by hitting Escape and change the cell’s type by pressing M to move into Markdown mode and Y to move back into code mode. Tab will normally take you back to the cell to enter code, but sometimes you’ll need to click in it with the mouse.
03:28 If you want to generate a new cell, there are a number of ways of doing so but the easiest way is to run the code in the final cell and create a new one at the same time by pressing Shift and tapping Enter. That runs the current cell, and you can see the number 2 appears next to it to show that’s the second cell that’s been run. And a new cell is ready underneath, where you can enter some more commands and run them!
03:52 You should be able to run all of the code in this course using just these few shortcuts, but if you want to learn more, Real Python has got you covered with this course on using Jupyter Notebooks.
04:46 You can follow along with this course even if you aren’t familiar with DataFrames, but if you’re interested in learning more about working with pandas and DataFrames, then you can check out Using Pandas in Python to Explore Your Dataset and The Pandas DataFrame: Make Working With Data Delightful.