Plotting arrays with randomly generated numbers is great for learning, but the real fun comes when you can visualize large sets of data. In this video, you’ll be working with a very large file that contains macroeconomic California housing data.

You’re going to use `numpy` to extract only what you need into one-dimensional arrays. Then, you’ll plot that data with `matplotlib` and learn more about advanced grid spacing.

rinafleisch

Thank you for explaining everything so well in regard to making more complex plots. It was extremely useful. I wanted to mention that I was curious as to why the newer homes had less value than the older homes. I had a look at `cal_housing.domain` for a key to the entries, and it looks like what is actually being plotted is home value as a function of area median income (x, thousands?) and area total bedrooms (y). Nevertheless, it is a beautiful figure.

ab

Hey, I get the following error. Does someone know how to fix it? Thanks!

```python
url = 'https://ndownloader.figshare.com/files/5976036'
fpath = 'CaliforniaHousing/cal_housing.data'

with tarfile.open(mode='r', fileobj=b) as archive:
    housing = np.loadtxt(archive.extractfile(fpath), delimiter=',')

value = housing[:, -1]
pop, age = housing[:, [4, 7]].T

def add_innerbox(ax, text):
    ax.text(.55, .8, text,
            horizontalalignment='center',
            transform=ax.transAxes,
            bbox=dict(facecolor='white', alpha=0.6),
            fontsize=12.5)

gridsize = (3,2)
fig = plt.figure(figsize=(12,8))
ax1 = plt.subplot2grid(gridsize, (0,0) colspan=2, rowspan=2)
ax2 = plt.subplot2grid(gridsize, (2,0))
ax3 = plt.subplot2grid(gridsize, (2,1))

ax1.set_title('Home value as a function of home age(x) & area population (y)',
              fontsize=14)
sctr = ax1.scatter(x=age, y=pop, c=value, cmap='RdYlGn')
plt.colorbar(sctr, ax=ax1, format='$%d')
ax1.set_yscale('log')
ax2.hist(age, bins='auto')
ax3.hist(pop, bins='auto', log=True)

add_innerbox(ax2, 'Histogram: home age')
add_innerbox(ax3, 'Histogram: area population(log scl.)')
plt.show()
---------------------------------------

  File "<ipython-input-75-5d4925586988>", line 24
    ax1 = plt.subplot2grid(gridsize, (0,0) colspan=2, rowspan=2)
                                           ^
SyntaxError: invalid syntax
```

Bartosz Zaczyński RP Team

@ab It looks like you’ve got a missing comma.

Expected:

```python
ax1 = plt.subplot2grid(gridsize, (0,0), colspan=2, rowspan=2)
```

Actual:

```python
ax1 = plt.subplot2grid(gridsize, (0,0) colspan=2, rowspan=2)
```

alberto10024

Hi, how do I modify the code to run it in a notebook? When I run the cell with this as the last line:

```python
add_titlebox(ax3, 'Histogram: area population (log scl.)')
```

I get a single empty object `<AxesSubplot:>`

Thanks!

alberto10024

Never mind about my earlier question. I sorted it out (there seems to have been a conflict with the code I entered earlier). Thanks!

yennjang

Hi, just want to highlight that there is a missing comma in the video on the following line:

Actual:

```python
ax1 = plt.subplot2grid(gridsize, (0,0) colspan=2, rowspan=2)
```

Expected:

```python
ax1 = plt.subplot2grid(gridsize, (0,0), colspan=2, rowspan=2)
```
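For anyone who wants to check the corrected call in isolation, here is a minimal sketch of the same 3×2 layout. It assumes randomly generated placeholder arrays (`age`, `pop`, `value`) in place of the real housing columns, so only the grid arrangement is meaningful, not the data.

```python
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data standing in for the housing columns (NOT the real dataset).
rng = np.random.default_rng(0)
age = rng.integers(1, 53, size=500)
pop = rng.lognormal(mean=7, sigma=1, size=500)
value = rng.uniform(15_000, 500_000, size=500)

gridsize = (3, 2)  # 3 rows, 2 columns
fig = plt.figure(figsize=(12, 8))

# Note the comma after (0, 0): the top axes starts at row 0, column 0
# and spans both columns and the first two rows.
ax1 = plt.subplot2grid(gridsize, (0, 0), colspan=2, rowspan=2)
ax2 = plt.subplot2grid(gridsize, (2, 0))  # bottom-left cell
ax3 = plt.subplot2grid(gridsize, (2, 1))  # bottom-right cell

sctr = ax1.scatter(x=age, y=pop, c=value, cmap="RdYlGn")
plt.colorbar(sctr, ax=ax1, format="$%d")
ax1.set_yscale("log")
ax2.hist(age, bins="auto")
ax3.hist(pop, bins="auto", log=True)
```

In a notebook with inline plotting the figure should render when the cell finishes; in a plain script, add `plt.show()` at the end.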

I figured it out because the line with the missing comma just won’t run in my Jupyter notebook. So I looked through the Matplotlib documentation and found that there should be a comma. It was a good learning experience, though: it’s worth referring to the Matplotlib documentation while working through this course rather than relying on the video alone.

Bartosz Zaczyński RP Team

@yennjang Thanks for catching this 😊

Dawn0fTime

FYI, this may fail initially on a Mac due to an SSL error. Open the Python folder for whichever version you’re using under Applications, then double-click ‘Install Certificates.command’.
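If running that command is not an option, one possible alternative is to hand `urlopen` an explicit SSL context. This is only a sketch: `make_ssl_context` and `fetch` are hypothetical helper names, and the CA bundle path is an assumption you would supply yourself (for example, the path returned by `certifi.where()` if the third-party `certifi` package is installed).

```python
import ssl
from urllib.request import urlopen


def make_ssl_context(cafile=None):
    """Return a certificate-verifying SSL context.

    cafile is an optional path to a CA bundle (an assumption: you provide it);
    when it is None, the system default certificates are used.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None):
    """Download url and return the raw bytes, verifying the server certificate."""
    with urlopen(url, context=make_ssl_context(cafile)) as resp:
        return resp.read()
```

Verification stays switched on here, which is usually preferable to disabling certificate checks just to get past the error.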
