At this point, you’ve seen more than a handful of functions and methods to choose from for plotting a Python histogram. How do they compare? In short, there is no one-size-fits-all answer. Here’s a recap of the functions and methods you’ve covered so far, all of which relate to breaking down and representing distributions in Python:
You Have/Want To | Consider Using | Note(s) |
---|---|---|
Clean-cut integer data housed in a data structure such as a list, tuple, or set, and you want to create a Python histogram without importing any third-party libraries. | collections.Counter() from the Python standard library offers a fast and straightforward way to get frequency counts from a container of data. | This is a frequency table, so it doesn’t use the concept of binning as a “true” histogram does. |
Large array of data, and you want to compute the “mathematical” histogram that represents bins and the corresponding frequencies. | NumPy’s np.histogram() and np.bincount() are useful for computing the histogram values numerically and the corresponding bin edges. | For more, check out np.digitize(). |
Tabular data in a Pandas Series or DataFrame object. | Pandas methods such as Series.plot.hist(), DataFrame.plot.hist(), Series.value_counts(), and pd.cut(), as well as Series.plot.kde() and DataFrame.plot.kde(). | Check out the Pandas visualization docs for inspiration. |
Create a highly customizable, fine-tuned plot from any data structure. | pyplot.hist() is a widely used histogram plotting function that uses np.histogram() and is the basis for Pandas’ plotting functions. | Matplotlib, and especially its object-oriented framework, is great for fine-tuning the details of a histogram. This interface can take a bit of time to master, but ultimately allows you to be very precise in how any visualization is laid out. |
Pre-canned design and integration. | Seaborn’s distplot(), for combining a histogram and KDE plot or for plotting a fitted distribution. | Essentially a “wrapper around a wrapper” that leverages a Matplotlib histogram internally, which in turn utilizes NumPy. Note that distplot() has been deprecated since Seaborn 0.11 in favor of histplot() and displot(). |
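
To make the difference between a frequency table and a binned histogram concrete, here is a minimal sketch that runs the same small sample through collections.Counter(), np.bincount(), and np.histogram(). The sample values, the choice of four bins, and the variable names are made up for illustration and do not come from the course.

```python
from collections import Counter

import numpy as np

# Hypothetical sample: a handful of small non-negative integers.
data = [0, 1, 1, 2, 2, 2, 3, 3, 7]

# Standard library: a frequency table keyed by value, with no binning.
freq = Counter(data)
print(freq[2])  # how many times the value 2 occurs

# NumPy: one count per integer from 0 up to max(data), including gaps.
counts = np.bincount(data)
print(counts)

# NumPy: a binned histogram. Four equal-width bins over [0, 8] are an
# arbitrary choice here; the function returns the counts and the bin edges.
hist, bin_edges = np.histogram(data, bins=4, range=(0, 8))
print(hist)
print(bin_edges)
```

Counter() and np.bincount() both report one count per distinct value, while np.histogram() groups neighboring values into bins, which is why only the last one returns bin edges alongside the counts.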
With that, best of luck creating histograms in the wild. Whatever you do, just don’t use a pie chart!
Congratulations, you made it to the end of the course! What’s your #1 takeaway or favorite thing you learned? How are you going to put your newfound skills to use? Leave a comment in the discussion section and let us know.
KB on Dec. 7, 2022
Can you update this video in light of ‘distplot’ being a deprecated function that will be removed in seaborn v0.14.0?