NumPy and Pandas
00:00
In the previous lesson, I showed you some common coding cases of len(). In this lesson, I’ll show you how to use two third-party libraries and how they use len().
00:11 NumPy is a popular scientific calculation library for Python. It is written using C-extensions, meaning the code is quite performant. This library does all sorts of mathy collection stuff, including multi-dimensional arrays and vectors.
00:25
It can also help you with your calculations, having features for linear algebra, Fourier transforms, and many of the other things that still haunt my nightmares from engineering school. NumPy is a third-party library, and so you’ll need to use pip to install it. As always with this kind of stuff, it’s best practice to use a virtualenv to do so. Let’s take a look at NumPy and the len() function. First off, I’ll start with a single-dimensional array.
00:52
That’s a very list-like thing. I’ll import NumPy as np, as that’s shorter to type. Then I’ll create a NumPy array from a list.
01:12 There it is. And to be more specific, you can see its type. It’s a NumPy array. And what would this course on len() be without … and that’s kind of expected.
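As a rough sketch of what that REPL session might look like (the list values and the name numbers are made up here, since the transcript doesn't show what was typed):

```python
import numpy as np

# Hypothetical sample values; the lesson's actual list isn't in the transcript.
numbers = np.array([4, 7, 9, 23, 10, 6])

print(type(numbers))  # <class 'numpy.ndarray'>
print(len(numbers))   # 6, the same answer a plain list would give
```

For a one-dimensional array, len() behaves just like it does for the list the array was built from.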
01:29 Let’s take it up another notch and add a second dimension.
01:42 Note the list of lists here. Creating the 2D array from it …
01:51 and here it is … and then the length. You might be thinking, “Great! That’s the dimension size.” But it isn’t. And I’ll show you in a second when I add a third dimension.
02:04
NumPy arrays have a property called .shape that shows you the length of each dimension inside of them. To get the number of dimensions, you do len() on the .shape property.
02:19
.shape is like length for each dimension, and since it returns a tuple, the length of that tuple is the number of dimensions. You can get at the same thing through the .ndim property.
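Here is a small sketch of those three ways of looking at a 2D array. The values and the name matrix are assumptions for illustration, not the ones used on screen:

```python
import numpy as np

# Hypothetical data: a list of two lists, each holding three items.
matrix = np.array([[11, 12, 13], [21, 22, 23]])

print(len(matrix))        # 2, the length of the first dimension only
print(matrix.shape)       # (2, 3), the length of every dimension
print(len(matrix.shape))  # 2, the number of dimensions
print(matrix.ndim)        # 2, the same number without the detour
```

In this example len(matrix) and matrix.ndim both come out as 2, but only by coincidence: the first counts the inner lists, the second counts the dimensions.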
02:33 I believe I promised a third dimension. Put on your red and blue glasses, and get ready for a shark. Wow. Is that a dated reference? You see, back in my day—You know what?
02:44 Never mind. Google “3D Jaws” and figure it out for yourself. Where was I? Oh, right. Three dimensions.
03:02 List of lists of lists this time … and the NumPy array …
03:14
and with len() … same result as with the 2D. What this is doing is returning the length of the first dimension, which in both the 2D and 3D examples was two. Using .shape again, you can see the three dimensions, and .ndim, or the length of the shape, and that gives you how many D your 3D is.
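The 3D case could be sketched like this. The nested values and the name cube are invented for illustration; only the outer length of two comes from the lesson:

```python
import numpy as np

# Hypothetical list of lists of lists: 2 blocks, each with 3 rows of 4 items.
cube = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]],
])

print(len(cube))        # 2, still just the first dimension
print(cube.shape)       # (2, 3, 4)
print(len(cube.shape))  # 3
print(cube.ndim)        # 3
```

Here len(cube) is 2 while cube.ndim is 3, which makes it clear that len() was never reporting the number of dimensions.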
03:48
Another very common third-party library is Pandas. This one is for doing data crunching. It’s built on top of NumPy, so it is also quite speedy. Its key component is the DataFrame object, which is a dictionary on steroids.
04:02
Just a quick pip install into your virtualenv, and you’re ready to go. Let’s go back into the REPL. Importing the fuzzy bear as a shorter pd … creating the dictionary to populate a DataFrame …
04:41
and there it is. Each list in the dictionary becomes a column of the DataFrame, with one value in each row. The index argument specifies the names of the rows. Looking at the data itself, you can see Neo does everything well, and Cypher needs to stay after school because his loyalty grade … has got some work to do. And here’s what you came for.
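A minimal sketch of building such a frame follows. The grades, the two-person dictionary, and the names marks and grades are all assumptions; the transcript gives only the row labels and the fact that Neo scores well while Cypher's loyalty mark is poor:

```python
import pandas as pd

# Invented grades for illustration; the real numbers aren't in the transcript.
marks = {
    "Neo": [95, 90, 98],
    "Cypher": [80, 70, 12],
}

# Dictionary keys become column names; index supplies the row names.
grades = pd.DataFrame(marks, index=["Hacking", "Kungfu", "Loyalty"])

print(grades)
print(list(grades.columns))             # ['Neo', 'Cypher']
print(list(grades.index))               # ['Hacking', 'Kungfu', 'Loyalty']
print(grades.loc["Loyalty", "Cypher"])  # 12
```

The .loc lookup takes the row label first and the column label second, matching the rows-then-columns order used throughout Pandas.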
05:05
Running len() on a frame returns the number of rows. In this case, three: Hacking, Kungfu, and Loyalty. Like NumPy arrays, a Pandas DataFrame has a .shape property.
05:17
It shows the number of rows and columns as a tuple. Next up, you’ll see how the writers of NumPy and Pandas used the special .__len__() method to get len() to work with their classes.
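To make the row and column counts concrete, here is a short sketch. The frame is a stand-in with invented grades and only two columns, so the numbers it prints belong to this example rather than to the one shown in the video:

```python
import pandas as pd

# Stand-in frame with made-up values: 3 rows, 2 columns.
grades = pd.DataFrame(
    {"Neo": [95, 90, 98], "Cypher": [80, 70, 12]},
    index=["Hacking", "Kungfu", "Loyalty"],
)

print(len(grades))          # 3, the number of rows
print(grades.shape)         # (3, 2), rows first, then columns
print(len(grades.columns))  # 2, the number of columns
```

As with NumPy, len() reports only the first entry of .shape, so the column count has to come from .shape[1] or from len() on .columns.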