
Create the Dataset

00:00 In this lesson, you’ll get to know the example dataset, and you will see it’s a very healthy example dataset because it’ll consist of fruits and vegetables.

00:10 And the dataset is really going to be quite small, and it will consist of two DataFrames. The first one is going to be sourced by your orchard. So this is going to be a couple of fruits and their respective emoji images.

00:24 And the second one is going to be some vegetables sourced from your garden. You might already see that, the way I've set it up here, this will load nicely into DataFrames: you have a name and an image column, and then a couple of rows for each of those.

00:46 Not great drawing, but you can see where I’m going with this, and I’ll let pandas do the drawing because it’s actually a little better at it than I am. So let’s go ahead and load this into pandas. For that, I’m going to first have to import pandas, and I will use the common pd shortcut to then work with it. And then in here, I’ll just create two DataFrames. The first one I’m calling fruits,

01:14 I pass in the orchard, and then do the same again for the garden and call the DataFrame veggies.
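The steps just described can be sketched roughly as follows. The transcript doesn't reproduce the on-screen data, so the specific fruits and vegetables (other than the tomato), the emoji, and the color values below are assumptions chosen to match the description: four fruit rows with name and image, and three vegetable rows with an extra color column.

```python
import pandas as pd

# Assumed contents: only the tomato and the overall layout
# (4 fruit rows, 3 veggie rows, extra "color" column) come from the lesson.
orchard = [
    {"name": "apple", "image": "🍎"},
    {"name": "pear", "image": "🍐"},
    {"name": "peach", "image": "🍑"},
    {"name": "tomato", "image": "🍅"},
]

garden = [
    {"name": "carrot", "image": "🥕", "color": "orange"},
    {"name": "broccoli", "image": "🥦", "color": "green"},
    {"name": "tomato", "image": "🍅", "color": "red"},
]

# Each list of dictionaries becomes one DataFrame:
# dictionary keys turn into column names, each dictionary into a row.
fruits = pd.DataFrame(orchard)
veggies = pd.DataFrame(garden)
```

In a Jupyter notebook, evaluating fruits or veggies as the last expression in a cell renders the table, including the default integer index.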

01:27 And here we are. So I’ve loaded these two lists of dictionaries into DataFrames. And now I can go ahead and let pandas draw the tables for me. You can see the fruits DataFrame consists of two columns.

01:42 One is name. The other one is image. And then it has an index column as well. And here you have the name of the fruit and then the emoji image, and then the same or similar thing for the vegetables as well.

01:55 You have a name and an image column, but then you also have this additional color column. And you will notice that the fruits DataFrame has one additional row, so it’s got four rows, and the veggies DataFrame has only three rows. So there are small differences between the two: fruits has a row that veggies lacks, and veggies has a column that fruits doesn’t have.

02:14 And also, as you might notice, there’s a tomato in both of them. So we’ll use these features of these DataFrames to try to visually exemplify what the merge and concatenation operations do in pandas, respectively.
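If you'd like to confirm these features in code rather than by eye, a small check along these lines would do. It rebuilds the two DataFrames with assumed sample values (only the tomato and the row and column counts come from the lesson) and then compares them with plain Python sets.

```python
import pandas as pd

# Assumed sample data matching the lesson's description.
fruits = pd.DataFrame(
    {
        "name": ["apple", "pear", "peach", "tomato"],
        "image": ["🍎", "🍐", "🍑", "🍅"],
    }
)
veggies = pd.DataFrame(
    {
        "name": ["carrot", "broccoli", "tomato"],
        "image": ["🥕", "🥦", "🍅"],
        "color": ["orange", "green", "red"],
    }
)

# .shape is (rows, columns).
fruit_shape = fruits.shape    # (4, 2)
veggie_shape = veggies.shape  # (3, 3)

# Columns that veggies has but fruits doesn't.
extra_columns = sorted(set(veggies.columns) - set(fruits.columns))

# Names that appear in both DataFrames.
shared_names = sorted(set(fruits["name"]) & set(veggies["name"]))

print(fruit_shape, veggie_shape, extra_columns, shared_names)
```

The extra column and the shared tomato row are the two differences the later concatenation and merge examples rely on.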

02:29 And that’s really all there is to say about this dataset, except maybe, if you’re wondering: tomatoes are fruits, but they’re also generally used as vegetables. So this is kind of the funny little joke that I’m making here to use as an example, but I think it’s really going to help in showing how merge() and concat() work in pandas.

02:48 Here’s another view of the code you used to generate the dataset.

02:53 That wraps up the first section that was about getting started. In the next section, you’ll begin to combine data using pd.concat(), and specifically, you’ll begin with concatenating two DataFrames along the axis.


shoebptl on Sept. 1, 2023

Thank you for using Jupyter Notebook. I think this is the first time in a long while that I’ve seen Jupyter Notebook on Real Python.
