Re-create a New Index After Concatenation

Combining Data in pandas With concat() and merge() Martin Breuss 03:47

00:45 I can see here that ignore_index takes a Boolean value that is False by default. If you pass True instead, then pandas is not going to keep the existing labels (here 0, 1, 2, 3, 0, 1, 2) on the rows, or on the columns, depending on which axis you’re concatenating along.

01:04 Instead, it’s going to replace them with increasing numbers starting from 0. So let’s take a look, so this doesn’t get too theoretical. If I just stick this together in the most straightforward way with fruits and veggies,

01:24 you get this DataFrame where the row labels repeat. Now, if you instead pass ignore_index=True, you’ll see that pandas drops that information.

01:36 So it doesn’t remember that tomato here used to be row label 0 in the original vegetables DataFrame. It just renumbers the rows from 0 up to however many rows there are, and the same goes for the columns.
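
As a minimal sketch of the row-wise case: the fruits and veggies DataFrames below are hypothetical stand-ins for the ones on screen, assumed to have four and three rows so that the default labels come out as 0, 1, 2, 3, 0, 1, 2.

```python
import pandas as pd

# Hypothetical stand-ins for the lesson's DataFrames (contents are assumed).
fruits = pd.DataFrame(
    {"name": ["apple", "banana", "cherry", "date"],
     "color": ["red", "yellow", "red", "brown"]}
)
veggies = pd.DataFrame(
    {"name": ["tomato", "carrot", "pea"],
     "color": ["red", "orange", "green"]}
)

# Default (ignore_index=False): each input keeps its own row labels.
repeated = pd.concat([fruits, veggies])
print(repeated.index.tolist())    # [0, 1, 2, 3, 0, 1, 2]

# ignore_index=True: the old row labels are dropped and renumbered from 0.
renumbered = pd.concat([fruits, veggies], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5, 6]

# tomato was row 0 in veggies; after renumbering it sits at row 4.
print(renumbered.loc[4, "name"])  # tomato
```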

01:53 So if you

01:57 concatenate by columns instead—

02:01 remember, you get repeated column names. But if you instead pass ignore_index=True, then pandas is going to discard that information and just label the columns from 0 up to however many columns there are. As you can see, this brings some data loss with it, because now you no longer know what those column names were, which might be a problem.
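
The column-wise case can be sketched the same way. Again, the two DataFrames are hypothetical stand-ins, assumed to share the column names name and color.

```python
import pandas as pd

# Hypothetical stand-ins for the lesson's DataFrames (contents are assumed).
fruits = pd.DataFrame(
    {"name": ["apple", "banana", "cherry", "date"],
     "color": ["red", "yellow", "red", "brown"]}
)
veggies = pd.DataFrame(
    {"name": ["tomato", "carrot", "pea"],
     "color": ["red", "orange", "green"]}
)

# Concatenating along the column axis keeps the column names, so they repeat.
side_by_side = pd.concat([fruits, veggies], axis=1)
print(side_by_side.columns.tolist())  # ['name', 'color', 'name', 'color']

# With ignore_index=True, the column names are replaced by 0, 1, 2, 3.
unlabeled = pd.concat([fruits, veggies], axis=1, ignore_index=True)
print(unlabeled.columns.tolist())     # [0, 1, 2, 3]

# The row labels are untouched here, because ignore_index only affects
# the axis you are concatenating along.
print(unlabeled.index.tolist())       # [0, 1, 2, 3]
```

Note that nothing in the unlabeled result tells you which columns held names and which held colors, which is the data loss mentioned above.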

02:25 In the same way, a unique identifier that you used as labels for your rows is going to get lost if you use ignore_index on the rows axis. So you might only want to do this if you really don’t care about those original labels.

03:10 So creating a MultiIndex with the keys argument, or just dropping that information and letting pandas do the numbering with ignore_index, are two different ways to deal with the ambiguity that might come from having repeated labels, either in your column labels or in your row labels.
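
To put the two options side by side, here is a small sketch. The DataFrames and the key names "fruits" and "veggies" are assumptions made for illustration, not necessarily the values used in the lesson.

```python
import pandas as pd

# Hypothetical stand-ins for the lesson's DataFrames (contents are assumed).
fruits = pd.DataFrame({"name": ["apple", "banana", "cherry", "date"]})
veggies = pd.DataFrame({"name": ["tomato", "carrot", "pea"]})

# Option 1: keys= adds an outer index level, so the original labels survive
# and you can still tell which input each row came from.
keyed = pd.concat([fruits, veggies], keys=["fruits", "veggies"])
print(keyed.index.nlevels)                # 2
print(keyed.loc[("veggies", 0), "name"])  # tomato

# Option 2: ignore_index=True discards the original labels entirely.
renumbered = pd.concat([fruits, veggies], ignore_index=True)
print(renumbered.index.is_unique)         # True
```

Both results have unique row labels, but only the keyed version still records that tomato was row 0 of veggies.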

03:30 That wraps up using these additional arguments for pd.concat(). In the next lesson, you’re going to look at one final optional argument, which will lead into the .join() and merge() functionality that we’ll talk about later.
