# Creating Columns With Arithmetic Operations and NumPy

**00:00**
You can apply basic arithmetic operations such as addition, subtraction, multiplication, and division to pandas `Series` and `DataFrame` objects in pretty much the same way as you would with NumPy arrays. So, for example, with a NumPy array, you could take a column and multiply the whole column by `2`, and you could do the same thing with a DataFrame.

**00:22**
Let’s take the `'js-score'` column, and if we simply multiply it by `2`, we get a `Series` object where all of the entries in the `js-score` column are multiplied by `2`.

**00:35**
We can bring that `2 *` to the front, and that’ll give us the same thing. We can also do division. So, say, divide it by `4`.
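
The scalar operations described so far can be sketched as follows. The DataFrame here is a hypothetical stand-in with made-up values, not the data from the video; only the column labels come from the lesson.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative only.
df = pd.DataFrame(
    {"py-score": [88.0, 79.0, 81.0], "js-score": [71.0, 95.0, 88.0]},
    index=[101, 102, 103],
)

doubled = df["js-score"] * 2        # element-wise; returns a new Series
doubled_again = 2 * df["js-score"]  # same result with the scalar in front
quartered = df["js-score"] / 4      # element-wise division

print(doubled)
print(quartered)
```

None of these expressions modifies `df`; each one returns a new `Series` that shares the index of the original column.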

**00:48**
And we can also add a couple of columns. So, for example, let’s add the `'js-score'` column and the `'py-score'` column.

**00:56**
So here, we’re taking two columns, and these are going to be two pandas `Series` objects. They’re going to have the same index, and so pandas will just know how to match up the indices and add up the corresponding elements.
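
A minimal sketch of that index matching, again using hypothetical values: the second sum reverses the row order of one operand to suggest that pandas pairs values by index label rather than by position.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative only.
df = pd.DataFrame(
    {"py-score": [88.0, 79.0, 81.0], "js-score": [71.0, 95.0, 88.0]},
    index=[101, 102, 103],
)

# Both columns share the DataFrame's index, so rows pair up label by label.
combined = df["js-score"] + df["py-score"]

# Reversing one operand changes its row order but not its labels,
# so each label still receives the sum of its own two scores.
reordered = df["js-score"] + df["py-score"].iloc[::-1]

print(combined)
print(reordered.loc[101])
```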

**01:10**
Now, we can use these basic arithmetic operations on columns to create new columns, say, by taking some sort of linear combination of the existing columns.

**01:22**
So, for example, let’s suppose that I take the `'js-score'` and the `'py-score'`, and I also want to add in the `'django-score'`. Maybe here, the idea is that we want to find some sort of weighted average, and maybe the `py-score` is going to be worth, say, 40% of the average.

**01:41**
And then maybe the other two scores are each going to be worth `0.3`, so we can bring in those numbers and these multiplication operations. This is going to give us a new Series. And maybe what we want to do is save the Series as a column in our DataFrame, and that would give us a total score for our job candidates.

**02:02**
Let’s create a new column, call it `'total'`, and we’ll create it using this arithmetic operation. So let’s run that, and then let’s take a look at our DataFrame, and so now we’ve got this sort of total score based on all of the columns in the DataFrame relating to the score for each of the candidates.
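
As a rough sketch of this step, the weighted combination can be assigned straight to a new column label. The score values below are hypothetical; the weights (`0.4`, `0.3`, `0.3`) and the column names are the ones mentioned in the lesson.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative only.
df = pd.DataFrame(
    {
        "py-score": [88.0, 79.0, 81.0],
        "django-score": [86.0, 81.0, 78.0],
        "js-score": [71.0, 95.0, 88.0],
    },
    index=[101, 102, 103],
)

# Linear combination of three columns, stored as a new 'total' column.
df["total"] = (
    0.4 * df["py-score"] + 0.3 * df["django-score"] + 0.3 * df["js-score"]
)

print(df)
```

Assigning to a label that does not exist yet adds the column to the right of the existing ones.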

**02:23**
Now, in addition to using just the basic arithmetic operations, you can also apply most NumPy and SciPy routines to pandas `Series` and `DataFrame` objects. So, let me show you another way that we could have done this.

**02:37**
I’m going to create a pandas `Series` object, and I’m going to call it `wgts` (weights). This is going to be basically keeping track of the weights of the individual tests, and that’ll give us another way to compute this `total` column.

**02:52**
So again, we have the data, the data is going to be, say, `0.4`, `0.3`, and `0.3`, and the index is going to be… Well, we want the `0.4` to be for the `'py-score'`.

**03:05**
And then for the other scores, we want those to be the ones with `0.3`.

**03:14**
All right, let’s take a look at that.

**03:17**
The index of this `Series` object is exactly the same as the column labels that we want to work with. So what we can easily do is simply pull out of the DataFrame the columns that we want. And these columns are the ones from the index of `wgts`, right?

**03:39**
The index of `wgts` is going to be `'py-score'`, `'django-score'`, and `'js-score'`. So just for you to see that, we get those score columns.

**03:48**
And then if we just multiply this by the `wgts` Series, pandas knows that what we want to do here is take the `py-score` value in the `wgts` `Series` object and multiply the column of `py-score` values by it, and similarly for the `django-score` and the `js-score`. And so this creates a DataFrame.
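
The weighting step might look like the sketch below. The `wgts` name, its values, and its index labels follow the lesson; the DataFrame contents, including the extra `name` column, are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative only.
df = pd.DataFrame(
    {
        "name": ["A", "B", "C"],
        "py-score": [88.0, 79.0, 81.0],
        "django-score": [86.0, 81.0, 78.0],
        "js-score": [71.0, 95.0, 88.0],
    },
    index=[101, 102, 103],
)

# One weight per score column, labeled with the matching column name.
wgts = pd.Series(
    data=[0.4, 0.3, 0.3],
    index=["py-score", "django-score", "js-score"],
)

# Select only the columns named in the index of wgts.
scores = df[wgts.index]

# The Series index is matched against the column labels, so every
# column is scaled by the weight that carries its name.
weighted = scores * wgts

print(weighted)
```

Selecting `df[wgts.index]` first keeps non-numeric columns such as `name` out of the multiplication.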

**04:09**
Then what we want to do is use the `sum()` function in NumPy. Maybe we should import `numpy` first, so let’s go `import numpy as np`. In here, what we want to do is we want to take the `np.sum()` function.

**04:27**
And by default, this is going to sum along each individual column. So what we’re going to get here are three values for the `py-score`, the `django-score`, and the `js-score`.

**04:38**
So in other words, we fix a column, and this is going to add up along the rows once we fix a column. So let me just show you that. We got that. Let me move this over here so that we’re not getting this exact same line.

**04:53**
Let’s just run that here. Now, if you instead want to sum along the rows (in other words, you fix a row and you sum the entries of that row), you’ll want to pass in a value of `1` to the `axis` parameter.

**05:07**
So you’re basically saying “Sum along the columns,” right? We want to fix a row, sum it along the columns. That gives us, then, the total score in another way. And if we compare that over here, we’ve got `50.6` for `Xavier`, and we’ve got `67`, and so on, and that’s exactly what we are getting over here.

**05:30**
So, this would give us another way to define, or to create, that `total` column in the DataFrame by combining the fact that we can multiply `Series` objects with `DataFrame` objects and use any of the NumPy basic routines on DataFrames.

**05:49**
That gives us the exact same thing as we had before.

**05:56**
All right! So, these are a few of the many things that you can do in pandas by combining basic arithmetic operations and some of the built-in NumPy routines on pandas Series and pandas DataFrames to use them to possibly create new columns in your DataFrame. All right, up next, we’ll take a look at sorting a pandas DataFrame.

**Anonymous** on Sept. 30, 2021

A minimal example works fine for me:

```
df = pd.DataFrame({'c1': [5,8,0], 'c2': [1,2,3]})
df['quotient'] = df['c1']/df['c2']
```

I get

```
>>> df
c1 c2 quotient
0 5 1 5.0
1 8 2 4.0
2 0 3 0.0
```

Maybe you did something more complicated. Anyway, here’s a great explanation of the dreaded SettingWithCopyWarning (and it *is* just a warning, not an error): [www.dataquest.io/blog/settingwithcopywarning/]

**hwoarang09** on April 28, 2022

i want to know the difference between the two

`df['total'] = np.sum(df[wgts.index] * wgts, axis=1)`

`df['total2'] = (df*wgts).sum(axis=1)`

should i use np.sum???? what is the possible error of second code?? thanks!


**BadgerPaul** on Aug. 29, 2021

I am trying to perform an arithmetic operation similar to that in the Creating Columns With Arithmetic Operations and NumPy lesson (at minute 2:25). In my case, I am dividing one column of my DataFrame by another column in the same DataFrame:

`per_capita["Deaths per 100,000 Pop"] = per_capita["Total Deaths"] / per_capita["Population"]`

While the requested calculations are made and a new column is created, I receive the following error statement:

“C:\Users\paulm\anaconda3\envs\pandas_playground\lib\site-packages\pandas\core\frame.py:3607: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: (pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy) self._set_item(key, value)”

I am not experienced enough to understand what I did in error or what the error statement is requesting that I do to correct that error. I am using Pandas and Jupyter Notebook. Thoughts? Advice?