# Math and Statistics Functions

In this lesson, you’ll learn about new and improved `math` and `statistics` functions in Python 3.8. Python 3.8 brings many improvements to existing standard library packages and modules. The `math` module in the standard library has a few new functions. `math.prod()` works similarly to the built-in `sum()`, but for multiplicative products:

```
>>> import math
>>> math.prod((2, 8, 7, 7))
784
>>> 2 * 8 * 7 * 7
784
```

The two statements are equivalent. `prod()` will be easier to use when you already have the factors stored in an iterable.
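As a small sketch of that point, the snippet below multiplies factors that are already collected in a list. The variable names are illustrative only. The optional `start` keyword is the initial value that `math.prod()` multiplies into the result.

```python
import math

# Hypothetical factors, already collected in a list
factors = [2, 8, 7, 7]

# Multiply all items of the iterable
product = math.prod(factors)

# An empty iterable gives the neutral element of multiplication
empty_product = math.prod([])

# The optional start value is multiplied into the result
scaled_product = math.prod(factors, start=10)

print(product, empty_product, scaled_product)  # 784 1 7840
```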

Another new function is `math.isqrt()`. You can use `isqrt()` to find the integer part of square roots:

```
>>> import math
>>> math.isqrt(9)
3
>>> math.sqrt(9)
3.0
>>> math.isqrt(15)
3
>>> math.sqrt(15)
3.872983346207417
```

The square root of 9 is 3. You can see that `isqrt()` returns an `int` result, while `math.sqrt()` always returns a `float`. The square root of 15 is almost 3.9. Note that `isqrt()` rounds the answer down to the nearest integer, in this case `3`, rather than rounding to the closest one.
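One way to check the rounding-down behavior is to test the defining property of a floor square root. The sketch below uses an illustrative large integer that sits just below a perfect square. For values this large, converting through `math.sqrt()` may lose precision, so the float-based result is printed for comparison without assuming what it will be.

```python
import math

# A large integer just below a perfect square (illustrative value)
n = (10**15 + 1) ** 2 - 1

# isqrt() works on exact integers and returns the floor of the square root
root = math.isqrt(n)

# A floor square root r satisfies r*r <= n < (r+1)*(r+1)
is_floor = root * root <= n < (root + 1) * (root + 1)

# Going through a float may lose precision for integers this large
float_root = int(math.sqrt(n))

print(root, is_floor, float_root)
```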

Finally, you can now more easily work with *n*-dimensional points and vectors in the standard library. You can find the distance between two points with `math.dist()`, and the length of a vector with `math.hypot()`:

```
>>> import math
>>> point_1 = (16, 25, 20)
>>> point_2 = (8, 15, 14)
>>> math.dist(point_1, point_2)
14.142135623730951
>>> math.hypot(*point_1)
35.79106033634656
>>> math.hypot(*point_2)
22.02271554554524
```

This makes it easier to work with points and vectors using the standard library. However, if you will be doing many calculations on points or vectors, you should check out NumPy.
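The two functions are closely related: the distance between two points is the length of the vector that connects them. A minimal sketch of that relationship, reusing the points from the example above (the name `difference` is illustrative):

```python
import math

point_1 = (16, 25, 20)
point_2 = (8, 15, 14)

# Vector pointing from point_2 to point_1
difference = [a - b for a, b in zip(point_1, point_2)]

# The length of the difference vector equals the distance between the points
length = math.hypot(*difference)
distance = math.dist(point_1, point_2)

print(difference)                      # [8, 10, 6]
print(math.isclose(length, distance))  # True
```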

The `statistics` module also has several new functions:

- `statistics.fmean()` calculates the mean of `float` numbers.
- `statistics.geometric_mean()` calculates the geometric mean of `float` numbers.
- `statistics.multimode()` finds the most frequently occurring values in a sequence.
- `statistics.quantiles()` calculates cut points for dividing data into *n* continuous intervals with equal probability.

The following example shows the functions in use:

```
>>> import statistics
>>> data = [9, 3, 2, 1, 1, 2, 7, 9]
>>> statistics.fmean(data)
4.25
>>> statistics.geometric_mean(data)
3.013668912157617
>>> statistics.multimode(data)
[9, 2, 1]
>>> statistics.quantiles(data, n=4)
[1.25, 2.5, 8.5]
```
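To make two of these results less opaque, the sketch below recomputes the geometric mean by hand, as the exponential of the arithmetic mean of the logarithms, and checks that dividing data into *n* intervals yields *n* - 1 cut points. The names `manual_gmean` and `deciles` are illustrative.

```python
import math
import statistics

data = [9, 3, 2, 1, 1, 2, 7, 9]

# Geometric mean: exponential of the arithmetic mean of the logarithms
manual_gmean = math.exp(statistics.fmean([math.log(x) for x in data]))
gmean = statistics.geometric_mean(data)
print(math.isclose(gmean, manual_gmean))  # True

# Dividing the data into n intervals needs n - 1 cut points
deciles = statistics.quantiles(data, n=10)
print(len(deciles))  # 9
```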

In Python 3.8, there is a new `statistics.NormalDist` class that makes it more convenient to work with the Gaussian normal distribution. To see an example of using `NormalDist`, you can try to compare the speed of the new `statistics.fmean()` and the traditional `statistics.mean()`:

```
>>> import random
>>> import statistics
>>> from timeit import timeit
>>> # Create 10,000 random numbers
>>> data = [random.random() for _ in range(10_000)]
>>> # Measure the time it takes to run mean() and fmean()
>>> t_mean = [timeit("statistics.mean(data)", number=100, globals=globals())
... for _ in range(30)]
>>> t_fmean = [timeit("statistics.fmean(data)", number=100, globals=globals())
... for _ in range(30)]
>>> # Create NormalDist objects based on the sampled timings
>>> n_mean = statistics.NormalDist.from_samples(t_mean)
>>> n_fmean = statistics.NormalDist.from_samples(t_fmean)
>>> # Look at sample mean and standard deviation
>>> n_mean.mean, n_mean.stdev
(0.825690647733245, 0.07788573997674526)
>>> n_fmean.mean, n_fmean.stdev
(0.010488564966666065, 0.0008572332785645231)
>>> # Calculate the lower 1 percentile of mean
>>> n_mean.quantiles(n=100)[0]
0.6445013221202459
```

In this example, you use `timeit` to measure the execution time of `mean()` and `fmean()`. To get reliable results, you let `timeit` execute each function 100 times, and collect 30 such time samples for each function. Based on these samples, you create two `NormalDist` objects. Note that if you run the code yourself, it might take up to a minute to collect the different time samples.

`NormalDist` has many convenient attributes and methods. See the documentation for a complete list. Inspecting `.mean` and `.stdev`, you see that the old `statistics.mean()` takes 0.826 ± 0.078 seconds for 100 calls, while the new `statistics.fmean()` takes 0.0105 ± 0.0009 seconds. In other words, `fmean()` is about 80 times faster for these data.
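A few of those `NormalDist` methods can be tried on a small, deterministic data set. The sketch below is only an illustration: the sample values are made up, and the second distribution is shifted by ten standard deviations to mimic two timing distributions that barely overlap.

```python
import math
from statistics import NormalDist

# Hypothetical measurements, chosen only for illustration
samples = [1.0, 2.0, 3.0, 4.0, 5.0]
dist = NormalDist.from_samples(samples)

# from_samples() estimates the mean and sample standard deviation
print(dist.mean, dist.stdev)

# Half of the probability mass lies at or below the mean
p_below_mean = dist.cdf(dist.mean)

# inv_cdf() is the inverse of cdf(), so 0.5 maps back to the mean
median = dist.inv_cdf(0.5)

# overlap() measures agreement between two distributions (0.0 to 1.0)
shifted = NormalDist(mu=dist.mean + 10 * dist.stdev, sigma=dist.stdev)
overlap = dist.overlap(shifted)

print(p_below_mean, median, overlap)
```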

If you need more advanced statistics in Python than the standard library offers, check out `statsmodels` and `scipy.stats`.

**00:00**
This video covers the new and improved `math` and `statistics` functions. So, what’s new in the standard library for `math`?

**00:08**
`math.prod()` returns the product of an iterable of numbers, so kind of like a `sum()` where it would add all the items of the iterable together, but for multiplication. Go ahead and import `math`.

**00:23**
To use `prod()`, it’s going to take an iterable and it’s going to calculate the product of all the elements in the input iterable. So in this case, you can give it a tuple of numbers here. It would be the same as multiplying all these items together. Those two statements would be equivalent.

**00:44**
What’s nice about `prod()` is it will be easier for you to use when the factors you have are stored in an iterable. The `isqrt()` (integer square root) function returns the integer part of square roots.

**00:58**
Take note: it truncates the answer down to the next integer. It’s not doing a rounding. The `isqrt()` is going to return the integer part for the square root.

**01:11**
So in this case, returning the integer `3`, it would be the same for the square root of `9`, though the `sqrt()` (square root) function returns a `float`. If you were to take the square root, instead, of `15`, here you’re going to get a very different answer than using the standard `sqrt()` function. And again, it’s returning *just* the integer part of it and completely truncating the answer down, not rounding.

**01:37**
`dist()` computes the Euclidean distance between two points. Create a couple of points, make these three dimensional. Okay. `point_1` and `point_2`.

**01:52**
So what’s the distance (it’s going to find the Euclidean distance) between these two points? Putting in `point_1` and `point_2`, giving you the distance as a float. And `hypot()`, which stands for *hypotenuse*, returns the Euclidean norm.

**02:11**
This was expanded in this version to handle multiple dimensions. Previously it was only two-dimensional. Now for `hypot()`, as I was noting earlier, `hypot()` is now multidimensional.

**02:23**
You start with an asterisk (`*`) and then put in the coordinates. So in this case, `*point_1`, and again, you could try it again with `*point_2`.

**02:37**
And note that if you’re working with many points or vectors, you really should check out NumPy. What’s new in the `statistics` module? `statistics.fmean()` calculates the mean of `float` numbers.

**02:57**
Go ahead and import `statistics`, create some data in a list.

**03:07**
`fmean()` is going to convert all the data to floats and then compute the mean. Go ahead and give it the list of `data`. The mean would be `4.25`.

**03:20**
`statistics.geometric_mean()` calculates the geometric mean for `float` numbers.

**03:30**
For `geometric_mean()` it’s going to convert the data to floats and then compute the geometric mean instead. So again, give it the list of `data`.

**03:42**
`multimode()` finds the most frequently occurring values in a sequence. So here’s `multimode()`. It’s going to return a list of the most frequently occurring values.

**03:56**
So putting in `data`, you can see here that `9`, `2`, and `1` are all returned because they all occur two times.

**04:09**
`quantiles()` calculates cut points for dividing data into *n* continuous intervals with equal probability.

**04:22**
For `quantiles()`, it’s going to divide the data up into *n* continuous intervals. We’re entering in `data, n=4`, so it’ll divide it up into `4`, in this case, quartiles.

**04:35**
And it returns the three cut points for the four segments.

**04:44**
And there’s a new normal distribution class, `NormalDist`, that makes it more convenient to work with the Gaussian normal distribution. There’s another example in the text below this video that you can use to try out this new `NormalDist` class.

**05:01**
If you’re doing a lot with statistics, you may have a need to do something more advanced than what the standard library offers. In that case, you may want to check out these packages: `statsmodels` and `scipy.stats`.

**05:17**
In the next video, I’ll show you the new warnings about dangerous syntax.
