Python’s reduce() is a function that implements a mathematical technique called folding or reduction. reduce() is useful when you need to apply a function to an iterable and reduce it to a single cumulative value. Python’s reduce() is popular among developers with a functional programming background, but Python has more to offer.
In this tutorial, you’ll cover how reduce() works and how to use it effectively. You’ll also cover some alternative Python tools that can be more Pythonic, readable, and efficient than reduce().
In this tutorial, you’ll learn:
- How Python’s reduce() works
- What the more common reduction use cases are
- How to solve these use cases using reduce()
- What alternative Python tools are available to solve these same use cases
With this knowledge, you’ll be able to decide which tools to use when it comes to solving reduction or folding problems in Python.
For a better understanding of Python’s reduce(), it would be helpful to have some previous knowledge of how to work with Python iterables, especially how to loop over them using a for loop.
Exploring Functional Programming in Python
Functional programming is a programming paradigm based on breaking down a problem into a set of individual functions. Ideally, every function only takes a set of input arguments and produces an output.
In functional programming, functions don’t have any internal state that affects the output that they produce for a given input. This means that anytime you call a function with the same set of input arguments, you’ll get the same result or output.
In a functional program, input data flows through a set of functions. Each function operates on its input and produces some output. Functional programming tries to avoid mutable data types and state changes as much as possible. It works with the data that flow between functions.
Other core features of functional programming include the following:
- The use of recursion rather than loops or other structures as a primary flow control structure
- A focus on lists or arrays processing
- A focus on what is to be computed rather than on how to compute it
- The use of pure functions that avoid side effects
- The use of higher-order functions
There are several important concepts in this list. Here’s a closer look at some of them:
- Recursion is a technique in which functions call themselves, either directly or indirectly, in order to loop. It allows a program to loop over data structures that have unknown or unpredictable lengths.
- Pure functions are functions that have no side effects at all. In other words, they’re functions that do not update or modify any global variable, object, or data structure in the program. These functions produce an output that depends only on the input, which is closer to the concept of a mathematical function.
- Higher-order functions are functions that operate on other functions by taking functions as arguments, returning functions, or both, as with Python decorators.
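The three concepts above can be illustrated with a minimal sketch. The names square(), apply_twice(), and recursive_sum() are hypothetical helpers made up for this illustration, not part of any library:

```python
from typing import Callable, Sequence


def square(x: int) -> int:
    """Pure function: the result depends only on the argument."""
    return x * x


def apply_twice(func: Callable[[int], int], value: int) -> int:
    """Higher-order function: takes another function as an argument."""
    return func(func(value))


def recursive_sum(numbers: Sequence[int]) -> int:
    """Recursion: the function calls itself instead of using a loop."""
    if not numbers:
        return 0
    return numbers[0] + recursive_sum(numbers[1:])


print(apply_twice(square, 3))       # 81
print(recursive_sum([1, 2, 3, 4]))  # 10
```

Note that recursive_sum() copies a slice on every call, so it's only meant to show the idea, not to be an efficient way of summing values.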
Since Python is a multi-paradigm programming language, it provides some tools that support a functional programming style:
- Functions as first-class objects
- Recursion capabilities
- Anonymous functions with lambda
- Iterators and generators
- Standard modules like functools and itertools
- Tools like map(), filter(), reduce(), sum(), len(), any(), all(), min(), max(), and so on
Even though Python isn’t heavily influenced by functional programming languages, back in 1993 there was a clear demand for some of the functional programming features listed above.
In response, several functional tools were added to the language. According to Guido van Rossum, they were contributed by a community member:
Python acquired lambda, reduce(), filter() and map(), courtesy of (I believe) a Lisp hacker who missed them and submitted working patches. (Source)
Over the years, new features such as list comprehensions, generator expressions, and built-in functions like sum(), min(), max(), all(), and any() were viewed as Pythonic replacements for map(), filter(), and reduce(). Guido planned to remove map(), filter(), reduce(), and even lambda from the language in Python 3.
Luckily, this removal didn’t take effect, mainly because the Python community didn’t want to let go of such popular features. They’re still around and still widely used among developers with a strong functional programming background.
In this tutorial, you’ll cover how to use Python’s reduce()
to process iterables and reduce them to a single cumulative value without using a for
loop. You’ll also learn about some Python tools that you can use in place of reduce()
to make your code more Pythonic, readable, and efficient.
Getting Started With Python’s reduce()
Python’s reduce() implements a mathematical technique commonly known as folding or reduction. You’re doing a fold or reduction when you reduce a list of items to a single cumulative value. Python’s reduce() operates on any iterable, not just lists, and performs the following steps:
- Apply a function (or callable) to the first two items in an iterable and generate a partial result.
- Use that partial result, together with the third item in the iterable, to generate another partial result.
- Repeat the process until the iterable is exhausted and then return a single cumulative value.
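One way to watch these steps happen is to compare reduce() with itertools.accumulate(), which applies the same two-argument function but yields every partial result instead of only the last one. This is a small sketch that uses operator.add as the function:

```python
from functools import reduce
from itertools import accumulate
from operator import add

numbers = [1, 2, 3, 4]

# accumulate() yields every partial result; reduce() returns only the last.
partial_results = list(accumulate(numbers, add))
final_value = reduce(add, numbers)

print(partial_results)  # [1, 3, 6, 10]
print(final_value)      # 10
```

The first value that accumulate() yields is the first item itself, and the last value matches what reduce() returns.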
The idea behind Python’s reduce() is to take an existing function, apply it cumulatively to all the items in an iterable, and generate a single final value. In general, Python’s reduce() is handy for processing iterables without writing explicit for loops. Since reduce() is written in C, its internal loop can be faster than an explicit Python for loop.
Python’s reduce() was originally a built-in function (and still is in Python 2.x), but it was moved to functools.reduce() in Python 3.0. This decision was based on some possible performance and readability issues.
Another reason for moving reduce() to functools was the introduction of built-in functions like sum(), any(), all(), max(), min(), and len(), which provide more efficient, readable, and Pythonic ways of tackling common use cases for reduce(). You’ll learn how to use them in place of reduce() later in the tutorial.
In Python 3.x, if you need to use reduce(), then you first have to import the function into your current scope using an import statement in one of the following ways:
- import functools and then use fully-qualified names like functools.reduce().
- from functools import reduce and then call reduce() directly.
According to the documentation for reduce()
, the function has the following signature:
functools.reduce(function, iterable[, initializer])
The Python documentation also states that reduce()
is roughly equivalent to the following Python function:
def reduce(function, iterable, initializer=None):
    it = iter(iterable)
    if initializer is None:
        value = next(it)
    else:
        value = initializer
    for element in it:
        value = function(value, element)
    return value
Like this Python function, reduce() works by applying a two-argument function to the items of iterable in a loop from left to right, ultimately reducing iterable to a single cumulative value.
Python’s reduce() also accepts a third and optional argument called initializer that provides a seed value to the computation or reduction.
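The roughly equivalent function shown above uses None to mean "no initializer," so it can't tell a missing initializer from an explicit None, and it raises StopIteration rather than TypeError on an empty iterable. The following is a sketch of one way to handle both cases with a sentinel object. The names my_reduce() and _MISSING are assumptions made for this illustration, and the code isn't the actual implementation of functools.reduce():

```python
from functools import reduce
from operator import add

_MISSING = object()  # hypothetical sentinel meaning "no initializer given"


def my_reduce(function, iterable, initializer=_MISSING):
    """Pure-Python sketch of a left fold; not the real implementation."""
    it = iter(iterable)
    if initializer is _MISSING:
        try:
            value = next(it)
        except StopIteration:
            raise TypeError(
                "my_reduce() of empty iterable with no initial value"
            ) from None
    else:
        value = initializer
    for element in it:
        value = function(value, element)
    return value


print(my_reduce(add, [1, 2, 3, 4]))       # 10
print(my_reduce(add, [1, 2, 3, 4], 100))  # 110
print(my_reduce(add, [], None))           # None
```

With the sentinel in place, an explicit None is treated like any other seed value.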
In the next two sections, you’ll take an in-depth look at how Python’s reduce()
works and the meaning behind each of its arguments.
The Required Arguments: function and iterable
The first argument to Python’s reduce()
is a two-argument function conveniently called function
. This function will be applied to the items in an iterable to cumulatively compute a final value.
Even though the official documentation refers to the first argument of reduce()
as “a function of two arguments,” you can pass any Python callable to reduce()
as long as the callable accepts two arguments. Callable objects include classes, instances that implement a special method called __call__()
, instance methods, class methods, static methods, and functions.
Note: For more details about Python callable objects, you can check out the Python documentation and scroll down to “Callable types.”
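To make the point about callables concrete, here’s a small sketch that passes a callable instance, a static method, and a bound instance method to reduce(). The classes Multiplier and Joiner are hypothetical and exist only for this illustration:

```python
from functools import reduce


class Multiplier:
    """Hypothetical class whose instances are callable via __call__()."""

    def __call__(self, a, b):
        return a * b


class Joiner:
    """Hypothetical class exposing a static method and an instance method."""

    def __init__(self, separator):
        self.separator = separator

    @staticmethod
    def concat(a, b):
        return a + b

    def join(self, a, b):
        return f"{a}{self.separator}{b}"


print(reduce(Multiplier(), [1, 2, 3, 4]))         # 24
print(reduce(Joiner.concat, ["a", "b", "c"]))     # abc
print(reduce(Joiner("-").join, ["a", "b", "c"]))  # a-b-c
```

In each case, reduce() only requires that the object can be called with two positional arguments.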
The second required argument, iterable
, will accept any Python iterable, as its name suggests. This includes lists, tuples, range
objects, generators, iterators, sets, dictionary keys and values, and any other Python objects that you can iterate over.
Note: If you pass an iterator to Python’s reduce()
, then the function will need to exhaust the iterator before you can get a final value. So, the iterator at hand won’t remain lazy.
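The following sketch shows what exhausting an iterator looks like in practice. It assumes a simple generator expression as the input:

```python
from functools import reduce
from operator import add

squares = (n * n for n in range(5))  # lazy generator: 0, 1, 4, 9, 16

total = reduce(add, squares)
leftover = list(squares)  # nothing left: reduce() consumed every item

print(total)     # 30
print(leftover)  # []
```

If you need the items again after the reduction, then you’ll have to recreate the generator or store the items in a list first.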
To understand how reduce()
works, you’re going to write a function that computes the sum of two numbers and prints the equivalent math operation to the screen. Here’s the code:
>>> def my_add(a, b):
...     result = a + b
...     print(f"{a} + {b} = {result}")
...     return result
...
This function calculates the sum of a
and b
, prints a message with the operation using an f-string, and returns the result of the computation. Here’s how it works:
>>> my_add(5, 5)
5 + 5 = 10
10
my_add() is a two-argument function, so you can pass it to Python’s reduce() along with an iterable to compute the cumulative sum of the items in the iterable. Check out the following code that uses a list of numbers:
>>> from functools import reduce
>>> numbers = [0, 1, 2, 3, 4]
>>> reduce(my_add, numbers)
0 + 1 = 1
1 + 2 = 3
3 + 3 = 6
6 + 4 = 10
10
When you call reduce()
, passing my_add()
and numbers
as arguments, you get an output that shows all the operations that reduce()
performs to come up with a final result of 10
. In this case, the operations are equivalent to ((((0 + 1) + 2) + 3) + 4) = 10
.
The call to reduce() in the above example applies my_add() to the first two items in numbers (0 and 1) and gets 1 as the result. Then reduce() calls my_add() using 1 and the next item in numbers (which is 2) as arguments, getting 3 as the result. The process is repeated until numbers runs out of items and reduce() returns a final result of 10.
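If you want to see the left-to-right grouping itself, then one option is to reduce with a function that builds a string instead of adding numbers. This is only a sketch, and show_grouping() is a hypothetical helper:

```python
from functools import reduce

numbers = [0, 1, 2, 3, 4]


def show_grouping(accumulated, item):
    """Hypothetical helper that records the grouping instead of adding."""
    return f"({accumulated} + {item})"


expression = reduce(show_grouping, numbers)
print(expression)  # ((((0 + 1) + 2) + 3) + 4)
```

The innermost parentheses hold the first call, and each later call wraps the previous partial result.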
The Optional Argument: initializer
The third argument to Python’s reduce()
, called initializer
, is optional. If you supply a value to initializer
, then reduce()
will feed it to the first call of function
as its first argument.
This means that the first call to function
will use the value of initializer
and the first item of iterable
to perform its first partial computation. After this, reduce()
continues working with the subsequent items of iterable
.
Here’s an example in which you use my_add()
with initializer
set to 100
:
>>> from functools import reduce
>>> numbers = [0, 1, 2, 3, 4]
>>> reduce(my_add, numbers, 100)
100 + 0 = 100
100 + 1 = 101
101 + 2 = 103
103 + 3 = 106
106 + 4 = 110
110
Since you supply a value of 100
to initializer
, Python’s reduce()
uses that value in the first call as the first argument to my_add()
. Note that in the first iteration, my_add()
uses 100
and 0
, which is the first item of numbers
, to perform the calculation 100 + 0 = 100
.
Another point to note is that, if you supply a value to initializer, then reduce() will perform one more iteration than it would without an initializer.
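You can check the extra iteration by counting the calls. The sketch below uses recording_add(), a hypothetical variant of my_add() that stores its arguments in a list instead of printing them:

```python
from functools import reduce

calls = []  # records each (a, b) pair passed to the function


def recording_add(a, b):
    """Hypothetical variant of my_add() that logs its arguments."""
    calls.append((a, b))
    return a + b


numbers = [0, 1, 2, 3, 4]

reduce(recording_add, numbers)
calls_without_initializer = len(calls)

calls.clear()
reduce(recording_add, numbers, 100)
calls_with_initializer = len(calls)
first_call_with_initializer = calls[0]

print(calls_without_initializer)    # 4
print(calls_with_initializer)       # 5
print(first_call_with_initializer)  # (100, 0)
```

For an iterable with n items, the function runs n - 1 times without an initializer and n times with one.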
If you’re planning to use reduce()
to process iterables that may potentially be empty, then it’s good practice to provide a value to initializer
. Python’s reduce()
will use this value as its default return value when iterable
is empty. If you don’t provide an initializer
value, then reduce()
will raise a TypeError
. Take a look at the following example:
>>> from functools import reduce
>>> # Using an initializer value
>>> reduce(my_add, [], 0) # Use 0 as return value
0
>>> # Using no initializer value
>>> reduce(my_add, []) # Raise a TypeError with an empty iterable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: reduce() of empty sequence with no initial value
If you call reduce()
with an empty iterable
, then the function will return the value supplied to initializer
. If you don’t supply an initializer
, then reduce()
will raise a TypeError
when processing empty iterables.
Note: To dive deeper into what the Python traceback is, check out Understanding the Python Traceback.
Now that you’re familiar with how reduce()
works, you’re ready to learn how to apply it to some common programming problems.
Reducing Iterables With Python’s reduce()
So far, you’ve learned how Python’s reduce()
works and how to use it to reduce iterables using a user-defined function. You also learned the meaning of each argument to reduce()
and how they work.
In this section, you’ll look at some common use cases for reduce()
and how to solve them using the function. You’ll also learn about some alternative Python tools that you can use in place of reduce()
to make your code more Pythonic, efficient, and readable.
Summing Numeric Values
The "Hello, World!"
of Python’s reduce()
is the sum use case. It involves calculating the cumulative sum of a list of numbers. Say you have a list of numbers like [1, 2, 3, 4]
. Its sum will be 1 + 2 + 3 + 4 = 10
. Here’s a quick example of how to solve this problem using a Python for
loop:
>>> numbers = [1, 2, 3, 4]
>>> total = 0
>>> for num in numbers:
...     total += num
...
>>> total
10
The for
loop iterates over every value in numbers
and accumulates them in total
. The final result is the sum of all the values, which in this example is 10
. A variable used like total
in this example is sometimes called an accumulator.
This is arguably the most common use case for Python’s reduce()
. To implement this operation with reduce()
, you have several options. Some of them include using reduce()
with one of the following functions:
- A user-defined function
- A lambda function
- A function called operator.add()
To use a user-defined function, you need to code a function that adds two numbers. Then you can use that function with reduce()
. For this example, you can rewrite my_add()
as follows:
>>> def my_add(a, b):
...     return a + b
...
>>> my_add(1, 2)
3
my_add()
adds two numbers, a
and b
, and returns the result. With my_add()
in place, you can use reduce()
to calculate the sum of the values in a Python iterable. Here’s how:
>>> from functools import reduce
>>> numbers = [1, 2, 3, 4]
>>> reduce(my_add, numbers)
10
The call to reduce()
applies my_add()
to the items in numbers
to compute their cumulative sum. The final result is 10
, as expected.
You can also perform the same computation by using a lambda
function. In this case, you need a lambda
function that takes two numbers as arguments and returns their sum. Take a look at the following example:
>>> from functools import reduce
>>> numbers = [1, 2, 3, 4]
>>> reduce(lambda a, b: a + b, numbers)
10
The lambda
function takes two arguments and returns their sum. reduce()
applies the lambda
function in a loop to compute the cumulative sum of the items in numbers
.
Likewise, you can take advantage of Python’s operator
module. This module exports a bunch of functions that correspond to Python’s intrinsic operators. For the problem at hand, you can use operator.add()
along with Python’s reduce()
. Check out the following example:
>>> from operator import add
>>> from functools import reduce
>>> add(1, 2)
3
>>> numbers = [1, 2, 3, 4]
>>> reduce(add, numbers)
10
In this example, add()
takes two arguments and returns their sum. So, you can use add()
with reduce()
to compute the sum of all the items of numbers
. Since add()
is written in C and optimized for efficiency, it may be your best choice when using reduce()
for solving the sum use case. Note that the use of operator.add()
is also more readable than using a lambda
function.
The sum use case is so common in programming that Python, since version 2.3, has included a dedicated built-in function, sum(), to solve it. sum() is declared as sum(iterable[, start]).
start
is an optional argument to sum()
and defaults to 0
. The function adds the value of start
to the items of iterable
from left to right and returns the total. Take a look at the following example:
>>> numbers = [1, 2, 3, 4]
>>> sum(numbers)
10
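The start argument plays a role similar to the initializer of reduce(). Here’s a brief sketch of how it affects the result, including the empty-iterable case:

```python
numbers = [1, 2, 3, 4]

print(sum(numbers, 100))  # 110: start is added to the total
print(sum([], 100))       # 100: start alone is returned for an empty iterable
print(sum([]))            # 0: the default start value
```

Unlike reduce() without an initializer, sum() doesn’t raise an error on an empty iterable because start always has a value.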
Since sum()
is a built-in function, you don’t need to import anything. It’s always available for you. Using sum()
is the most Pythonic way of solving the sum use case. It’s clean, readable, and concise. It follows a core Python principle:
Simple is better than complex. (Source)
The addition of sum()
to the language was a big win in terms of readability and performance as compared to using reduce()
or a for
loop.
Note: For more details on comparing the performance of reduce()
with the performance of other Python reduction tools, check out the section Performance is Key.
If you’re dealing with the sum use case, then good practice recommends the use of sum()
.
Multiplying Numeric Values
The product use case of Python’s reduce()
is quite similar to the sum use case, but this time the operation is multiplication. In other words, you need to calculate the product of all the values in an iterable.
For example, say you have the list [1, 2, 3, 4]
. Its product will be 1 * 2 * 3 * 4 = 24
. You can calculate this using a Python for
loop. Check out the following example:
>>> numbers = [1, 2, 3, 4]
>>> product = 1
>>> for num in numbers:
...     product *= num
...
>>> product
24
The loop iterates over the items in numbers
, multiplying each item by the result of the previous iteration. In this case, the starting value for the accumulator product
should be 1
instead of 0
. Since any number multiplied by zero is zero, a starting value of 0
will always make your product equal to 0
.
This computation is also a quite popular use case for Python’s reduce()
. Again, you’ll cover three ways for solving the problem. You’ll use reduce()
with:
- A user-defined function
- A lambda function
- A function called operator.mul()
For the first option, you’ll need to code a custom function that takes two arguments and returns their product. Then you’ll use this function with reduce() to calculate the product of the items in an iterable. Take a look at the following code:
>>> from functools import reduce
>>> def my_prod(a, b):
...     return a * b
...
>>> my_prod(1, 2)
2
>>> numbers = [1, 2, 3, 4]
>>> reduce(my_prod, numbers)
24
The function my_prod()
multiplies two numbers, a
and b
. The call to reduce()
iterates over the items of numbers
and computes their product by applying my_prod()
to successive items. The final result is the product of all the items in numbers
, which in this example is 24
.
If you prefer to use a lambda
function to solve this use case, then you need a function that takes two arguments and returns their product. Here’s an example:
>>> from functools import reduce
>>> numbers = [1, 2, 3, 4]
>>> reduce(lambda a, b: a * b, numbers)
24
The anonymous function does the magic by multiplying successive items while reduce()
iterates over numbers
. Again, the result is the product of all the items in numbers
.
You can also use operator.mul()
to tackle the product use case. operator.mul()
takes two numbers and returns the result of multiplying them. This is the right functionality for solving the problem at hand. Check out the following example:
>>> from operator import mul
>>> from functools import reduce
>>> mul(2, 2)
4
>>> numbers = [1, 2, 3, 4]
>>> reduce(mul, numbers)
24
Since mul()
is highly optimized, your code will perform better if you use this function rather than a user-defined function or a lambda
function. Note that this solution is much more readable as well.
Finally, if you’re using Python 3.8 or later, then you have access to a more Pythonic and readable solution to this use case. Python 3.8 added a new function called prod(), which lives in the Python math module. This function is analogous to sum() but returns the product of a start value multiplied by an iterable of numbers.
In the case of math.prod()
, the argument start
is optional and defaults to 1
. Here’s how it works:
>>> from math import prod
>>> numbers = [1, 2, 3, 4]
>>> prod(numbers)
24
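As a short sketch of the start argument, note that math.prod() accepts it by keyword only. The example assumes Python 3.8 or later:

```python
from math import prod

numbers = [1, 2, 3, 4]

print(prod(numbers, start=10))  # 240: start multiplies the product
print(prod([]))                 # 1: the default start for an empty iterable
```

As with sum(), the default start value means that an empty iterable doesn’t raise an error.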
This is also a big win in terms of readability and efficiency as compared to using reduce(). So, if you’re using Python 3.8 or later and product reduction is a common operation in your code, then you’ll be better served by using math.prod() rather than Python’s reduce().
Finding the Minimum and Maximum Value
The problem of finding the minimum and maximum value in an iterable is also a reduction problem that you can solve using Python’s reduce()
. The idea is to compare the items in the iterable to find the minimum or the maximum value.
Say you have the list of numbers [3, 5, 2, 4, 7, 1]
. In this list, the minimum value is 1
and the maximum value is 7
. To find these values, you can use a Python for
loop. Check out the following code:
>>> numbers = [3, 5, 2, 4, 7, 1]
>>> # Minimum
>>> min_value, *rest = numbers
>>> for num in rest:
...     if num < min_value:
...         min_value = num
...
>>> min_value
1
>>> # Maximum
>>> max_value, *rest = numbers
>>> for num in rest:
...     if num > max_value:
...         max_value = num
...
>>> max_value
7
Both loops iterate over the items in rest
and update the value of min_value
or max_value
according to the result of successive comparisons. Note that initially, min_value
and max_value
hold the number 3
, which is the first value in numbers
. The variable rest
holds the remaining values in numbers
. In other words, rest = [5, 2, 4, 7, 1]
.
Note: In the above examples, you use the Python iterable unpacking operator (*
) to unpack or expand the values in numbers
into two variables. In the first case, the net effect is that min_value
gets the first value in numbers
, which is 3
, and rest
collects the remaining values in a list.
Check out the details in the following examples:
>>> numbers = [3, 5, 2, 4, 7, 1]
>>> min_value, *rest = numbers
>>> min_value
3
>>> rest
[5, 2, 4, 7, 1]
>>> max_value, *rest = numbers
>>> max_value
3
>>> rest
[5, 2, 4, 7, 1]
The Python iterable unpacking operator (*
) is useful when you need to unpack a sequence or iterable into several variables.
For a better understanding of unpacking operations in Python, you can check out PEP 3132 Extended Iterable Unpacking and PEP 448 Additional Unpacking Generalizations.
Now, think about how you can find the minimum and maximum value in an iterable using Python’s reduce()
. Again, you can use a user-defined function or a lambda
function depending on your needs.
The following code implements a solution that uses two different user-defined functions. The first function will take two arguments, a
and b
, and return their minimum. The second function will use a similar process, but it’ll return the maximum value.
Here are the functions and how you can use them with Python’s reduce()
to find the minimum and maximum value in an iterable:
>>> from functools import reduce
>>> # Minimum
>>> def my_min_func(a, b):
...     return a if a < b else b
...
>>> # Maximum
>>> def my_max_func(a, b):
...     return a if a > b else b
...
>>> numbers = [3, 5, 2, 4, 7, 1]
>>> reduce(my_min_func, numbers)
1
>>> reduce(my_max_func, numbers)
7
When you run reduce()
with my_min_func()
and my_max_func()
, you get the minimum and maximum value in numbers
, respectively. reduce()
iterates over the items of numbers
, compares them in cumulative pairs, and finally returns the minimum or maximum value.
Note: To implement my_min_func() and my_max_func(), you used a Python conditional expression, or ternary operator, as a return value. For a deeper dive into what conditional expressions are and how they work, check out Conditional Statements in Python (if/elif/else).
You can also use a lambda
function to solve the minimum and maximum problem. Take a look at the following examples:
>>> from functools import reduce
>>> numbers = [3, 5, 2, 4, 7, 1]
>>> # Minimum
>>> reduce(lambda a, b: a if a < b else b, numbers)
1
>>> # Maximum
>>> reduce(lambda a, b: a if a > b else b, numbers)
7
This time, you use two lambda
functions that find out if a
is either less than or greater than b
. In this case, Python’s reduce()
applies the lambda
function to each value in numbers
, comparing it with the result of the previous computation. At the end of the process, you get the minimum or maximum value.
The minimum and maximum problem is so common in programming that Python has added built-in functions to perform these reductions. These functions are conveniently called min()
and max()
, and you don’t need to import anything to be able to use them. Here’s how they work:
>>> numbers = [3, 5, 2, 4, 7, 1]
>>> min(numbers)
1
>>> max(numbers)
7
When you use min()
and max()
to find the minimum and maximum item in an iterable, your code is way more readable as compared to using Python’s reduce()
. Additionally, since min()
and max()
are highly-optimized C functions, you can also say that your code will be more efficient.
So, when it comes to solving this problem in Python, it’s best to use min()
and max()
rather than reduce()
.
Checking if All Values Are True
The all-true use case of Python’s reduce()
involves finding out whether or not all the items in an iterable are true. To solve this problem, you can use reduce()
along with a user-defined function or a lambda
function.
You’ll start by coding a for
loop to find out if all the items in an iterable are true. Here’s the code:
>>> def check_all_true(iterable):
...     for item in iterable:
...         if not item:
...             return False
...     return True
...
>>> check_all_true([1, 1, 1, 1, 1])
True
>>> check_all_true([1, 1, 1, 1, 0])
False
>>> check_all_true([])
True
If all of the values in iterable
are true, then check_all_true()
returns True
. Otherwise, it returns False
. It also returns True
with empty iterables. check_all_true()
implements a short-circuit evaluation. This means that the function returns as soon as it finds a false value without processing the rest of the items in iterable
.
To solve this problem using Python’s reduce()
, you’ll need to write a function that takes two arguments and returns True
if both arguments are true. If one or both arguments are false, then the function will return False
. Here’s the code:
>>> def both_true(a, b):
...     return bool(a and b)
...
>>> both_true(1, 1)
True
>>> both_true(1, 0)
False
>>> both_true(0, 0)
False
This function takes two arguments, a
and b
. Then you use the and
operator to test if both arguments are true. The return value will be True
if both arguments are true. Otherwise, it’ll be False
.
In Python, the following objects are considered false:
- Constants like None and False
- Numeric types with a zero value like 0, 0.0, 0j, Decimal(0), and Fraction(0, 1)
- Empty sequences and collections like "", (), [], {}, set(), and range(0)
- Objects that implement __bool__() with a return value of False or __len__() with a return value of 0
Any other object will be considered true.
You need to use bool()
to convert the return value of and
into either True
or False
. If you don’t use bool()
, then your function won’t behave as expected because and
returns one of the objects in the expression instead of True
or False
. Check out the following examples:
>>> a = 0
>>> b = 1
>>> a and b
0
>>> a = 1
>>> b = 2
>>> a and b
2
and
returns the first value in the expression if it’s false. Otherwise, it returns the last value in the expression regardless of its truth value. That’s why you need to use bool()
in this case. bool()
returns the Boolean value (True
or False
) resulting from evaluating a Boolean expression or an object. Check out the examples using bool()
:
>>> a = 0
>>> b = 1
>>> bool(a and b)
False
>>> a = 1
>>> b = 2
>>> bool(a and b)
True
bool()
will always return either True
or False
after evaluating the expression or object at hand.
Note: To better understand Python operators and expressions, you can check out Operators and Expressions in Python.
You can pass both_true()
to reduce()
to check if all the items of an iterable are true or not. Here’s how this works:
>>> from functools import reduce
>>> reduce(both_true, [1, 1, 1, 1, 1])
True
>>> reduce(both_true, [1, 1, 1, 1, 0])
False
>>> reduce(both_true, [], True)
True
If you pass both_true()
as an argument to reduce()
, then you’ll get True
if all of the items in the iterable are true. Otherwise, you’ll get False
.
In the third example, you pass True
to the initializer
of reduce()
to get the same behavior as check_all_true()
and to avoid a TypeError
.
You can also use a lambda
function to solve the all-true use case of reduce()
. Here are some examples:
>>> from functools import reduce
>>> reduce(lambda a, b: bool(a and b), [0, 0, 1, 0, 0])
False
>>> reduce(lambda a, b: bool(a and b), [1, 1, 1, 2, 1])
True
>>> reduce(lambda a, b: bool(a and b), [], True)
True
This lambda
function is quite similar to both_true()
and uses the same expression as a return value. It returns True
if both arguments are true. Otherwise, it returns False
.
Note that unlike check_all_true()
, when you use reduce()
to solve the all-true use case, there’s no short-circuit evaluation because reduce()
doesn’t return until it traverses the entire iterable. This can add extra processing time to your code.
For example, say you have the list lst = [1, 0, 2, 0, 0, 1] and you need to check if all the items in lst are true. In this case, check_all_true() will finish as soon as its loop reaches the second item (0), because 0 is false. You don’t need to continue iterating because you already have an answer for the problem at hand.
On the other hand, the reduce() solution won’t finish until it processes all six items in lst, which takes five function calls. Now imagine what this would do to the performance of your code if you were processing a large iterable!
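To measure the difference, you can wrap the list in a generator that counts how many items are actually consumed. The following is a sketch, and counting() is a hypothetical helper written for this comparison:

```python
from functools import reduce


def counting(iterable, counter):
    """Hypothetical generator that counts how many items get consumed."""
    for item in iterable:
        counter["seen"] += 1
        yield item


def check_all_true(iterable):
    for item in iterable:
        if not item:
            return False
    return True


lst = [1, 0, 2, 0, 0, 1]

loop_counter = {"seen": 0}
loop_result = check_all_true(counting(lst, loop_counter))

reduce_counter = {"seen": 0}
reduce_result = reduce(
    lambda a, b: bool(a and b), counting(lst, reduce_counter)
)

print(loop_result, loop_counter["seen"])      # False 2
print(reduce_result, reduce_counter["seen"])  # False 6
```

Both approaches return the same answer, but the short-circuiting loop stops after two items while reduce() consumes all six.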
Fortunately, Python provides the right tool for solving the all-true problem in a Pythonic, readable, and efficient way: the built-in function all()
.
You can use all(iterable)
to check if all of the items in iterable
are true. Here’s how all()
works:
>>> all([1, 1, 1, 1, 1])
True
>>> all([1, 1, 1, 0, 1])
False
>>> all([])
True
all()
loops over the items in an iterable, checking the truth value of each of them. If all()
finds a false item, then it returns False
. Otherwise, it returns True
. If you call all()
with an empty iterable, then you get True
because there’s no false item in an empty iterable.
all()
is a C function that’s optimized for performance. This function is also implemented using short-circuit evaluation. So, if you’re dealing with the all-true problem in Python, then you should consider using all()
instead of reduce()
.
Checking if Any Value Is True
Another common use case for Python’s reduce()
is the any-true use case. This time, you need to find out if at least one item in an iterable is true. To solve this problem, you need to write a function that takes an iterable and returns True
if any item in the iterable is true and False
otherwise. Take a look at the following implementation for this function:
>>> def check_any_true(iterable):
...     for item in iterable:
...         if item:
...             return True
...     return False
...
>>> check_any_true([0, 0, 0, 1, 0])
True
>>> check_any_true([0, 0, 0, 0, 0])
False
>>> check_any_true([])
False
If at least one item in iterable
is true, then check_any_true()
returns True
. It returns False
only if all the items are false or if the iterable is empty. This function also implements a short-circuit evaluation because it returns as soon as it finds a true value, if any.
To solve this problem using Python’s reduce()
, you need to code a function that takes two arguments and returns True
if at least one of them is true. If both are false, then the function should return False
.
Here’s a possible implementation for this function:
>>> def any_true(a, b):
...     return bool(a or b)
...
>>> any_true(1, 0)
True
>>> any_true(0, 1)
True
>>> any_true(0, 0)
False
any_true() returns True if at least one of its arguments is true. If both arguments are false, then any_true() returns False. As with both_true() in the above section, any_true() uses bool() to convert the result of the expression a or b to either True or False.
The Python or
operator works a little differently from and
. It returns the first true object or the last object in the expression. Check out the following examples:
>>> a = 1
>>> b = 2
>>> a or b
1
>>> a = 0
>>> b = 1
>>> a or b
1
>>> a = 0
>>> b = []
>>> a or b
[]
The Python or
operator returns the first true object or, if both are false, the last object. So, you also need to use bool()
to get a coherent return value from any_true()
.
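To see why the bool() call matters inside a reduction, here is a small sketch that runs the same fold with and without the conversion. The sample list `values` is an assumption chosen so that every item is false.

```python
from functools import reduce

values = [0, "", []]

# Without bool(), the reduction hands back one of the original objects.
raw = reduce(lambda a, b: a or b, values)

# With bool(), the result is always True or False.
coerced = reduce(lambda a, b: bool(a or b), values)

print(repr(raw))  # []
print(coerced)    # False
```

The unconverted version returns the last object in the list, an empty list, rather than False.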
Once you have this function in place, you can continue with the reduction. Take a look at the following calls to reduce()
:
>>> from functools import reduce
>>> reduce(any_true, [0, 0, 0, 1, 0])
True
>>> reduce(any_true, [0, 0, 0, 0, 0])
False
>>> reduce(any_true, [], False)
False
You’ve solved the problem using Python’s reduce()
. Note that in the third example, you pass False
to the initializer of reduce()
to reproduce the behavior of the original check_any_true()
and also to avoid a TypeError
.
Note: Like the examples in the previous section, these examples of reduce()
don’t short-circuit: reduce() keeps calling the function for every remaining item, even after the result can no longer change. That can hurt the performance of your code.
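To make that cost concrete, the following sketch counts how many times the reduction function runs. `counting_any_true()` is a hypothetical variant of any_true() that logs each call, and `data` is an assumed input whose very first item is already true.

```python
from functools import reduce

calls = []


def counting_any_true(a, b):
    """Same logic as any_true(), but records every call."""
    calls.append((a, b))
    return bool(a or b)


data = [1] + [0] * 9  # Ten items, and the first one is already true

result = reduce(counting_any_true, data)

print(result)      # True
print(len(calls))  # 9
```

Even though the answer is known after the first item, reduce() still makes one call for each of the nine remaining items.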
You can also use a lambda
function with reduce()
to solve the any-true use case. Here’s how you can do it:
>>> from functools import reduce
>>> reduce(lambda a, b: bool(a or b), [0, 0, 1, 1, 0])
True
>>> reduce(lambda a, b: bool(a or b), [0, 0, 0, 0, 0])
False
>>> reduce(lambda a, b: bool(a or b), [], False)
False
This lambda
function is quite similar to any_true()
. It returns True
if either of its two arguments is true. If both arguments are false, then it returns False
.
Even though this solution takes only one line of code, it can still make your code unreadable or at least difficult to understand. Again, Python provides a tool to efficiently solve the any-true problem without using reduce()
: the built-in function any()
.
any(iterable)
loops over the items in iterable
, testing the truth value of each until it finds a true item. The function returns True
as soon as it finds a true value. If any()
doesn’t find a true value, then it returns False
. Here’s an example:
>>> any([0, 0, 0, 0, 0])
False
>>> any([0, 0, 0, 1, 0])
True
>>> any([])
False
Again, you don’t need to import any()
to use it in your code. any()
works as expected. It returns False
if all the items in the iterable are false. Otherwise, it returns True
. Note that if you call any()
with an empty iterable, then you get False
because there’s no true item in an empty iterable.
As with all()
, any()
is a C function optimized for performance. It’s also implemented using short-circuit evaluation. So, if you’re dealing with the any-true problem in Python, then consider using any()
instead of reduce()
.
Comparing reduce()
and accumulate()
A Python function called accumulate()
lives in itertools
and behaves similarly to reduce()
. accumulate(iterable[, func])
accepts one required argument, iterable
, which can be any Python iterable. The optional second argument, func
, needs to be a function (or a callable object) that takes two arguments and returns a single value.
accumulate()
returns an iterator. Each item in this iterator will be the accumulated result of the computation that func
performs. The default computation is the sum. If you don’t supply a function to accumulate()
, then each item in the resulting iterator will be the accumulated sum of the previous items in iterable
plus the item at hand.
Check out the following examples:
>>> from itertools import accumulate
>>> from operator import add
>>> from functools import reduce
>>> numbers = [1, 2, 3, 4]
>>> list(accumulate(numbers))
[1, 3, 6, 10]
>>> reduce(add, numbers)
10
Note that the last value in the resulting iterator is the same value that reduce()
returns. This is the main similarity between these two functions.
Note: Since accumulate()
returns an iterator, you need to call list()
to consume the iterator and get a list object as an output.
If, on the other hand, you supply a two-argument function (or callable) to the func
argument of accumulate()
, then the items in the resulting iterator will be the accumulated result of the computation performed by func
. Here’s an example that uses operator.mul()
:
>>> from itertools import accumulate
>>> from operator import mul
>>> from functools import reduce
>>> numbers = [1, 2, 3, 4]
>>> list(accumulate(numbers, mul))
[1, 2, 6, 24]
>>> reduce(mul, numbers)
24
In this example, you can again see that the last item in the returned value of accumulate()
is equal to the value returned by reduce()
.
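The same relationship holds when you supply a starting value. This sketch assumes Python 3.8 or later, where accumulate() accepts an `initial` keyword argument that plays the role of the initializer of reduce(). The starting value of 10 is arbitrary.

```python
from functools import reduce
from itertools import accumulate
from operator import mul

numbers = [1, 2, 3, 4]

# accumulate() yields the starting value first, then every running product.
running = list(accumulate(numbers, mul, initial=10))

# reduce() returns only the final product.
total = reduce(mul, numbers, 10)

print(running)  # [10, 10, 20, 60, 240]
print(total)    # 240
```

Note that with `initial`, the resulting iterator has one more item than the input iterable.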
Considering Performance and Readability
Python’s reduce()
can perform poorly because it calls the function you pass once for every item in the iterable, and each call adds overhead. This can make your code slow and inefficient. Using reduce()
can also compromise the readability of your code when you use it with complex user-defined functions or lambda
functions.
Throughout this tutorial, you’ve learned that Python offers a bunch of tools that can gracefully replace reduce()
, at least for its main use cases. Here are the main takeaways of your reading up to this point:
- Use a dedicated function to solve use cases for Python’s reduce() whenever possible. Functions such as sum(), all(), any(), max(), min(), len(), math.prod(), and so on will make your code faster and more readable, maintainable, and Pythonic.
- Avoid complex user-defined functions when using reduce(). These kinds of functions can make your code difficult to read and understand. You can use an explicit and readable for loop instead.
- Avoid complex lambda functions when using reduce(). They can also make your code unreadable and confusing.
The second and third points were concerns for Guido himself when he said the following:
So now reduce(). This is actually the one I’ve always hated most, because, apart from a few examples involving + or *, almost every time I see a reduce() call with a non-trivial function argument, I need to grab pen and paper to diagram what’s actually being fed into that function before I understand what the reduce() is supposed to do. So in my mind, the applicability of reduce() is pretty much limited to associative operators, and in all other cases it’s better to write out the accumulation loop explicitly. (Source)
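One way to read the remark about associative operators is that, for a non-associative operation, you have to work out how reduce() groups its operands before you can predict the result. The following sketch is an illustration rather than anything taken from the quote. It compares the left-to-right grouping that reduce() uses with a right-to-left grouping built by reversing the input.

```python
from functools import reduce
from operator import add, sub

numbers = [1, 2, 3, 4]

# Addition is associative, so the grouping doesn't change the result.
left_add = reduce(add, numbers)  # ((1 + 2) + 3) + 4
right_add = reduce(lambda acc, x: x + acc, reversed(numbers))  # 1 + (2 + (3 + 4))

# Subtraction isn't associative, so the grouping matters.
left_sub = reduce(sub, numbers)  # ((1 - 2) - 3) - 4
right_sub = reduce(lambda acc, x: x - acc, reversed(numbers))  # 1 - (2 - (3 - 4))

print(left_add, right_add)  # 10 10
print(left_sub, right_sub)  # -8 -2
```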
The next two sections will help you implement this general advice in your code. They also provide some extra advice that will help you use Python’s reduce()
effectively when you really need to use it.
Performance Is Key
If you’re going to use reduce()
to solve the use cases that you’ve covered in this tutorial, then your code will be considerably slower than code using dedicated built-in functions. In the following examples, you’ll use timeit.timeit()
to quickly measure the execution time of small bits of Python code and get an idea of their general performance.
timeit()
takes several arguments, but for these examples, you’ll only need to use the following:
- stmt holds the statement that you need to time.
- setup takes additional statements for general setup, like import statements.
- globals holds a dictionary containing the global namespace that you need to use for running stmt.
Take a look at the following examples that time the sum use case using reduce()
with different tools and using Python’s sum()
for comparison purposes:
>>> from functools import reduce
>>> from timeit import timeit
>>> # Using a user-defined function
>>> def add(a, b):
...     return a + b
...
>>> use_add = "functools.reduce(add, range(100))"
>>> timeit(use_add, "import functools", globals={"add": add})
13.443158069014316
>>> # Using a lambda expression
>>> use_lambda = "functools.reduce(lambda x, y: x + y, range(100))"
>>> timeit(use_lambda, "import functools")
11.998800784000196
>>> # Using operator.add()
>>> use_operator_add = "functools.reduce(operator.add, range(100))"
>>> timeit(use_operator_add, "import functools, operator")
5.183870767941698
>>> # Using sum()
>>> timeit("sum(range(100))", globals={"sum": sum})
1.1643308430211619
Even though you’ll get different numbers depending on your hardware, you’ll likely get the best time measurement using sum()
. This built-in function is also the most readable and Pythonic solution for the sum problem.
Note: For a more detailed approach to how to time your code, check out Python Timer Functions: Three Ways to Monitor Your Code.
Your second-best option would be to use reduce()
with operator.add()
. The functions in operator
are written in C and are highly optimized for performance. So, they should perform better than a user-defined function, a lambda
function, or a for
loop.
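If you want to check the comparison with a for loop on your own machine, a sketch along these lines may help. `loop_sum()` is a hypothetical helper written for this comparison, and `number=10_000` is an arbitrary choice that keeps the run short. No timings are shown because they depend on your hardware and Python version.

```python
import operator
from functools import reduce
from timeit import timeit


def loop_sum(iterable):
    """Sum the items with an explicit for loop."""
    total = 0
    for item in iterable:
        total += item
    return total


# Names that the timed statements need to find in their global namespace
namespace = {"reduce": reduce, "operator": operator, "loop_sum": loop_sum}

candidates = {
    "for loop": "loop_sum(range(100))",
    "reduce + operator.add": "reduce(operator.add, range(100))",
    "sum()": "sum(range(100))",
}

timings = {
    label: timeit(stmt, globals=namespace, number=10_000)
    for label, stmt in candidates.items()
}

# Print the candidates from fastest to slowest.
for label, seconds in sorted(timings.items(), key=lambda pair: pair[1]):
    print(f"{label:<22} {seconds:.4f} s")
```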
Readability Counts
Code readability is also an important concern when it comes to using Python’s reduce()
. Even though reduce() can run faster than an equivalent Python for loop, especially when you pair it with a function from operator, a clean and Pythonic loop is often easier to follow than a call to reduce(), as Guido himself stated.
The What’s New In Python 3.0 guide reinforces this idea when it says the following:
Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable. (Source)
To better understand the importance of readability, imagine that you’re starting to learn Python and you’re trying to solve an exercise about calculating the sum of all the even numbers in an iterable.
If you already know about Python’s reduce()
and have done some functional programming in the past, then you might come up with the following solution:
>>> from functools import reduce
>>> def sum_even(it):
...     return reduce(lambda x, y: x + y if not y % 2 else x, it, 0)
...
>>> sum_even([1, 2, 3, 4])
6
In this function, you use reduce()
to cumulatively sum the even numbers in an iterable. The lambda
function takes two arguments, x
and y
, and returns their sum if y is even. Otherwise, it returns x
, which holds the result of the previous sum.
Additionally, you set initializer
to 0
because otherwise your sum will have an initial value of 1
(the first value in iterable
), which isn’t an even number and will introduce a bug into your function.
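The bug is easy to reproduce. In this sketch, `step()` is a hypothetical named version of the lambda function from the example, which makes it possible to run the same reduction with and without the initializer.

```python
from functools import reduce


def step(x, y):
    """Add y to the running total x only when y is even."""
    return x + y if not y % 2 else x


numbers = [1, 2, 3, 4]

without_initializer = reduce(step, numbers)
with_initializer = reduce(step, numbers, 0)

print(without_initializer)  # 7
print(with_initializer)     # 6
```

Without the initializer, the odd first item becomes the starting total and is never checked, so the result is off by one.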
The function works as you expected, and you’re happy with the result. However, you continue digging into Python and learn about sum()
and generator expressions. You decide to rework your function using these new tools, and your function now looks as follows:
>>> def sum_even(iterable):
...     return sum(num for num in iterable if not num % 2)
...
>>> sum_even([1, 2, 3, 4])
6
When you look at this code, you feel really proud, and you should. You’ve done a great job! That’s a beautiful Python function that almost reads as plain English. It’s also efficient and Pythonic. What do you think?
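For comparison, here is a sketch of the explicit accumulation loop that the earlier quotes recommend. The name `sum_even_loop()` is hypothetical and only serves to keep it apart from the versions above.

```python
def sum_even_loop(iterable):
    """Sum the even numbers in iterable with an explicit loop."""
    total = 0
    for num in iterable:
        if num % 2 == 0:
            total += num
    return total


print(sum_even_loop([1, 2, 3, 4]))  # 6
print(sum_even_loop([]))            # 0
```

It takes a few more lines than the sum() version, but each step of the accumulation is visible.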
Conclusion
Python’s reduce()
allows you to perform reduction operations on iterables using Python callables and lambda
functions. reduce()
applies a function to the items in an iterable and reduces them to a single cumulative value.
In this tutorial, you’ve learned:
- What reduction, or folding, is and when it might be useful
- How to use Python’s reduce() to solve common reduction problems like summing or multiplying numbers
- Which Pythonic tools you can use to effectively replace reduce() in your code
With this knowledge, you’ll be able to decide which tools best fit your coding needs when it comes to solving reduction problems in Python.
Over the years, reduce()
has been replaced by more Pythonic tools like sum()
, min()
, max()
all()
, any()
, among others. However, reduce()
is still there and is still popular among functional programmers. If you have questions or thoughts about using reduce()
or any of its Python alternatives, then be sure to post them in the comments below.