Why Use the filter() Function?

Functional Programming in Python Dan Bader 03:46

In this lesson, you’ll see why you’d want to use the filter() function rather than, for example, a for loop with an if statement. You’d get the same result, but functional programming allows you to chain function calls. This allows you to avoid side effects and have a line of code that gives you a quick snapshot of what is happening.

When doing functional programming, you have a bunch of functions as your basic building blocks, and then you can use and reuse them together in different contexts.

00:00 And, why do it this way? Right? Like, why should you write code like that? Why not do it differently? Like, why not do for x in scientists and then if you have x.nobel is True, then print(x).

00:22 Same result. Why not do a for loop? Well, what’s kind of nice here is that you can chain these things very well and you can apply these transformations in a purely functional way.

00:36 When you look at this expression here, this is just calling a bunch of functions. We’re not iterating over a list, we’re not having these side effects where we’re printing stuff out. And in some ways, of course, it seems more complicated than having this simple loop here, but what’s really nice is how declarative this is, right? We’re just saying,

00:56 “Hey! Okay, we’ll take this list of scientists, apply this filter() function, whatever it is, make sure we have a list at the end of it, and then print it out.” It’s like this long chain of function calls and that allows us to do this transformation in a single line, here.
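
As a rough sketch of the two approaches being compared here, assuming a Scientist namedtuple and a scientists tuple like the ones used earlier in the course (the four records below are a made-up sample for illustration, not the course's full data set):

```python
from collections import namedtuple
from pprint import pprint

# Assumed data model; the real course data may differ.
Scientist = namedtuple("Scientist", ["name", "field", "born", "nobel"])

scientists = (
    Scientist(name="Ada Lovelace", field="math", born=1815, nobel=False),
    Scientist(name="Marie Curie", field="physics", born=1867, nobel=True),
    Scientist(name="Ada Yonath", field="chemistry", born=1939, nobel=True),
    Scientist(name="Sally Ride", field="physics", born=1951, nobel=False),
)

# Imperative version: loop, test, and print as a side effect of each iteration.
for x in scientists:
    if x.nobel is True:
        print(x)

# Functional version: a chain of function calls that builds a new tuple,
# which is then printed once at the end.
winners = tuple(filter(lambda x: x.nobel is True, scientists))
pprint(winners)
```

Both versions select the same records. The difference is that the second one produces a value (winners) that can be passed on to further function calls.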

01:11 And it’s not really about like the number of lines of code, but you know, we’re doing this with these very, very simple composable things here. We’re using the pprint() function, we’re using the tuple() function, we’re using this filter() function that could be reused.

01:24 And these are all little building blocks that I could just reuse in different contexts. If you imagine I did this, let’s call this nobel_filter(), it takes an x, and then it returns x.nobel is True.

01:43 All right, so now, I’ve got this nobel_filter() and now we can go in and we can say, okay, filter

01:50 on this nobel_filter().

01:55 And then again, we want the tuple(), and just for clarity’s sake, we also want to pprint(). And now I have a reusable piece, right?
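
A minimal sketch of the named-function version described here, again assuming a Scientist namedtuple and a small sample tuple. The physics_filter() function is a hypothetical second filter added only to show the reuse; it is not part of the lesson.

```python
from collections import namedtuple
from pprint import pprint

# Assumed data model and sample data for illustration.
Scientist = namedtuple("Scientist", ["name", "field", "born", "nobel"])

scientists = (
    Scientist(name="Ada Lovelace", field="math", born=1815, nobel=False),
    Scientist(name="Marie Curie", field="physics", born=1867, nobel=True),
    Scientist(name="Ada Yonath", field="chemistry", born=1939, nobel=True),
    Scientist(name="Sally Ride", field="physics", born=1951, nobel=False),
)


def nobel_filter(x):
    """Keep only Nobel laureates."""
    return x.nobel is True


def physics_filter(x):
    """Hypothetical extra filter: keep only physicists."""
    return x.field == "physics"


# The function object itself is passed to filter(); it is not called here.
nobel_winners = tuple(filter(nobel_filter, scientists))
pprint(nobel_winners)

# The same building blocks can be combined in a different context.
nobel_physicists = tuple(filter(physics_filter, filter(nobel_filter, scientists)))
pprint(nobel_physicists)
```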

02:06 I have a reusable piece. I have this nobel_filter(). I could define a bunch of other filters that can just be applied without me

02:14 having to rewrite this code or copying and pasting a bunch of code. So, now everything we use here, it’s just a series of function applications. And it becomes clearer when I do it this way than with the lambda, I guess, because the lambda is kind of ugly and a lot of people in Python, they don’t really like it. And by the way, you know, before people leave angry comments, I’m not necessarily encouraging this style of programming, because it’s not really what Python was meant to do.

02:42 Arguably, this would be a better way to do it, but it can be fun to think about how you can use a functional programming style that relies on application or evaluating functions rather than writing stuff like this, and that mainly relies on immutable data structures, how that can change your mindset a little bit, and it can make things a lot easier if you’re working in parallel programming, for example, because I could apply this filter in parallel and it would be very easy to parallelize that across several threads or processes. Actually, it could be an interesting exercise.
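
One way this could look, as a sketch only: the predicate is evaluated for each item on a thread pool, and the items whose result is true are kept. The parallel_filter() helper is a hypothetical name, not a standard library function, and the sample data is assumed. For a predicate this cheap, threads in CPython are unlikely to be faster than the built-in filter(); the point is only that a side-effect-free predicate can be applied to each item independently.

```python
from collections import namedtuple
from concurrent.futures import ThreadPoolExecutor

# Assumed data model and sample data for illustration.
Scientist = namedtuple("Scientist", ["name", "field", "born", "nobel"])

scientists = (
    Scientist(name="Ada Lovelace", field="math", born=1815, nobel=False),
    Scientist(name="Marie Curie", field="physics", born=1867, nobel=True),
    Scientist(name="Ada Yonath", field="chemistry", born=1939, nobel=True),
    Scientist(name="Sally Ride", field="physics", born=1951, nobel=False),
)


def nobel_filter(x):
    """Pure predicate: no shared state, so calls can run independently."""
    return x.nobel is True


def parallel_filter(predicate, items, max_workers=4):
    """Hypothetical helper: evaluate predicate concurrently, keep matches in order."""
    items = tuple(items)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map() returns results in the same order as the inputs.
        keep_flags = list(executor.map(predicate, items))
    return tuple(item for item, keep in zip(items, keep_flags) if keep)


winners = parallel_filter(nobel_filter, scientists)
print(winners)
```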

03:20 Maybe I’ll get to that as part of these tutorials to actually show you how to do that. And if you write your code in a more functional programming style, you can take advantage of these things. All right, so I hope that gave you a good idea of how the filter() function works. Now, I can’t really stop this part of the tutorial here without talking about list comprehensions. Or I guess, in this case, we would create a tuple.
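
As a small preview of that comparison, here is a sketch using assumed sample data. Python has no "tuple comprehension" syntax, so the tuple version passes a generator expression to tuple().

```python
from collections import namedtuple

# Assumed data model and sample data for illustration.
Scientist = namedtuple("Scientist", ["name", "field", "born", "nobel"])

scientists = (
    Scientist(name="Ada Lovelace", field="math", born=1815, nobel=False),
    Scientist(name="Marie Curie", field="physics", born=1867, nobel=True),
    Scientist(name="Ada Yonath", field="chemistry", born=1939, nobel=True),
    Scientist(name="Sally Ride", field="physics", born=1951, nobel=False),
)

# filter() with a lambda, as in this lesson.
winners_filter = tuple(filter(lambda x: x.nobel is True, scientists))

# List comprehension: same selection, but the result is a list.
winners_list = [x for x in scientists if x.nobel is True]

# Generator expression passed to tuple(): same selection, result is a tuple.
winners_tuple = tuple(x for x in scientists if x.nobel is True)

print(winners_filter == winners_tuple)
```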

matt7 on Feb. 20, 2020

You give some very good arguments why the filter style is good in this case. It would help if you could also have a quick discussion on the performance and memory usage impacts of this style of using filter vs. loops/if.

Lipsa on Feb. 28, 2020

Where can I find the notebook for this series?

Ricky White RP Team on Feb. 28, 2020

Hi Lipsa. We don’t have notebooks for our series. Sorry.

sandeepranjan on March 21, 2020

I am getting an error when using the filter function:

mathS = tuple(filter(lambda x : x.field == 'math', scientists))
pprint(mathS)

This throws the error below. I have the necessary imports, and the rest of the code is identical to what is shown in this tutorial.

Any idea what I am missing?

Traceback (most recent call last):
  File "/Users/sandeep/Library/Preferences/PyCharmCE2019.3/scratches/scratch.py", line 23, in <module>
    mathS = tuple(filter(lambda x : x.field == 'math', scientists))
  File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/fnmatch.py", line 51, in filter
    pat = os.path.normcase(pat)
  File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/posixpath.py", line 54, in normcase
    s = os.fspath(s)
TypeError: expected str, bytes or os.PathLike object, not tuple

Dan Bader RP Team on March 22, 2020

@sandeepranjan: It’s odd that you’re seeing an exception somewhere in the os module (posixpath.py and fnmatch.py in the Traceback)… I don’t see anything wrong with your filter expression upon first inspection.

I’m wondering if this is an issue with your local Python environment, maybe you could try your program on a different machine or inside a Docker container with a clean CPython install to see if the error goes away.
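
One explanation that would be consistent with this traceback (an assumption, since the full scratch file is not shown) is that the name filter no longer refers to the built-in function. The traceback enters fnmatch.py, which is where the standard library's fnmatch.filter(names, pat) lives. An import such as from fnmatch import filter would rebind the name and lead to the same kind of TypeError. A minimal sketch that reproduces this under that assumption, with made-up sample data:

```python
import builtins
from collections import namedtuple
from fnmatch import filter  # hypothetical import that shadows the built-in filter()

# Assumed data model and sample data for illustration.
Scientist = namedtuple("Scientist", ["name", "field", "born", "nobel"])

scientists = (
    Scientist(name="Ada Lovelace", field="math", born=1815, nobel=False),
    Scientist(name="Emmy Noether", field="math", born=1882, nobel=False),
    Scientist(name="Marie Curie", field="physics", born=1867, nobel=True),
)

# filter now means fnmatch.filter(names, pat), so the tuple of scientists is
# treated as a filename pattern and rejected.
caught = None
try:
    tuple(filter(lambda x: x.field == "math", scientists))
except TypeError as exc:
    caught = exc
print(type(caught).__name__)

# The built-in is still reachable through the builtins module.
math_scientists = tuple(builtins.filter(lambda x: x.field == "math", scientists))
print(math_scientists)
```

If this is the cause, removing the shadowing import (or a wildcard import that brings in another filter) should restore the built-in behavior.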

jaman on March 23, 2020

Hi Dan, thanks for the great videos. I want to understand your thought process about when/if data should be stored in a pandas dataframe instead of storing data in basic types like tuples, lists, dictionaries, etc. My impression is that data analysis/viz/modeling work uses pandas, but I’m wondering what you consider while deciding if data should belong in a pandas df.

Dan Bader RP Team on March 23, 2020

IMO the main benefits of using Pandas (which is built on top of NumPy) over a homegrown solution are:

Ease of use and access to common functionality (e.g. loading & reshaping data)
Performance benefits due to vectorization

Nanogines on March 27, 2020

What editor do you use?

Rashikraj Shrestha on March 28, 2020

# we can reuse filter function with a reusable function
def nobel_filter(x):
    return x.nobel is True

tuple(filter(lambda x: nobel_filter, Scientists))

(Scientist(name='Ada Lovelace', field='math', born=1815, nobel=False),
 Scientist(name='Emmy Noether', field='math', born=1882, nobel=False),
 Scientist(name='Marie Curie', field='math', born=1867, nobel=True),
 Scientist(name='Tu Youyou', field='physics', born=1930, nobel=True),
 Scientist(name='Ada Yonath', field='chemistry', born=1939, nobel=True),
 Scientist(name='Vera Rubin', field='chemistry', born=1928, nobel=False),
 Scientist(name='Sally Ride', field='physics', born=1951, nobel=False))

What is the issue here? I could not figure it out.

Sagar Rathod on March 30, 2020

Hi @Rashikraj Shrestha, you should have called the function as nobel_filter(x) inside the lambda instead of just writing nobel_filter.

Thanks, Dan, for the nice explanation.

Rashikraj Shrestha on March 31, 2020

@Sagar Rathod, but the tutorial clearly says to use nobel_filter only

Victor R Cardoso on March 31, 2020

@Rashikraj, you added the extra lambda expression:

tuple(filter(lambda x: nobel_filter, Scientists))

This keeps every element, because for each item the lambda returns the function object nobel_filter itself (without calling it!), and a function object is always truthy.

You’ll need to remove the lambda to use the function:

tuple(filter(nobel_filter, Scientists))

In this case it will call the function for each case.

Otherwise, if you want to keep the lambda, you’ll need to call nobel_filter on x explicitly:

tuple(filter(lambda x: nobel_filter(x), Scientists))

I have not checked, but this should work fine.
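
A quick sketch that checks the three variants side by side, assuming a Scientist namedtuple and a small made-up sample (the variable is named scientists here, in lowercase):

```python
from collections import namedtuple

# Assumed data model and sample data for illustration.
Scientist = namedtuple("Scientist", ["name", "field", "born", "nobel"])

scientists = (
    Scientist(name="Ada Lovelace", field="math", born=1815, nobel=False),
    Scientist(name="Marie Curie", field="physics", born=1867, nobel=True),
    Scientist(name="Ada Yonath", field="chemistry", born=1939, nobel=True),
    Scientist(name="Sally Ride", field="physics", born=1951, nobel=False),
)


def nobel_filter(x):
    return x.nobel is True


# The lambda returns the function object, which is truthy, so nothing is removed.
keeps_everything = tuple(filter(lambda x: nobel_filter, scientists))

# Passing the function directly: filter() calls it once per item.
direct = tuple(filter(nobel_filter, scientists))

# Wrapping it in a lambda also works, as long as the function is actually called.
wrapped = tuple(filter(lambda x: nobel_filter(x), scientists))

print(len(keeps_everything), len(direct), len(wrapped))
```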

Thank you for the discussion.

Dan Bader RP Team on April 1, 2020

Yes that’s correct, good explanation Victor :)

Nikhil Omkar on Jan. 2, 2021

Does functional programming reduce space and time complexities?

Bartosz Zaczyński RP Team on Jan. 4, 2021

@Nikhil Omkar That’s an interesting question!

In terms of space, I’d argue that FP is more likely to be memory demanding compared to other programming paradigms. While FP was invented in the ’50s, it wasn’t widely adopted until recently because of the high cost of memory at the time. On the one hand, functions can share immutable data, but even the slightest mutation results in making lots of copies. Moreover, all those function calls need to be stored on a stack before they can be processed. Unless the language comes with a few smart tricks like tail call optimization, the memory consumption will most likely be higher than in non-functional code.

Regarding speed, another reason why FP is getting noticed these days is multi-core CPUs. The time for Moore’s law has long gone. CPU vendors have been investing in stuffing more cores onto a chip rather than making the individual cores faster. This means programmers need to learn to take advantage of parallel computing, and functional programming helps with that.

Anyway, the choice of a particular programming paradigm will matter much less for time and space complexity than the choice of algorithm.

Jon David on Nov. 19, 2021

I <3 lambdas
