Python Generators 101 (Overview)

Have you ever had to work with a dataset so large that it overwhelmed your machine’s memory? Or maybe you have a complex function that needs to maintain an internal state every time it’s called, but the function is too small to justify creating its own class. In these cases and more, generators and the Python yield statement are here to help.

By the end of this course, you’ll know:

  • What generators are and how to use them
  • How to create generator functions and expressions
  • How the Python yield statement works
  • How to use multiple Python yield statements in a generator function
  • How to use advanced generator methods
  • How to build data pipelines with multiple generators

If you’re a beginner or intermediate Pythonista and you’re interested in learning how to work with large datasets in a more Pythonic way, then this is the course for you.


00:00 Hello, and welcome to this course on generators and the yield keyword in Python. Before I show you any code, I’d like to give you a general intuition of what generators are and why you would want to use them. So, first, consider a staircase.

00:14 You have a sequence of steps, one by one, right? And the way stairs are intended to be used is that you put one foot in front of the other and go up one step at a time. But there is an alternative, and I’m sure you’ve tried this as well, and that’s to take several steps at a time.

00:36 So depending on how long your legs are, you might be able to go two steps at a time, three steps at a time, maybe even as many as four steps at a time. But there’s a limit to this, and that limit is really the length of your legs.

00:47 You might be able to go up a staircase with two, three, four, maybe even five steps in one go, but there are staircases out there that are just too long, that completely exceed the length of your legs. And then you have no choice but to take them in smaller chunks. That might not be a single step at a time, but it’ll be a few steps at a time. You can’t just do this in one go.

01:09 But because you’re able to break down the task of going up the stairs into smaller, manageable chunks, then you’re able to go up taller stairs, and so a high number of steps isn’t an obstacle which you can’t overcome.

01:22 It might be more challenging, but you can do it. Something similar happens in Python with many structures which are stored in memory, because memory is a limited resource.

01:32 Consider the example of lists. So, a list takes up a bit of memory but it’s an iterable structure, so once it’s in memory, you can iterate over the items, one by one. In this case, you would get 1, then 2, then 3. Here’s a slightly longer list, and so, since it’s longer, it takes up more memory.

01:51 But still, this is something which most computers can handle and you could iterate over the items on this list: 1, 2, 3, 4, and so on, all the way up to 14. In the case of an even-longer list, even more memory is taken up. So, I think you can see where I’m going with this.

02:06 Eventually, you will reach a point where your computer’s memory just can’t handle this. Lists just become too big. And if you’re working with big data, or maybe on an application that uses infinite sequences, then since your computer’s memory is, by definition, limited and not infinite, you need a different way to approach this problem.

02:26 And that’s where generators come in. They allow us to attack these very large problems one small manageable chunk at a time. In this course, you are going to learn what generators are and how to use them.
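To make the "one small chunk at a time" idea concrete, here is a minimal sketch. The function name `count_up` is hypothetical and not taken from the course materials. It describes an infinite sequence, yet only one value exists in memory at any moment:

```python
from itertools import islice


def count_up(start=1):
    """Yield start, start + 1, start + 2, ... without ever building a list."""
    current = start
    while True:  # infinite sequence: a value is produced only when requested
        yield current
        current += 1


# Take just the first five values; the rest are never computed.
first_five = list(islice(count_up(), 5))
print(first_five)  # [1, 2, 3, 4, 5]
```

A list could never hold this sequence, but the generator can describe it because each value is computed only when the caller asks for the next one.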

02:39 We’re going to be looking at the syntax and, sort of, the rules that you have to keep in mind when you’re using generators. We’ll also look at the tradeoffs you have to consider. So, as I hinted at a moment ago, generators can save a lot of memory, but you do pay a price for that in terms of speed.

02:55 So there’s a tradeoff between memory footprint and speed. You’ll learn how to use the yield statement, which is a key component of generators.
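One rough way to observe this tradeoff yourself is sketched below. The size `N` and the squaring workload are arbitrary choices for illustration, and the timings will vary by machine and Python version, so treat the output as a measurement rather than a guarantee. Note that `sys.getsizeof` reports only the container object itself, not the integers it refers to, so the real gap in memory use is larger than the figure shown:

```python
import sys
import timeit

N = 100_000  # arbitrary size chosen for illustration

squares_list = [n * n for n in range(N)]  # every item is stored at once
squares_gen = (n * n for n in range(N))   # items are produced on demand

list_size = sys.getsizeof(squares_list)  # grows with N
gen_size = sys.getsizeof(squares_gen)    # small, and independent of N
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# Time summing both forms three times each; results are machine-dependent.
list_time = timeit.timeit(lambda: sum([n * n for n in range(N)]), number=3)
gen_time = timeit.timeit(lambda: sum(n * n for n in range(N)), number=3)
print(f"list: {list_time:.3f}s, generator: {gen_time:.3f}s")
```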

03:03 You’ll learn how to use multiple yield statements in a single generator function. And then to wrap up, we’ll look at some advanced generator methods and I’ll show you how to build data pipelines using multiple generators.
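As a preview, the sketch below shows what multiple yield statements and a small pipeline might look like. All of the names here (`traffic_light`, `read_numbers`, `only_even`) are hypothetical examples, not code from the course:

```python
def traffic_light():
    """A generator function with several yield statements, run in order."""
    yield "green"
    yield "yellow"
    yield "red"


def read_numbers(lines):
    """Pipeline stage 1: convert each text line to an int, one at a time."""
    for line in lines:
        yield int(line)


def only_even(numbers):
    """Pipeline stage 2: pass through even values only."""
    for number in numbers:
        if number % 2 == 0:
            yield number


light = traffic_light()
print(next(light))  # green
print(next(light))  # yellow

# Chain the stages; no work happens until the result is consumed.
raw_lines = ["1", "2", "3", "4"]
pipeline = only_even(read_numbers(raw_lines))
print(list(pipeline))  # [2, 4]
```

Each call to `next()` resumes the function right after the previous yield, which is why the traffic light picks up where it left off. In the pipeline, each stage pulls one item at a time from the stage before it, so no intermediate list is built.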

03:14 This content is broken up into five videos. Right now, we’re on the first one, the introduction. In the next video, I’ll tell you about generators, how to implement them, what syntax rules you have to obey, and so on.

03:25 I’ll see you there!

William on June 18, 2020

Please, I cannot play the videos, I am now in China but not sure whether that’s the trouble.

Ricky White RP Team on June 18, 2020

Hi William. We use Vimeo to serve our videos, so if that service is blocked in your region, I’m afraid they won’t work without a VPN. But I know VPNs are also illegal in China, so I would not like to advocate you do anything illegal. But in theory it should work.