Going Deeper on the Iterator Protocol

Efficient Iterations With Python Iterators and Iterables Christopher Trudeau 07:26

00:00 In the previous lesson, I covered a few bits and pieces on iterators not covered elsewhere. In this lesson, I’m returning to iterables to dive in a little deeper. Just a quick recap, an iterable is anything that implements __iter__.

00:16 __iter__ is supposed to return an iterator. For a pure iterable, like a list, a new iterator gets created. For an iterator object, the return value is the iterator object itself, meaning the iterator also implements the iterable protocol.
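A minimal sketch of that distinction, using a plain list (the variable names here are illustrative and not from the lesson):

```python
numbers = [1, 2, 3]

# A pure iterable hands out a fresh, independent iterator on every call.
first = iter(numbers)
second = iter(numbers)
print(first is second)  # False

# An iterator returns itself from __iter__, so it is also an iterable.
print(iter(first) is first)  # True

# Advancing one iterator does not move the other.
next(first)
print(next(first), next(second))  # 2 1
```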

00:33 Throughout the course, you’ve seen several different ways to create iterators. I showed you how to write an iterator class with the cat and dog iterators and then showed you how to write generators with yield.

00:44 Now that you understand these different methods, there are actually faster ways of embedding an iterator right inside your __iter__ method. For example, if your class uses a list as its internal data structure, then your __iter__ merely needs to call iter() on that iterable to get the list iterator.

01:02 You can also write your __iter__ as a generator itself or return a generator expression from the __iter__. And there’s one more way as well that I haven’t shown you yet, and that’s the yield from keywords.

01:17 These are a shortcut for creating a generator on an iterable. Let’s look at some code and contrast these three methods.

01:26 I’m gonna show you three different versions of a class that implements a stack. The only difference between them is how the __iter__ gets implemented.

01:34 The class has four methods. The first is a __init__ that creates a list that stores the stuff in the stack. Then the push and pop methods are the public interface for putting things in the stack or taking them off the top.

01:47 And to make our stack iterable, the fourth method implements __iter__. In stack one, I’m using the iter() function. Remember, iter() on a list returns the list iterator.

02:00 As my internal data structure for the stack is just a list, I can implement __iter__ by returning the result of the iter() function.

02:08 I don’t have to do anything else. I don’t have to create a special iterator class; I just call iter() directly.
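The on-screen code is not part of this transcript, so what follows is only a sketch of what stack one might look like. The class name `Stack1` and the attribute name `_items` are assumptions, and so is the bottom-to-top iteration order.

```python
class Stack1:
    """Hypothetical reconstruction of the first stack version."""

    def __init__(self):
        # Assumed name for the internal list that stores the stack.
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __iter__(self):
        # Delegate to the list: iter() on a list returns a list iterator.
        return iter(self._items)


stack = Stack1()
for value in (1, 2, 3):
    stack.push(value)

print(list(stack))  # [1, 2, 3]
```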

02:18 This is version two. Everything is the same except for the approach to __iter__. Here, I’ve turned this method into a generator by looping over the items and yielding each value.

02:29 This actually takes more lines of code than the previous example, but if your storage is more complex than a list, then using the generator technique might be a better approach.
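As a sketch of version two, again with the assumed names `Stack2` and `_items`, the only change is that __iter__ becomes a generator:

```python
class Stack2:
    """Hypothetical reconstruction of the second stack version."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __iter__(self):
        # The yield keyword turns this method into a generator function,
        # so calling it returns a generator, which is itself an iterator.
        for item in self._items:
            yield item


stack = Stack2()
for value in (1, 2, 3):
    stack.push(value)

print(list(stack))  # [1, 2, 3]
```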

02:42 And finally, this is version three. Here I’m using the yield from keywords. What yield from does is create a yielding loop on an iterable.

02:52 This single line is the equivalent to the two lines I showed you in stack two. Again, if your internal data structure is an iterable, then this is a nice, neat single line approach.
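A sketch of version three under the same assumptions (the names `Stack3` and `_items` are hypothetical), with the loop collapsed into a single yield from line:

```python
class Stack3:
    """Hypothetical reconstruction of the third stack version."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __iter__(self):
        # Equivalent to looping over self._items and yielding each value.
        yield from self._items


stack = Stack3()
for value in (1, 2, 3):
    stack.push(value)

print(list(stack))  # [1, 2, 3]
```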

03:04 This doesn’t just apply to __iter__ methods. You could go back to our generator examples and replace the loops that have yield inside with a single yield from line.
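To illustrate the same substitution outside of __iter__, here is a small sketch with two hypothetical generator functions that produce identical output. Neither is taken from the course's earlier examples.

```python
def chain_with_loops(first, second):
    # Explicit loops with a yield inside each one.
    for item in first:
        yield item
    for item in second:
        yield item


def chain_with_yield_from(first, second):
    # Each loop is replaced by a single yield from line.
    yield from first
    yield from second


print(list(chain_with_loops([1, 2], "ab")))       # [1, 2, 'a', 'b']
print(list(chain_with_yield_from([1, 2], "ab")))  # [1, 2, 'a', 'b']
```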

03:16 Many of Python’s built-in iterables are also sequences. A sequence is an object that allows you to slice and index to retrieve values. For example, a list, where you can get at a specific value with square brackets, or a string, where you can access an individual character.

03:33 Sequences are also implemented using a protocol. This protocol has two methods: __len__, which I showed you a few lessons back and which should return the length of the sequence, and __getitem__, which takes an index number and is responsible for returning the value in the sequence at that given index.

03:51 Having this also means slicing is supported. Python packs the slice into a slice object and passes that to __getitem__, so slicing works as long as __getitem__ handles it. You can tell that __getitem__ has been around for a while.

04:02 Some of the earlier Python functions and methods don’t adhere closely to the naming conventions. This gets in my way all the time, and I always wanna put an underscore between get and item, but there isn’t one.

04:14 Interestingly, if you’ve implemented the sequence protocol, you don’t have to implement the iterable protocol. Python translates it for you. Let’s go look at yet one more stack.

04:26 Stack four has no __iter__ method. Instead, it has __len__, which returns the length of the internal data structure used to store the stack and __getitem__, which gives index-based access to values in the stack.
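A sketch of what stack four might look like, assuming the same hypothetical names as before (`Stack4`, `_items`). Note that there is no __iter__ method anywhere in the class.

```python
class Stack4:
    """Hypothetical reconstruction of the sequence-style stack."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        # Length of the internal list that stores the stack.
        return len(self._items)

    def __getitem__(self, index):
        # Delegating to the list also handles negative indexes and slices.
        return self._items[index]
```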

04:41 Let’s try this out.

04:47 Imported, created a stack

04:53 pushing on some values,

04:56 and then this next bit’s a little weird. You really shouldn’t allow indexed access into a stack, as it kind of defeats the purpose of a stack, but that’s me just being a purist.

05:09 My discomfort aside, __getitem__ means I can get at individual values with square brackets.

05:17 Having implemented __len__ means I can ask for the stack’s length, and even though this class doesn’t implement __iter__, I can iterate on it.

05:31 Because my stack is a sequence, Python makes it an iterable for me.
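The session described above might look roughly like the sketch below. The `Stack4` class is a hypothetical reconstruction, included so the block runs on its own. The `iterate_by_index` helper is also an assumption added for illustration: it approximates the fallback by calling __getitem__ with 0, 1, 2, and so on until an IndexError is raised.

```python
class Stack4:
    """Hypothetical sequence-style stack with no __iter__ method."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


def iterate_by_index(sequence):
    """Approximate the fallback: index from 0 upward until IndexError."""
    index = 0
    while True:
        try:
            item = sequence[index]
        except IndexError:
            return
        yield item
        index += 1


stack = Stack4()
for value in ("a", "b", "c"):
    stack.push(value)

print(stack[0])     # a
print(len(stack))   # 3
print(list(stack))  # ['a', 'b', 'c']
print(list(iterate_by_index(stack)))  # ['a', 'b', 'c']
```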

05:38 Python has multiple ways of creating asynchronous and concurrent code. One of those mechanisms is through the use of the async and await keywords to write concurrent co-routines.

05:49 This is often done in conjunction with the asyncio module from the standard library that provides concurrency tools. Concurrency is a big topic on its own and is beyond the scope of this course.

06:00 The quick version is there are times in computing where your program is blocked waiting for output. A co-routine gives you a way of switching to some other calculation while waiting, giving the appearance of doing two things at a time.

06:13 To do this, you have to signal to Python that your functions are going to behave this way, and you use the async and await keywords to control what functions can do this and when the task switching can happen.

06:24 If you’re iterating through an iterable within the co-routine world, you want to use the asynchronous version of the iterator protocol so that your iteration doesn’t block the task swapping. The protocol is similar to its synchronous cousin, just with an asynchronous flavor: you need an __aiter__ and an __anext__.

06:42 When your __anext__ is finished, it raises a StopAsyncIteration exception instead of StopIteration. The __anext__ must itself be an asynchronous function, which means the values returned from it must be await-able.
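A minimal sketch of the asynchronous protocol, assuming a hypothetical `AsyncCountdown` class that is not from the course. The `asyncio.sleep(0)` call only stands in for real waiting, such as network I/O.

```python
import asyncio


class AsyncCountdown:
    """Hypothetical async iterator that counts down from start to 1."""

    def __init__(self, start):
        self._current = start

    def __aiter__(self):
        # Like __iter__ on an iterator, return the object itself.
        return self

    async def __anext__(self):
        if self._current <= 0:
            # The async counterpart of StopIteration.
            raise StopAsyncIteration
        # Give the event loop a chance to run other tasks.
        await asyncio.sleep(0)
        value = self._current
        self._current -= 1
        return value


async def main():
    results = []
    # async for calls __aiter__ once, then awaits __anext__ repeatedly.
    async for number in AsyncCountdown(3):
        results.append(number)
    return results


print(asyncio.run(main()))  # [3, 2, 1]
```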

07:00 If you’re not familiar with async and await, some of that may have sounded a little overwhelming. As this is an iterable course, your takeaway should be only that there is an asynchronous variation on iteration.

07:11 If you start writing concurrent programs, you can take advantage of this feature. Until then, don’t worry about it.

07:18 Well that’s the penultimate lesson.
