Asynchronous Iterators and Iterables in Python

by Leodanis Pozo Ramos Aug 07, 2024 advanced python

When you write asynchronous code in Python, you’ll likely need to create asynchronous iterators and iterables at some point. Asynchronous iterators are what Python uses to control async for loops, while asynchronous iterables are objects that you can iterate over using async for loops.

Both tools allow you to iterate over awaitable objects without blocking your code. This way, you can perform different tasks asynchronously.

In this tutorial, you’ll:

  • Learn what async iterators and iterables are in Python
  • Create async generator expressions and generator iterators
  • Code async iterators and iterables with the .__aiter__() and .__anext__() methods
  • Use async iterators in async loops and comprehensions

To get the most out of this tutorial, you should know the basics of Python’s iterators and iterables. You should also know about Python’s asynchronous features and tools.

Getting to Know Async Iterators and Iterables in Python

Iterators and iterables are fundamental components in Python. You’ll use them in almost all your programs where you iterate over data streams using a for loop. Iterators power and control the iteration process, while iterables typically hold data that you want to iterate over.

Python iterators implement the iterator design pattern, which allows you to traverse a container and access its elements. To implement this pattern, iterators need the .__iter__() and .__next__() special methods. Similarly, iterables are typically data containers that implement the .__iter__() method.

Python has extended the concept of iterators and iterables to asynchronous programming with the asyncio module and the async and await keywords. In this scenario, asynchronous iterators drive the asynchronous iteration process, mainly powered by async for loops and comprehensions.

In the following sections, you’ll briefly examine the concepts of asynchronous iterators and iterables in Python.

Async Iterators

Python’s documentation defines asynchronous iterators, or async iterators for short, as the following:

An object that implements the .__aiter__() and .__anext__() [special] methods. .__anext__() must return an awaitable object. [An] async for [loop] resolves the awaitables returned by an asynchronous iterator’s .__anext__() method until it raises a StopAsyncIteration exception. (Source)

Similar to regular iterators that must implement .__iter__() and .__next__(), async iterators must implement .__aiter__() and .__anext__(). In regular iterators, the .__iter__() method usually returns the iterator itself. This is also true for async iterators.

To continue the parallel, in regular iterators, the .__next__() method must return the next object for the iteration. In async iterators, the .__anext__() method must return an awaitable object that produces the next item when it's awaited.

Python defines awaitable objects as described in the quote below:

An object that can be used in an await expression. [It] can be a coroutine or an object with an .__await__() method. (Source)

In practice, a quick way to make an awaitable object in Python is to call an asynchronous function. You define this type of function with the async def keyword construct. This call creates a coroutine object.
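As a minimal sketch of the two kinds of awaitables mentioned in the quote, the snippet below awaits a coroutine object and an instance of a class with an .__await__() method. The names fetch_answer() and Ready are hypothetical and exist only for this illustration:

```python
import asyncio
import inspect


async def fetch_answer():
    # Calling this function only creates a coroutine object.
    # The body runs when the coroutine is awaited.
    await asyncio.sleep(0)
    return 42


class Ready:
    """Hypothetical awaitable class, used only for this illustration."""

    def __init__(self, value):
        self.value = value

    async def _resolve(self):
        await asyncio.sleep(0)
        return self.value

    def __await__(self):
        # Delegate to a coroutine so that `await Ready(...)` works.
        return self._resolve().__await__()


async def main():
    coro = fetch_answer()
    # Both objects qualify as awaitables.
    checks = (inspect.isawaitable(coro), inspect.isawaitable(Ready("x")))
    return checks, await coro, await Ready("done")


result = asyncio.run(main())
print(result)
```

Here, inspect.isawaitable() reports that both objects can be used in an await expression, and awaiting each one gives you its final value.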

When the data stream runs out of data, the .__anext__() method must raise a StopAsyncIteration exception to end the asynchronous iteration process.

Here’s an example of an async iterator that allows iterating over a range of numbers asynchronously:

Python async_range_v1.py
import asyncio

class AsyncRange:
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.start < self.end:
            await asyncio.sleep(0.5)
            value = self.start
            self.start += 1
            return value
        else:
            raise StopAsyncIteration

async def main():
    async for i in AsyncRange(0, 5):
        print(i)

asyncio.run(main())

In the .__aiter__() method, you return self, which is the current object—the iterator itself. In the .__anext__() method, you generate a number from the range between .start and .end.

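To simulate an asynchronous operation, you await the asyncio.sleep() function with a delay of 0.5 seconds. Because .__anext__() is defined with async def, calling it returns a coroutine, which is the awaitable object that the async for loop needs. When the range is covered, you raise a StopAsyncIteration exception to finish the iteration.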

When you run the script, you get the following output:

Shell
$ python async_range_v1.py
0
1
2
3
4

In this example, you get each number after waiting half a second, which is consistent with the delay that you await in .__anext__().

The above example is a quick first look at async iterators and how to define them. You’ll learn more about the .__aiter__() and .__anext__() methods and related concepts when you get to the sections on creating async iterators. Now, it’s time to learn the basics of async iterables.

Async Iterables

When it comes to async iterables, the Python documentation says the following:

An object, that can be used in an async for statement. Must return an asynchronous iterator from its .__aiter__() method. (Source)

In practice, an object only needs an .__aiter__() method that returns an async iterator to be an async iterable. It doesn’t need the .__anext__() method.

Here’s how you’d modify the AsyncRange to be an async iterable rather than an iterator:

Python async_range_v2.py
import asyncio

class AsyncRange:
    def __init__(self, start, end):
        self.data = range(start, end)

    async def __aiter__(self):
        for i in self.data:
            await asyncio.sleep(0.5)
            yield i

async def main():
    async for i in AsyncRange(0, 5):
        print(i)

asyncio.run(main())

This new implementation of your AsyncRange class is more concise than the previous one. It just has the .__aiter__() method, which you write as an async generator that yields numbers on demand. Again, you use the asyncio.sleep() function to simulate an asynchronous operation.

Here’s how the class works:

Shell
$ python async_range_v2.py
0
1
2
3
4

You get the same result as in the previous section, but instead of using an async iterator, you use an iterable.
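One practical difference between the two approaches shows up when you loop more than once. The sketch below is an illustration under stated assumptions rather than part of the original scripts: OneShotRange and ReusableRange are hypothetical names that mirror the iterator and iterable versions of AsyncRange, with the delay reduced to zero so the example finishes quickly:

```python
import asyncio


class OneShotRange:
    """Async iterator: it tracks its own position, so one pass exhausts it."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.start >= self.end:
            raise StopAsyncIteration
        await asyncio.sleep(0)
        value = self.start
        self.start += 1
        return value


class ReusableRange:
    """Async iterable: every call to .__aiter__() creates a fresh generator."""

    def __init__(self, start, end):
        self.data = range(start, end)

    async def __aiter__(self):
        for i in self.data:
            await asyncio.sleep(0)
            yield i


async def collect_twice(source):
    # Loop over the same object two times in a row.
    first = [i async for i in source]
    second = [i async for i in source]
    return first, second


async def main():
    return (
        await collect_twice(OneShotRange(0, 3)),
        await collect_twice(ReusableRange(0, 3)),
    )


iterator_passes, iterable_passes = asyncio.run(main())
print(iterator_passes)
print(iterable_passes)
```

The iterator produces its numbers only on the first pass, while the iterable starts over each time because its .__aiter__() method returns a new async generator on every call.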

With this quick background on async iterators and iterables, you can now dive deeper into how asynchronous iteration works and why you’d want to use it in your code.

Async Iteration

In Python, asynchronous iteration refers to traversing asynchronous iterables using async for loops. Under the hood, async for loops rely on async iterators to control the iteration process. Asynchronous iteration allows you to perform non-blocking operations within the loop.
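To make the role of the iterator more concrete, here's a rough sketch of what an async for loop does for you. The Countdown class and the manual_loop() function are hypothetical helpers written for this illustration, and the real loop machinery handles more details than this simplified version:

```python
import asyncio


class Countdown:
    """Hypothetical async iterator that counts down to 1."""

    def __init__(self, start):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0.01)
        self.current -= 1
        return self.current + 1


async def manual_loop(iterable):
    # Roughly what `async for item in iterable:` does behind the scenes.
    collected = []
    iterator = iterable.__aiter__()
    while True:
        try:
            item = await iterator.__anext__()
        except StopAsyncIteration:
            break
        collected.append(item)
    return collected


async def main():
    manual = await manual_loop(Countdown(3))
    with_loop = [item async for item in Countdown(3)]
    return manual, with_loop


manual, with_loop = asyncio.run(main())
print(manual, with_loop)
```

Both approaches collect the same items. The manual version just spells out the calls to .__aiter__() and .__anext__() and the handling of StopAsyncIteration.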

Async iteration enables you to handle I/O-bound tasks efficiently and makes it possible to run tasks concurrently. Common I/O-bound tasks include the following:

  • File system operations, such as reading and writing files and accessing a file’s metadata like its size, creation date, and modification date.
  • Network operations, such as HTTP requests and socket communication.
  • Database operations, such as running SQL queries for CRUD (create, read, update, and delete) operations.
  • User input and output operations, such as reading the user input from the keyboard, mouse, or other input device and displaying output to the screen. These operations are critical in GUI (graphical user interface) applications, where rendering the interface can be resource-intensive.
  • External device communication, such as interacting with external sensors, printers, or other peripherals connected to serial or parallel ports.

In Python, asynchronous code runs in an event loop, which you typically start with the asyncio.run() function.

When you iterate over an async iterable using an async for loop, the loop can give control back to the event loop each time it awaits the next item so that other asynchronous tasks can run. This type of iteration is non-blocking because it doesn’t block the app’s execution while the loop is waiting for data.
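The following sketch shows this handoff in a small, self-contained script. It assumes two hypothetical helpers, slow_items() and heartbeat(), and uses short delays to stand in for real I/O:

```python
import asyncio
import contextlib


async def slow_items(count):
    # Each item takes a while to arrive, like a slow network response.
    for i in range(count):
        await asyncio.sleep(0.05)
        yield f"item {i}"


async def heartbeat(events):
    # A background task that keeps running while the loop waits.
    while True:
        events.append("tick")
        await asyncio.sleep(0.02)


async def main():
    events = []
    background = asyncio.create_task(heartbeat(events))
    async for item in slow_items(3):
        events.append(item)
    # Stop the background task once the loop is done.
    background.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await background
    return events


events = asyncio.run(main())
print(events)
```

The resulting list mixes "tick" entries with the items from the loop, which suggests that the background task got to run while the async for loop was waiting for its next item. The exact number of ticks depends on timing.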

Asynchronous iterators and iterables allow for asynchronous iteration, which lets you perform tasks concurrently.

Concurrency allows multiple tasks to progress by sharing time on the same CPU core or to run in parallel using multiple CPU cores. This programming technique can help you make your code more efficient. It also allows you to prevent blocking your program’s execution with time-consuming tasks like the ones listed above.

Asynchronous programming is a specific type of concurrency based on non-blocking operations and event-driven execution. That’s why async code runs in a main event loop, which takes care of handling asynchronous events.

In your asynchronous programming adventure in Python, you’ll probably be required to create your own asynchronous iterators and iterables. In practice, the preferred way to do this is using async generator iterators, which is the topic of the following section.

Creating Async Generator Functions

In Python’s documentation, an asynchronous generator is defined as shown below:

A function which returns an asynchronous generator iterator. It looks like a coroutine function defined with async def except that it contains yield expressions for producing a series of values usable in an async for loop. (Source)

For a quick illustration of an async generator, consider the following modification of your async range iterator:

Python async_range_v3.py
import asyncio

async def async_range(start, end):
    for i in range(start, end):
        await asyncio.sleep(0.5)
        yield i

async def main():
    async for i in async_range(0, 5):
        print(i)

asyncio.run(main())

An asynchronous generator function is a function that you define using the async def keyword construct and that contains at least one yield statement. The yield statement produces values on demand, and calling the function returns an async generator iterator rather than a coroutine. In this example, you simulate an asynchronous operation with asyncio.sleep() as you’ve done so far.

Asynchronous generator functions can contain await expressions, async for loops, and async with statements. This type of function returns an asynchronous generator iterator that yields items on demand.
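If you want to check these claims yourself, then the inspect module offers a few helpers. The snippet below is a small sketch that uses a hypothetical countdown() generator function:

```python
import asyncio
import inspect


async def countdown(start):
    for i in range(start, 0, -1):
        await asyncio.sleep(0)
        yield i


checks = {
    # The yield statement makes this an async generator function...
    "async generator function": inspect.isasyncgenfunction(countdown),
    # ...which inspect doesn't report as a coroutine function.
    "coroutine function": inspect.iscoroutinefunction(countdown),
}


async def main():
    generator = countdown(3)
    checks["async generator iterator"] = inspect.isasyncgen(generator)
    # Like other iterators, the generator returns itself from .__aiter__().
    checks["is its own iterator"] = generator.__aiter__() is generator
    return [i async for i in generator]


values = asyncio.run(main())
print(checks)
print(values)
```

Calling countdown() gives you an async generator iterator right away, without an await, and you can then consume it in an async for loop or comprehension.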

For a more elaborate example, say that you want to create a script to back up the files in a given directory. You want the script to process the files asynchronously and generate a ZIP file with the content.

Below is a possible implementation of your backup script. First, note that for the script to work, you need to install the aiofiles and aiozipstream packages from PyPI using pip and the following command:

Shell
$ python -m pip install aiofiles aiozipstream

Now that you have the external dependencies installed, you can take a look at the code:

Python compress.py
 1 import asyncio
 2 from pathlib import Path
 3
 4 import aiofiles
 5 from zipstream import AioZipStream
 6
 7 async def stream_generator(files):
 8     async_zipstream = AioZipStream(files)
 9     async for chunk in async_zipstream.stream():
10         yield chunk
11
12 async def main(directory, zip_name="output.zip"):
13     files = [
14         {"file": path}
15         for path in directory.iterdir()
16         if path.is_file()
17     ]
18     async with aiofiles.open(zip_name, mode="wb") as archive:
19         async for chunk in stream_generator(files):
20             await archive.write(chunk)
21
22 directory = Path.cwd()
23 asyncio.run(main(directory))

In this example, you first import the required modules and classes. Then, on line 7, you define an async generator function. In this function, you take a list of files as an argument. The items in this list must be dictionaries with a "file" key that maps to the file path. On line 8, you create an AioZipStream using the list of files as an argument.

On line 9, you start an async for loop over the stream of zipped data. By default, the .stream() method returns the zipped data as chunks of at most 1024 bytes. On line 10, you yield chunks of data on demand with the yield statement. This statement turns the function into an async generator.

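On lines 12 to 20, you define main(), which builds the list of file dictionaries from the regular files in the directory, opens the target archive with aiofiles, and writes each chunk from stream_generator() to it. On line 22, you create the directory variable to hold the target directory. In this example, you use Path.cwd(), which gives you the current working directory. In other words, the directory defaults to the folder that you run the script from. Finally, you run the event loop. If you run this script from your command line, then you’ll get a ZIP archive with the files in your current working directory.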
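The pattern of wrapping one async stream in another async generator doesn't depend on any third-party package. Here's a minimal sketch that uses in-memory bytes instead of real files. The chunk_source() and counted() functions are hypothetical names made up for this illustration:

```python
import asyncio


async def chunk_source(data, chunk_size):
    # Stand-in for a real asynchronous data source.
    for offset in range(0, len(data), chunk_size):
        await asyncio.sleep(0)
        yield data[offset:offset + chunk_size]


async def counted(stream, totals):
    # Pass every chunk through while keeping a running byte count.
    async for chunk in stream:
        totals["bytes"] += len(chunk)
        yield chunk


async def main():
    totals = {"bytes": 0}
    payload = b"abcdefghij"
    stream = counted(chunk_source(payload, 4), totals)
    chunks = [chunk async for chunk in stream]
    return chunks, totals["bytes"]


chunks, total_bytes = asyncio.run(main())
print(chunks, total_bytes)
```

Just like stream_generator() in the backup script, counted() loops over an existing async stream and yields each chunk on demand, so the consumer never needs to know how the chunks are produced.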

In practice, using async generator functions like the ones in this section is the quicker and preferred approach to creating async iterators in Python. However, if you need your iterators to maintain some internal state, then you can use class-based async iterators.

Creating Class-Based Async Iterators and Iterables

If you need to create async iterators that maintain some internal state, then you can create the iterator using a class. In this situation, your class must implement the .__aiter__() and .__anext__() special methods.

In the following sections, you’ll study .__aiter__() and .__anext__() in more detail. To kick things off, you’ll start by learning about the .__aiter__() method, which is part of the async iterators protocol and is the only method required for implementing async iterables. Then, you’ll learn about the .__anext__() method.

The .__aiter__() Method

When you create async iterators, the .__aiter__() method must be a regular method that immediately returns an async iterator object. The typical implementation of this method in an async iterator looks something like this:

Python
class AsyncIterator:
    def __aiter__(self):
        return self

There isn’t much to this implementation. You define the method as a regular instance method and return self, which holds the current object, and the object is the iterator.

When creating async iterables, you only need to implement the .__aiter__() method for the iterable to work. However, in this case, the method will have a more elaborate implementation that returns a proper async iterator object that yields items on demand.

In practice, you’ll often code .__aiter__() as an async generator function with the yield statement. For example, say that you want to create an async iterable to process large files. In this situation, you can end up with the following code:

Python large_file_iterable.py
import asyncio

import aiofiles

class AsyncFileIterable:
    def __init__(self, path, chunk_size=1024):
        self.path = path
        self.chunk_size = chunk_size

    async def __aiter__(self):
        async with aiofiles.open(self.path, mode="rb") as file:
            while True:
                chunk = await file.read(self.chunk_size)
                if not chunk:
                    break
                yield chunk

async def main():
    async for chunk in AsyncFileIterable("large-file.md"):
        # Process the file chunk here...
        await asyncio.sleep(0.2)
        print(chunk.decode("utf-8"))

asyncio.run(main())

In this example, the AsyncFileIterable class implements only the .__aiter__() method as an async generator function. The method opens the input file and reads it in chunks. Then, it yields file chunks on demand. With this iterable, you can process large files in chunks without blocking the script’s execution. That’s what you simulate in the script’s main() function.

Go ahead and run the script against one of your large files. To do this, update the path to your large file when you instantiate AsyncFileIterable in the main() function.

Another way to write the .__aiter__() method is to make it return an existing async iterator:

Python
class AsyncIterable:
    def __aiter__(self):
        return AsyncIterator(self)

class AsyncIterator:
    def __init__(self, iterable):
        self.iterable = iterable

    def __aiter__(self):
        return self

    async def __anext__(self):
        ...

In this example, the AsyncIterable class returns an instance of AsyncIterator from its .__aiter__() method.

The .__anext__() Method

Only async iterators need the .__anext__() method. This method should be async def because it needs to perform asynchronous operations to fetch the next item during iteration. So, the method should generally look something like this:

Python
async def __anext__(self):
    ...

The .__anext__() method must return an awaitable object. It can be a coroutine object or an object with an .__await__() method.
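To see that the requirement is about the returned awaitable and not about the async def keywords themselves, you can look at the following sketch. It uses a hypothetical Squares class whose .__anext__() is a regular method that returns a coroutine object. This is an illustration of the protocol, not a recommendation over the usual async def form:

```python
import asyncio


class Squares:
    """Hypothetical async iterator with a regular .__anext__() method."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __aiter__(self):
        return self

    def __anext__(self):
        # A plain method that returns a coroutine object, which is awaitable.
        return self._next_square()

    async def _next_square(self):
        if self.current >= self.limit:
            raise StopAsyncIteration
        await asyncio.sleep(0)
        value = self.current**2
        self.current += 1
        return value


async def main():
    return [square async for square in Squares(3)]


squares = asyncio.run(main())
print(squares)
```

The async comprehension awaits whatever .__anext__() returns, so the class works as an async iterator. In most cases, writing .__anext__() directly with async def is shorter and clearer.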

Another characteristic of .__anext__() is that it has to raise a StopAsyncIteration exception when the data stream is exhausted or consumed. This exception will tell Python to terminate the iteration process.

Here’s an example of creating an async iterator to process a large file in chunks. It works the same as the example in the previous section, but instead of using an async iterable, it uses an async iterator that implements both the .__aiter__() and .__anext__() methods:

Python large_file_iterator.py
import asyncio

import aiofiles

class AsyncFileIterator:
    def __init__(self, path, chunk_size=1024):
        self.path = path
        self.chunk_size = chunk_size
        self.file = None

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.file is None:
            self.file = await aiofiles.open(self.path, mode="rb")
        chunk = await self.file.read(self.chunk_size)
        if not chunk:
            await self.file.close()
            raise StopAsyncIteration
        return chunk

async def main():
    async for chunk in AsyncFileIterator("large-file.md"):
        # Process the file chunk here...
        await asyncio.sleep(0.2)
        print(chunk.decode("utf-8"))

asyncio.run(main())

In this example, the .__aiter__() method provides the minimal required implementation of an async iterator. It just returns self, which is the current object—the iterator itself.

Then, you define the .__anext__() method. First, you open the file asynchronously. Note that you can’t use an async with statement here because, if you did, you’d be opening and closing the file in every call to .__anext__(), and your code wouldn’t work.

Next, you await the read of a chunk from the target file. The second conditional statement checks whether the chunk holds data. If not, then you close the file and raise the StopAsyncIteration exception to signal that the data is exhausted. Finally, you return chunk, which becomes the result of awaiting the coroutine that .__anext__() returns.

In main(), you iterate over the file’s chunks and process them. Go ahead and run the script. You’ll get the same result as in the previous section.

Using Async Iterators With Other Tools

Up to this point, you’ve coded several examples of async iterators. In most cases, you’ve used the iterators in async for loops. However, there are other constructs where you can use these iterators. You can also traverse them with the built-in anext() function or a comprehension.

In the following sections, you’ll learn how to use iterators with these alternative tools.

The Built-in anext() Function

You can use the built-in anext() function to traverse an async iterator one item at a time in a controlled way. It’s particularly useful when you need more granular control over the iteration process. For example, you may need to skip a few items from the iterator before getting to the data that you want to process.

Consider the following code example where you create an async iterator to process CSV files:

Python async_csv.py
import asyncio
import csv

import aiofiles

class AsyncCSVIterator:
    def __init__(self, path):
        self.path = path
        self.reader = None

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.reader is None:
            async with aiofiles.open(self.path, mode="r") as file:
                lines = await file.readlines()
                self.reader = csv.reader(lines)
        try:
            return next(self.reader)
        except StopIteration:
            raise StopAsyncIteration

async def main():
    csv_iter = AsyncCSVIterator("data.csv")
    # Skip the headers
    await anext(csv_iter)
    # Process the rest of the rows
    async for row in csv_iter:
        print(row)

asyncio.run(main())

In this example, the AsyncCSVIterator reads a CSV file’s content once, in the first call to .__anext__(). The reading task runs asynchronously. After that, each call returns a single row.

In main(), you use anext() to skip the first row of the CSV file. This line typically contains the headers for your data. Then, you start a loop over the rest of the rows, which hold the actual data.

The anext() function can also help when you must iterate over potentially infinite async iterators. In this situation, using an async for loop may be inappropriate. Alternatively, you can use anext() in a while loop.

Consider the following async generator that yields potentially infinite integer numbers on demand:

Python inf_integers.py
import asyncio

async def async_inf_integers(start=0):
    current = start
    while True:
        yield current
        current += 1
        await asyncio.sleep(0.5)

This async generator function yields a potentially infinite stream of integer numbers. The call to asyncio.sleep() simulates an asynchronous operation here.

To process this iterator, you can use a while loop along with the anext() function instead of using an async for loop:

Python inf_integers.py
# ...

async def main(stop=5):
    generator = async_inf_integers()
    while True:
        number = await anext(generator)
        # Process the number here...
        print(number)
        if number == stop - 1:
            break

asyncio.run(main(20))

In this code snippet, you have a main() function that implements a potentially infinite while loop. The code explicitly communicates that you’re running a potentially infinite loop, which would be harder to communicate with an async for loop.

The anext() function lets you retrieve numbers from the async iterator on demand. Then, you can process the current number. Finally, you use the stop argument in a conditional to stop the loop.

Asynchronous Comprehensions and Generator Expressions

You can also use async iterators and iterables in asynchronous comprehensions. To create an async comprehension, you can use the following syntax:

  • List comprehensions: [item async for item in async_iter]
  • Set comprehensions: {item async for item in async_iter}
  • Dictionary comprehensions: {key: value async for key, value in async_iter}

These comprehensions look like regular comprehensions. The only difference is that you need to use the async for construct, and the async_iter object should be an asynchronous iterator, iterable, or generator.

To illustrate how async comprehensions work, consider the following example:

Python async_comp.py
import asyncio

async def async_range(start, end):
    for i in range(start, end):
        await asyncio.sleep(0.2)
        yield i

async def main():
    number_list = [i async for i in async_range(0, 5)]
    number_dict = {i: str(i) async for i in async_range(0, 5)}
    print(number_list)
    print(number_dict)

asyncio.run(main())

In the first line of main(), you use a list comprehension to generate five integer numbers using the async_range() generator function. In the second line, you create a dictionary comprehension using the numbers as keys and their string representation as values.

When you run the example, you’ll have to wait for the code to complete, and then you’ll get the following output on your screen:

Shell
$ python async_comp.py
[0, 1, 2, 3, 4]
{0: '0', 1: '1', 2: '2', 3: '3', 4: '4'}

Both comprehensions work as expected. You can play around with other examples and generate a set of numbers, a list of squares, and so on.
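As a starting point for those experiments, here's a short sketch that builds a list of squares and a set of even numbers. It reuses the same async_range() generator with the delay set to zero to keep the example fast:

```python
import asyncio


async def async_range(start, end):
    for i in range(start, end):
        await asyncio.sleep(0)
        yield i


async def main():
    squares = [i**2 async for i in async_range(0, 5)]
    # Async comprehensions support filtering conditions too.
    evens = {i async for i in async_range(0, 10) if i % 2 == 0}
    return squares, evens


squares, evens = asyncio.run(main())
print(squares)
print(evens)
```

As with regular comprehensions, you can transform each item in the expression and filter items with a trailing if clause.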

Finally, you can also create asynchronous generator expressions with the following syntax:

Python
(item async for item in async_iter)

Async generator expressions are similar to async comprehensions, but the difference is that they use parentheses instead of other brackets. In this case, instead of a list, set, or dictionary, you get an async generator iterator. Then, you can use this iterator as you’d use any other async iterator, for example, in an async for loop or with anext().

Async Iterators in Concurrent Code

Asynchronous iterators shine when used in asynchronous apps that perform several other asynchronous tasks apart from just async iteration. In these situations, your async for loops can give control back to the app’s event loop so that it can run other tasks while waiting for time-consuming tasks to complete.

In the end, the purpose of asynchronous code is to allow you to execute multiple operations concurrently instead of sequentially, making your code more efficient and preventing unresponsive programs.

Up to this point, your code examples only show apps that loop over async iterables or iterators and don’t run other async tasks. This practice doesn’t make much sense because async for loops don’t run the iteration concurrently but sequentially. In other words, an async loop iterates over an item. When that iteration finishes, then the loop starts the next iteration, and so on, until it consumes the data.

The real benefit of an async loop in terms of efficiency comes when you run other asynchronous tasks while the loop is running a long-lasting iteration.

To illustrate this situation with an example, say that you have an AsyncCounterIterator that increments a count asynchronously. Here’s the code for this class:

Python counter.py
import asyncio
from random import randint

class AsyncCounterIterator:
    def __init__(self, name="", end=5):
        self.counter = 0
        self.name = name
        self.end = end

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.counter >= self.end:
            raise StopAsyncIteration
        self.counter += 1
        await asyncio.sleep(randint(1, 3) / 10)
        return self.counter

This counter increments the count using asyncio.sleep() to simulate awaitable objects with a random execution time.

In the code below, you create a task() function that iterates over an input async iterator and prints a message to the screen. The main() function calls task() twice. Each time, you pass a new instance of your iterator with a different name. Finally, you run the event loop as usual:

Python counter.py
# ...

async def task(iterator):
    async for item in iterator:
        print(item, f"from iterator {iterator.name}")

async def main():
    # This code runs sequentially:
    await task(AsyncCounterIterator("#1"))
    await task(AsyncCounterIterator("#2"))

asyncio.run(main())

In this example, the await statements run sequentially, which means that the second statement runs only after the first one has finished:

Shell
$ python counter.py
1 from iterator #1
2 from iterator #1
3 from iterator #1
4 from iterator #1
5 from iterator #1
1 from iterator #2
2 from iterator #2
3 from iterator #2
4 from iterator #2
5 from iterator #2

As you can conclude from this output, the calls to task() run sequentially. This means that your program can’t run a task from the second loop while the first loop is running. The ideal behavior would be that the first loop’s execution doesn’t block the execution of the second loop.

To fix this issue and make the code work concurrently, you can do something like the following:

Python counter.py
# ...

async def main():
    # This code runs concurrently:
    await asyncio.gather(
        task(AsyncCounterIterator("#1")),
        task(AsyncCounterIterator("#2")),
    )

asyncio.run(main())

In this update of your counter.py script, you use the asyncio.gather() function to run awaitable objects concurrently.

Now, when you run your script, you get an output similar to the following:

Shell
$ python counter.py
1 from iterator #1
2 from iterator #1
1 from iterator #2
2 from iterator #2
3 from iterator #1
3 from iterator #2
4 from iterator #1
4 from iterator #2
5 from iterator #1
5 from iterator #2

Note that the script now produces items from each task concurrently. This means that the first task doesn’t block the second task’s execution. This behavior can make your code more efficient in terms of execution time if the running tasks are I/O-bound and non-blocking operations.
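To get a rough feel for the difference in execution time, you can time both approaches. The sketch below is a simplified experiment with hypothetical helper names and a fixed delay of 0.05 seconds per item instead of the random delays in counter.py. The exact timings will vary from run to run and from machine to machine:

```python
import asyncio
import time


async def ticks(count, delay):
    for i in range(count):
        await asyncio.sleep(delay)
        yield i


async def consume(count, delay):
    return [i async for i in ticks(count, delay)]


async def run_sequentially():
    # The second loop starts only after the first one has finished.
    return [await consume(3, 0.05), await consume(3, 0.05)]


async def run_concurrently():
    # Both loops wait for their items at the same time.
    return await asyncio.gather(consume(3, 0.05), consume(3, 0.05))


async def timed(function):
    start = time.perf_counter()
    result = await function()
    return result, time.perf_counter() - start


async def main():
    sequential = await timed(run_sequentially)
    concurrent = await timed(run_concurrently)
    return sequential, concurrent


(seq_result, seq_time), (con_result, con_time) = asyncio.run(main())
print(f"Sequential: {seq_time:.2f} s")
print(f"Concurrent: {con_time:.2f} s")
```

Both versions produce the same items, but the concurrent one should finish in roughly half the time because the two loops overlap their waiting periods.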

Conclusion

Now you know how to write asynchronous iterators and iterables in Python. Asynchronous iterators are what Python uses to control async for loops, while asynchronous iterables are objects that you can iterate over using an async for loop or an async comprehension. You can also advance an async iterator one item at a time with the built-in anext() function.

With async iterables and iterators, you can write non-blocking loops in your asynchronous code. This way, you can perform different tasks asynchronously.

In this tutorial, you’ve learned how to:

  • Differentiate async iterators and iterables in Python
  • Create async generator expressions and generator iterators
  • Write async iterators and iterables using .__aiter__() and .__anext__()
  • Use async iterators in async loops and comprehensions

With this knowledge, you can start creating and using asynchronous iterators and iterables in your code, making it faster and more efficient.

About Leodanis Pozo Ramos

Leodanis is an industrial engineer who loves Python and software development. He's a self-taught Python developer with 6+ years of experience. He's an avid technical writer with a growing number of articles published on Real Python and other sites.
