
What Is asyncio?

This lesson covers what Async IO is and how it’s different from multiprocessing. You’ll see what happens when your code makes requests, how the requests are dispatched to the outside world through the event loop, how responses are handled, and how asyncio ties into all this.

00:00 So, what is asyncio actually doing? Well, let’s juxtapose or compare it to the multiprocessing library. In multiprocessing, what you’re doing is—something is slow, so you’re effectively creating copies. You’re creating multiple processes, so instead of just running your code in one process, you run it in two processes, three processes, four processes.

00:21 And these processes usually run on cores on your CPU, so if you have an 8-core CPU and you have eight processes, then you could run, say, one process on each core.

00:33 In that case, everything would be running at the same time, in parallel.

00:39 But, like I mentioned, if your app is IO-bound, if you’re doing a lot of IO processing, then instead of using multiprocessing, what you can do is use asyncio.

00:49 Now, what asyncio does is this: it’s only one process and one thread within one process, so it’s effectively doing just one thing at a time.

00:58 What happens is you’re going to send out a bunch of requests inside your Python code. So, imagine this is your Python code here, and let’s say you write to the file system, write to the file system, query the database, query the database, and make an HTTP request, make another HTTP request—so, these are all sorts of different work requests that you’re going to do from your Python code. But you’re going to do them one at a time. You’re going to say “Do this, now do the next thing, now do the next thing, now do the next thing.” So inside of here, they’re going to run one at a time—sequentially.

01:30 So, they’re going to leave and they’re going to go into this thing called the event loop, and this event loop is inside of Python too. Now, what the event loop is going to do is it’s going to send out these requests—these requests are coming into the event loop—it’s going to send them out to the external world, meaning to the file system. So you could write something to the file system, to some external database, to some external computation—to something external to your Python program. It’s going to send it out, okay?

01:57 So all these things are being sent out. Now, once they’re sent out, they could potentially be running in parallel then. So, they’re out, they’re running, the database is querying some stuff, the file system is writing some stuff, you’re sending some requests to a website, and then these responses come back. Responses—maybe from a file system, maybe you’re reading a file. Maybe the response comes back from a database, maybe you’re querying a database, et cetera.

02:18 So, these responses come back, and once they come back, you’re going to get an operation complete from the operating system, and the operating system is going to let Python know. It says, “Hey, this thing is finished.” Then the event loop is going to remember, “Oh, you sent out five requests.” And so once a response comes back in, it can take that response and match it to its request. Let’s say you did a request for “Read something from the file system.” Once that was finished, the response comes back from the file system saying, “Hey, this is finished.” And then it triggers what’s called a callback, and the event loop calls into Python. It says, “Hey, Python. This file has now been read,” or “This database query has now been fulfilled,” or “This external computation is finished.” Then your code continues running sequentially in here.
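The "operation complete, then callback" step can be sketched with a done-callback on an asyncio task. This is only an illustration: `read_file_simulated` and `on_done` are hypothetical names, and `asyncio.sleep` stands in for real file I/O.

```python
import asyncio


async def read_file_simulated(name: str) -> str:
    # Hypothetical stand-in for real I/O: sleep instead of reading a file.
    await asyncio.sleep(0.01)
    return f"contents of {name}"


async def main() -> list:
    completed = []

    def on_done(task: asyncio.Task) -> None:
        # The event loop calls this once the task has finished.
        completed.append(task.result())

    task = asyncio.create_task(read_file_simulated("data.txt"))
    task.add_done_callback(on_done)
    await task              # pause main() until the task finishes
    await asyncio.sleep(0)  # give the loop one more pass, to be safe
    return completed


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In everyday asyncio code you rarely register callbacks yourself. Awaiting the task is usually enough, and the loop resumes your coroutine for you.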

03:05 So, basically, the request gets sent out, it goes into the event loop, the event loop fires it off. It runs in whatever manner necessary—it could be in parallel, it could not be parallel.

03:15 It doesn’t really matter from Python’s perspective. But typically it runs pretty fast. And then once the response comes back, the event loop lets Python know that this thing is finished running and then it continues running.
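As a rough sketch of the flow described so far, the snippet below sends out three simulated requests and waits for all of them. The names `fake_request` and `main` are made up for illustration, and `asyncio.sleep` stands in for the file system, the database, and the HTTP call. Because the waits overlap, the total time should be close to the longest single wait rather than the sum.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a file write, DB query, or HTTP call.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> tuple:
    start = time.perf_counter()
    # All three requests are handed to the event loop before any finishes.
    results = await asyncio.gather(
        fake_request("write file", 0.2),
        fake_request("query database", 0.2),
        fake_request("HTTP request", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results)
    print(f"took about {elapsed:.2f} seconds")
```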

03:25 But something interesting happens when the request goes out. You say, for example, “Read something from a file.” Well, what Python is going to do is it’s actually going to pause the execution of that function, and that function is called a coroutine, and we’ll be talking about coroutines later, but it pauses the execution of that coroutine until later on when the event loop says, “Hey, this thing is finished!” and then the function, or that coroutine, gets reentered and it resumes execution. So, once again, this sounds a little bit confusing, but I’m going to be showing you some code in a minute and hopefully it’ll be less confusing then. But just once again, as an overview, these requests happen in Python, they happen one at a time, they go into the event loop, and the event loop sends out the requests. And, like I said, once the requests are sent out, they could potentially be running in parallel.
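The pause-and-resume behavior can be made visible with a small log. This is a minimal sketch, assuming two hypothetical coroutines named A and B whose waits are simulated with `asyncio.sleep`:

```python
import asyncio


async def worker(name: str, delay: float, log: list) -> None:
    log.append(f"{name}: request sent")
    await asyncio.sleep(delay)  # the coroutine is paused here
    log.append(f"{name}: resumed")


async def main() -> list:
    log = []
    # A is started first but waits longer than B.
    await asyncio.gather(
        worker("A", 0.05, log),
        worker("B", 0.01, log),
    )
    return log


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

Both requests are sent before either coroutine resumes, and B resumes before A because its simulated response arrives first.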

04:20 So this could be the parallel section here. And then the response comes back and then this event loop notifies Python, “Hey, something has just finished and you may want to continue running.” Okay?

04:31 So, Python runs these things one at a time, but out here in the real world, it could be running in parallel. So that’s what it does. Once again, I think if I show you some of the code for this, it’ll help this gel a little bit. And I would, if I were you, come back to this slide and kind of just like refer to this slide as you’re going through the code, and then maybe it’ll make these two things gel together. Okay. With that, let’s start coding.


usuarioman on Feb. 5, 2021

If I understand it correctly (I hope), I added some things to the diagram. Some feedback about its validity would be appreciated.


I have some more questions about the execution model presented in the diagram. My confusion arises from the distinction between multithreading and coroutines. Please let me know if I am wrong and why:

In the diagram, the request is made by a coroutine and managed by the event loop. Instead of waiting for the response, the coroutine voluntarily stops its own execution and passes/yields control to another coroutine or subprocess. This process is known as cooperative multitasking.

So far so good, but the following doubts don’t let me sleep (lol):

  • Is the coroutine aware of the external world, or does it just not care?
  • How does context switching happen between coroutines? I think the OS is in charge of this in preemptive multitasking, and that only applies to threads.
  • What happens if the coroutine doesn’t yield for any reason?
  • Finally, coroutine switching takes place between what? I know, a coroutine, but … is this a process or a subprocess? I think it’s between subprocesses, therefore I assume that coroutines share the same resources.

So what is the difference with multithreading? Hahaha, this is so difficult lol.


Bartosz Zaczyński RP Team on Feb. 9, 2021

@usuarioman Don’t worry. It can be challenging to wrap your head around coroutines when you first learn about them. Let me try to answer your questions.

Is the coroutine aware of the external world or doesn’t care?

A coroutine is a piece of code enclosed in a function that can halt and resume its execution multiple times. Unless you provide some context through an argument or a global variable, it won’t know anything about the external world.

How does context switching happen between coroutines? The OS is in charge of this in preemptive multitasking and only applies to threads.

Coroutines don’t fall under the category of preemptive multitasking. They use cooperative multitasking, which means they voluntarily give up (or not) CPU time. You decide when and how often to switch the context by using the yield keyword (or await in modern Python) followed by another coroutine or awaitable. It makes the code much easier to follow and debug because context switching points are deterministic and known upfront.

What happens if the coroutine doesn’t yield for any reason?

In the old days, you had to use frameworks such as Twisted or Tornado to artificially turn generator functions into coroutines since they weren’t natively supported by the language at the time. Such functions had to use the yield keyword inside their body to become coroutines. Otherwise, they were just regular functions.
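To give a feel for the generator-based style, here is a toy sketch that drives plain generators in round-robin order. It isn’t how Twisted or Tornado work internally; `counter`, `run_round_robin`, and `demo` are hypothetical names used only for illustration.

```python
from collections import deque


def counter(name, n, log):
    # A plain generator used as a coroutine: each bare `yield`
    # hands control back to whoever is driving it.
    for i in range(n):
        log.append(f"{name} step {i}")
        yield


def run_round_robin(generators):
    # Toy scheduler: advance each generator one step at a time.
    queue = deque(generators)
    while queue:
        gen = queue.popleft()
        try:
            next(gen)
        except StopIteration:
            continue  # this generator is finished, drop it
        queue.append(gen)


def demo():
    log = []
    run_round_robin([counter("A", 2, log), counter("B", 2, log)])
    return log


if __name__ == "__main__":
    print(demo())
```

Without the `yield`, `counter` would be an ordinary function that runs all its steps in one go, so the two workers could never interleave.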

Today, you can declare a coroutine by marking a function with the async keyword:

async def main():
    print("hello world")

In most cases, you’ll want to use its twin brother, the await keyword, to “yield” or halt the execution and switch the context to another coroutine without blocking the main thread, for example:

import asyncio

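async def main():
    await asyncio.sleep(1)  # pause here and let the event loop do other work
    print("hello world")

asyncio.run(main())  # start the event loop and run main() to completion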

However, awaiting another coroutine isn’t mandatory. It’s just that such coroutines won’t take advantage of asynchronous execution provided by the event loop.
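A small sketch of what that means in practice, using hypothetical coroutines `hog` (never awaits) and `polite` (awaits on every step). Once `hog` starts, the event loop can’t switch away from it until it returns:

```python
import asyncio


async def hog(log: list) -> None:
    # No await inside: once started, this runs to completion
    # without giving the event loop a chance to switch.
    for i in range(3):
        log.append(f"hog {i}")


async def polite(name: str, log: list) -> None:
    for i in range(3):
        log.append(f"{name} {i}")
        await asyncio.sleep(0)  # explicit point where control is given up


async def main() -> list:
    log = []
    await asyncio.gather(polite("P1", log), hog(log), polite("P2", log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In the resulting log, the three `hog` entries sit next to each other, while the `P1` and `P2` entries take turns.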

Coroutine switching takes place between what?

Whether you’re talking about processes, threads, or coroutines, context switching boils down to allocating CPU time to one of them. Processes and threads are typically managed by the operating system’s scheduler, which forcefully halts their execution, saves their state, and takes away the CPU for a fraction of time. The exact moment of a context switch, its duration, and the next process or thread to resume execution remain unknown.

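Coroutines, on the other hand, voluntarily give up CPU time at predictable moments, which are the points where they await something. There’s no need for the operating system’s scheduler, because in asyncio the event loop resumes each coroutine once whatever it was waiting for is ready.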

So what is the difference with multithreading?

Both threads and coroutines let you run code concurrently. Even though threads are a system-level resource, which could take advantage of multiple CPU cores, the Global Interpreter Lock (GIL) in CPython lets only one thread execute Python bytecode at a time. Therefore, neither threads nor coroutines can run Python code truly in parallel. They’re mostly useful for so-called I/O-bound tasks, which involve a lot of waiting for data over the network or disk. While one thread or coroutine is waiting, another one can make better use of the CPU.

Both share memory within a single process, making them more lightweight than full-blown OS processes. However, threads are more heavyweight than coroutines. Typically, you can have thousands of I/O-bound threads in a single process and tens of thousands of coroutines to handle HTTP requests, for example.
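To get a feel for how lightweight coroutines are, the sketch below starts ten thousand of them at once. The name `handle_request` is hypothetical, and `asyncio.sleep` simulates the network wait of an HTTP request.

```python
import asyncio


async def handle_request(i: int) -> int:
    await asyncio.sleep(0.1)  # simulated network wait
    return i


async def main(n: int = 10_000) -> int:
    # Every coroutine waits at the same time inside a single thread.
    results = await asyncio.gather(*(handle_request(i) for i in range(n)))
    return len(results)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Starting the same number of OS threads would typically cost far more memory, which is the trade-off described above.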

The downside of using coroutines is that they shouldn’t call blocking operations, because a blocking call stalls the event loop and every other coroutine along with it. Most of the standard library in Python consists of blocking functions. You need to be careful to use their non-blocking counterparts if available or find a third-party library as a substitute. If that fails, you might wrap a blocking call in a thread.
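One way to wrap a blocking call in a thread is `asyncio.to_thread()`, available since Python 3.9. The sketch below assumes a hypothetical `blocking_io` function, with `time.sleep` standing in for a blocking library call, and a `ticker` coroutine that keeps running in the meantime.

```python
import asyncio
import time


def blocking_io() -> str:
    time.sleep(0.2)  # blocks the thread it runs in
    return "blocking call finished"


async def ticker(log: list) -> None:
    for i in range(3):
        log.append(f"tick {i}")
        await asyncio.sleep(0.05)


async def main() -> tuple:
    log = []
    # The blocking function runs in a worker thread, so the event
    # loop stays free to keep the ticker going.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_io),
        ticker(log),
    )
    return result, log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

On older Python versions, `loop.run_in_executor()` serves a similar purpose.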


usuarioman on Feb. 21, 2021

W0o0ow … thank you a lot for your answer, this is amazing !! Thanks, I have a better understanding now, you are the best!
