Threads in Python
00:00 In the previous lesson, I introduced you to the concept of concurrency and different patterns it can take. In this lesson, I’ll be talking about threads in Python.
00:10 As I showed you in the lesson on latency, most programs spend a lot of their time waiting for input and output. Threads allow you to time slice your computation. While one thread is waiting for input, another thread can continue to do processing work. Threads work within the Python interpreter, and therefore with the GIL.
00:29 Significant speed-up can be obtained if your software does a lot of disk or network activity.
00:37 All of the software that I demonstrate in this course is available in the supporting materials dropdown if you want to follow along. In order to demonstrate the difference that threading can make, I need something to compare against, so I’m going to start with a synchronous version of a small program.
This program pulls down two different web pages many, many times. Line 14 is where you’ll find the key entry point to the code. This function, called
download_all_sites(), takes a list of sites, loops over each of the URLs in the list, and calls the
download_site() function. download_site(), defined on line 8, gets a session from
requests, which is the library I’m using to download web pages. On line 10, it fetches the content. To give you a little bit of clarity about what it’s doing, I’m printing out either a
"J" or an
"R", depending on which website is being read.
The definition of the
get_session() function is a little bit of overkill in this case. You probably wouldn’t do it this way in reality, but it’s necessary for the
threading library, so to keep the code consistent I’ve done it this way.
01:45 Let me scroll down so that you can see how this program is called.
The list in line 21 is made up of 80 copies of the two different websites that the
requests library is going to fetch. Line 26 tells you that it’s starting. Line 27 starts a timer. Line 28 is the meat, where it actually downloads all of the sites inside of the
sites list. Line 29 calculates how long it took for this to run. And then line 30 prints out some statistics. Let’s see this program in action.
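Based on the narration above, the synchronous program might look roughly like the following sketch. The URLs, the main() wrapper, and the exact layout are my reconstruction, not the lesson’s file verbatim:

```python
import time

import requests  # third-party: pip install requests


def get_session():
    # A Session reuses the underlying TCP connection between requests,
    # which is much faster than opening a new connection per URL.
    return requests.Session()


def download_site(url, session):
    with session.get(url) as response:
        # Print a one-letter marker so you can watch which site responded.
        marker = "J" if "jython" in url else "R"
        print(marker, end="", flush=True)


def download_all_sites(sites):
    with get_session() as session:
        for url in sites:
            download_site(url, session)


def main():
    sites = ["https://www.jython.org", "https://realpython.com"] * 80
    start = time.time()
    download_all_sites(sites)
    duration = time.time() - start
    print(f"\nDownloaded {len(sites)} sites in {duration:.2f} seconds")
```

Calling main() kicks off all 160 downloads, one after another.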
02:21 There are 160 URLs being downloaded, 80 from Jython and 80 from Real Python.
The J and R indicate whether the Jython site or the Real Python site is being downloaded from. This synchronous program alternates between the two sites.
02:37 The end result is that 160 downloads ran in about 14 seconds. I’ve run this program several different times. The wait time for it varies wildly, and a lot of that depends on how quickly the sites respond and how quickly the network interface on my computer decides to respond.
02:55 I’ve seen execution times for this program as much as triple this run’s.
And now for a threaded version. First off, you’re going to need two more imports. Both are part of the standard library. Line 2 introduces
concurrent.futures and line 4, the
threading library. Line 7 sets up the local environment for each of the threads. I’ll describe more of this later.
The download_site() function on line 15 hasn’t changed. It’s the same as before. But the definition of
get_session() has to be a little different. Line 9 defines the function
get_session(). Inside of this function, line 11 gets the
Session from the
requests library, but only if a Session hasn’t already been attached to
thread_local. The combination of the thread-local environment set up in line 7 and the assignment of the
requests.Session to that environment in line 11 allow you to change the number of threads in the program and not break anything.
This ensures that there’s only one
requests.Session per thread. Now let’s see the
download_all_sites() function. It’s changed a little. The
concurrent.futures library includes a class called
ThreadPoolExecutor. This is what determines how many threads there are.
You can instantiate this as a context manager using the
with statement. And then the
executor has a
.map() method mapping a function to some data.
Each of the URLs in the
sites listing gets mapped to a function and the thread executor determines when that function is called for which thread. Varying the number of
max_workers in the execution definition will change how many threads are active at the same time. As a function finishes, the thread will be put back into the pool and the executor will then assign the next piece of data to the next available thread.
04:59 Let me scroll down to show you the calling.
05:04 This is no different. So, with some minor modifications to this script, I’ve changed it from being synchronous to threaded. Now let me show you this in action.
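Putting those pieces together, the threaded version might look roughly like this sketch. Again, the URLs and exact layout are my reconstruction from the narration, not the lesson’s file verbatim:

```python
import concurrent.futures
import threading
import time

import requests  # third-party: pip install requests

# One namespace per thread: attributes stored here by one thread
# are invisible to every other thread.
thread_local = threading.local()


def get_session():
    # Lazily create one Session per thread, the first time that thread asks.
    if not hasattr(thread_local, "session"):
        thread_local.session = requests.Session()
    return thread_local.session


def download_site(url):
    session = get_session()
    with session.get(url) as response:
        print("J" if "jython" in url else "R", end="", flush=True)


def download_all_sites(sites):
    # max_workers caps how many threads run at once; .map() hands each
    # URL to the next idle worker thread.
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        executor.map(download_site, sites)


def main():
    sites = ["https://www.jython.org", "https://realpython.com"] * 80
    start = time.time()
    download_all_sites(sites)
    print(f"\nDownloaded {len(sites)} sites in {time.time() - start:.2f} seconds")
```

Note that get_session() returns the same Session on every call within one thread, but a different Session in each thread.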
05:18 Wow. That’s significantly faster than before—almost ten times. To be honest, this is kind of lucky. That’s one of the best times I’ve seen. Let me try it again, just to show you.
Not as good this time. 7 and a half seconds is still impressive though. That’s still almost twice as fast as the synchronous program. One thing to notice here is the pattern of the J and
R output. In the synchronous program, it was always a strict alternation of J and
R. In this program, it isn’t, and that’s because the threads are waiting different amounts of time. As Jython or Real Python is more or less responsive, the threads are executing at different rates. At any given time the executor makes sure that only
5 of them are running, but the order that the
download_site() function finishes in is going to be dependent on the network and the server on the other end.
The threaded version of the program was using the N-workers pattern that I introduced in the previous lesson.
download_all_sites() is the producer.
It is what manages the list of sites that need to be done. The
download_site() function acts as the worker and, in this case,
concurrent.futures is dictating that there are five workers at a time.
06:43 And then finally, the executor acts as a collection point. It waits until all of the threads are finished, and once they are, the pool passes execution back and the program continues as before.
In this case, the
print('Downloaded') gets called. To be picky about it, this program technically doesn’t have a consumer. The
download_site() function is throwing out the data and not really doing any computation, so there was nothing to be passed on to the consumer.
07:10 There’s just a collection point where the synchronous program resumes.
07:16 In the previous lesson when I described the GIL for you, I mentioned race conditions. These are something that you have to be very careful with inside of threads.
The threading library acts inside of the Python interpreter. All of the memory is shared across all of the threads. Consider a case where there are two threads using a single Session object.
07:38 Thread 1 starts downloading from Jython but then gets interrupted. Thread 2 then starts downloading from Real Python but the session object from the first thread wasn’t finished.
This is going to cause the
requests library to fail. One solution to this is to use a low-level mechanism called locking. You manage your resources and lock them so that only one thread can use a resource at a time.
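As a minimal sketch of that idea (a shared counter protected by a lock; the example and its names are mine, not from the lesson):

```python
import threading

counter = 0
counter_lock = threading.Lock()


def increment(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may hold the lock, so the
        # read-modify-write below can't interleave with another thread's.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000
```

The lock guarantees the final count, at the cost of some bookkeeping that you have to get right yourself.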
08:04 This kind of locking is exactly what the GIL does, but at the global level inside of the interpreter. Your code can have the same problem at the level of its own objects. Fortunately, Python comes with a library mechanism that makes this easier.
This is the
threading.local() method that you saw on line 7 of the code. It looks like a global variable, but it isn’t. The
threading library is creating a locked space for your objects that are created once per thread. In the
get_session() method, a new
requests.Session object was created inside of this thread-local space.
This guaranteed that each thread got its own
requests.Session object, and also means that you don’t end up with 160
requests.Session objects for your 160 URLs.
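To see that per-thread isolation in a self-contained example (the variable names here are mine, not from the lesson), you can set an attribute on a threading.local object in the main thread and check whether worker threads can see it:

```python
import threading

local = threading.local()
local.value = "set in the main thread"

results = {}


def worker(name):
    # The attribute set by the main thread is invisible here: each new
    # thread starts with an empty threading.local namespace.
    results[name] = hasattr(local, "value")
    local.value = name  # this assignment is visible only to this thread


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(all(seen is False for seen in results.values()))  # True
print(local.value)  # still: set in the main thread
```

Each thread reads and writes its own private copy of local.value, even though they all share the one local object.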
In the example code I showed you, the max number of workers was set to
5. There were only five threads happening at a time. This was done on purpose.
09:02 Although you’re downloading 160 URLs, you probably don’t want 160 threads. There’s overhead for creating threads. There’s also overhead for switching between the threads.
09:15 If you have too many threads, that means your code spends all of its time managing the threads. So, how do you know how many threads to have? Well, unfortunately it’s not an easy answer and it’s going to be dependent on how I/O-bound each of your threads are, so you may want to experiment a little bit based on your program.
09:34 An extremely common pattern in GUI software is for there to be a thread for the GUI itself, and another thread for execution behind the scenes. This ensures that the GUI is always responsive to the user and any expensive computation is done on a separate thread.
If you’re coming from another programming language or you’ve seen the Python threading mechanisms before, you might be wondering about the primitives. The Python
threading library also supports the typical thread primitives:
.start() is responsible for launching a thread and calling its target function,
.join() is the point in the program that waits for all the threads to finish, and
Queue is a thread-safe mechanism for communicating between threads. Python has these primitives, but introduced the
concurrent.futures library in order to minimize the amount of code that you have to write when managing threads.
As you saw in the sample program, an
Executor is responsible for managing threads. It maps data to a function and then farms those function calls out to a managed pool of threads, abstracting away the complexity of coordinating through a
Queue. This library was first introduced in Python 3.2, so if you’re using something older than that, you’ll have to stick with the basic primitives. But if you’re using 3.2 or later, you’re better off looking at the concurrent.futures library.
11:00 So, that’s threading in Python! Next up, I’m going to show you what can cause race conditions and how they’re problematic.
I’m wondering what the threaded version of the program would look like if we used threading primitives instead of the executor. How would the
download_all_sites function change?
Without the executor, you’d need to write code that creates an individual thread (or 5 of them if you were doing the exact same thing) and then start the thread. You’d also need to do a join on all of them afterwards.
The key part would be how you distribute the data amongst those threads – you could just put a thread constructor in a loop, creating 5 instances, but you’d be missing the mapping to the subset of sites. You likely would slice the list of sites, giving the first thread 1/5 of the work.
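As a sketch of that answer, here is one way to do it with the primitives. I’m using a Queue to hand out URLs instead of slicing the list, and a caller-supplied work function standing in for download_site(); both of those are my choices, not code from the lesson:

```python
import queue
import threading


def download_all_sites(sites, work, num_workers=5):
    # Load every URL into a thread-safe queue; each worker pulls the
    # next URL whenever it finishes one, so faster threads do more work.
    tasks = queue.Queue()
    for url in sites:
        tasks.put(url)

    def worker():
        while True:
            try:
                url = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained: this worker is done
            work(url)

    # Create the five threads by hand...
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()  # ...start each one...
    for t in threads:
        t.join()   # ...and wait here until they have all finished.
```

Calling download_all_sites(sites, download_site) would reproduce the executor version’s behavior; the ThreadPoolExecutor simply does all of this bookkeeping for you.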
To see code examples without the executor, take a look at the following article: