Thread Pool

In this lesson, you’ll refactor your previous code by using a thread pool executor from the concurrent.futures module. If you download the sample code, you can get your own copy of 07-thread_pool.py:

To learn more, you can also check out the documentation for concurrent.futures.ThreadPoolExecutor and concurrent.futures.Executor.map.
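In case it helps to see the pattern from the lesson in isolation, here is a minimal sketch of `ThreadPoolExecutor` with `Executor.map` (the worker function and numbers are just illustrative):

```python
import concurrent.futures
import time

def square(n):
    # Simulate a small amount of blocking work per task.
    time.sleep(0.1)
    return n * n

if __name__ == "__main__":
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # map() submits each item to the pool and yields results in input order.
        results = list(executor.map(square, range(5)))
    print(results)  # [0, 1, 4, 9, 16]
```

With three workers and five tasks, the sleeps overlap, so the whole batch finishes in roughly two sleep intervals instead of five.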

Comments & Discussion

nightfury on Dec. 26, 2019

Hi,

Would this be a cool way to find the number of threads supported by an OS? Out of curiosity, I modified the code a bit to see if it crashes beyond a certain point:

import threading
import time
import concurrent.futures

def my_func(name):
    print(f'my_func started with {name}')
    time.sleep(5)
    print(f'my_func ended with {name}')

if __name__ == '__main__':
    max_workers = 5000
    print('Main started')

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as e:
        arg_list = ['func_' + str(i + 1) for i in range(max_workers)]
        e.map(my_func, arg_list)

    print('Main ended')

On my system (running macOS), the code crashes beyond max_workers=2048 with the following error:

RuntimeError: can't start new thread

Any comments on what's happening here?

Lee RP Team on Jan. 12, 2020

Hey @nightfury, that's cool :) I imagine there's a per-process limit on the number of threads the OS can create. Looks like it's 2048 on yours. I just ran it on my machine, and it stopped at 4096.
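One way to probe that limit without pushing `max_workers` ever higher is to start bare threads until `Thread.start()` raises `RuntimeError`, which is the same error the pool hits. A minimal sketch (the artificial `cap` keeps the probe cheap; you would raise it to actually reach the OS limit):

```python
import threading
import time

def count_startable_threads(cap=64):
    """Start threads until the OS refuses or an artificial cap is hit."""
    threads = []
    try:
        for _ in range(cap):
            t = threading.Thread(target=time.sleep, args=(0.2,))
            t.start()  # Raises RuntimeError once the per-process limit is hit.
            threads.append(t)
    except RuntimeError:
        pass
    finally:
        # Let every successfully started thread finish cleanly.
        for t in threads:
            t.join()
    return len(threads)

if __name__ == "__main__":
    print(count_startable_threads())
```

Note that the practical limit depends on the OS, available memory, and per-thread stack size, so results will vary between machines.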

Ahmed on April 16, 2020

Hello Lee,

I am trying to make my code run faster. I have 6,000 product SKUs in a file, and I make an API call for each product. This takes about 20 to 30 minutes to finish all the API calls. Can you recommend a faster way to do this? Below is my code.

import os
import csv
import json
import time
import requests
import concurrent.futures


product = []
parent = "ProductFiles"
filename3 = os.path.join(parent, 'product_info.csv')
file3 = open(filename3)
wrapper = csv.reader(file3)
for row in wrapper:
    product.append(row[0])


def get_product_info():
    product_response = {}

    for row in product[1:]:
        url = url

        response = requests.request("GET", url)
        response = json.loads(response.text)

        if product_response is None:
            product_response[row] = response
        else:
            product_response.update({row:response})

    print(product_response)


if __name__ == '__main__':
    start = time.time()
    with concurrent.futures.ThreadPoolExecutor() as e:
        for i in product:
            e.map(get_product_info(), i)

    end = time.time()
    print("Total Time: "+str(end - start))

A few questions:

1. If I don't pass any number to ThreadPoolExecutor, how many threads does it start? That is, what's the default number of threads?

2. How do I return values from all the threads and gather them, or append them to the same variable?
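For what it's worth: on Python 3.8+, `ThreadPoolExecutor` defaults `max_workers` to `min(32, os.cpu_count() + 4)`, and `Executor.map` already collects each call's return value in input order, so no extra gathering step is needed. You also pass the function itself to `map()` (not a call) plus the iterable of arguments. A sketch under those assumptions, where `fetch_product_info` is a stand-in for the real `requests`-based API call:

```python
import concurrent.futures

def fetch_product_info(sku):
    # Stand-in for the real API call; a real version would call
    # requests.get(...) for this SKU and return the parsed JSON.
    return {"sku": sku, "status": "ok"}

if __name__ == "__main__":
    skus = ["sku_1", "sku_2", "sku_3"]
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # map() takes the function plus an iterable of arguments and
        # returns the results in input order; list() gathers them all.
        results = list(executor.map(fetch_product_info, skus))
    # Combine the per-thread results into a single dict keyed by SKU.
    product_response = {r["sku"]: r for r in results}
    print(product_response)
```

Each worker thread handles one SKU at a time, so the per-request network latency overlaps instead of accumulating serially.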

khurram703 on July 19, 2020

For me, I just checked how many threads I can create, and the number reached up to 500,000 without any issue. Is that possible, Mr. Lee?
