from pprint import pprint
from concurrent.futures import ThreadPoolExecutor
import threading

counter = 0
thread_iter = dict()

def change_counter(amount):
    global counter
    # to keep track of all the threads and total iterations each one is doing
    # along with the amount value that thread has been assigned
    if (threading.get_ident(), amount) not in thread_iter:
        thread_iter[(threading.get_ident(), amount)] = 0

    for _ in range(10000):
        thread_iter[(threading.get_ident(), amount)] += 1
        counter += amount

def race(num_threads):
    global counter
    counter = 0
    data = [-1 if x % 2 else 1 for x in range(1000)]

    # when the num_threads value is 1, the program is synchronous
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        executor.map(change_counter, data)

    pprint(thread_iter)
    print(f'counter value(ideally should be 0): {counter}')
    print(f'total iterations (1,000 x 10,000): {sum(thread_iter.values())}')

    delta = 0
    for k, v in thread_iter.items():
        delta += v if k[1] > 0 else -v
    print(f'delta (positive iterations - negative iterations): {delta}')

if __name__ == '__main__':
    # race(1)
    race(5)  # increase num_threads to consistently achieve race condition

"""
OUTPUT: 
{(1272, -1): 960000,
 (1272, 1): 1020000,
 (5704, -1): 940000,
 (5704, 1): 930000,
 (7960, -1): 1000000,
 (7960, 1): 1050000,
 (12428, -1): 1000000,
 (12428, 1): 900000,
 (12800, -1): 1100000,
 (12800, 1): 1100000}
counter value(ideally should be 0): 111490
total iterations (1,000 x 10,000): 10000000
delta (positive iterations - negative iterations): 0
"""

I tried to keep track of each thread, what amount it is using and how many iterations it is doing with that amount.

From the above output, it is clearly visible that each thread is doing different number of iterations for different amount value, but if we take the delta for iterations done for amount = 1 and amount = -1, it comes out as zero. Yet the counter value comes out as some arbitrary number instead of 0.

I am not able to make sense of this output :(

Also I am finding it difficult to walk through the program and understand exactly how the race condition is occurring.

Christopher Trudeau RP Team on Jan. 5, 2021

Hi hauntarl,

The race condition is a caused by a combination of two things: the variable counter being global and shared across all threads, and the fact that the change_counter() method can be interrupted at any time.

The crux of the matter is the assumption that “counter += amount” can’t be interrupted. If you put a semaphore lock around “counter += amount” the code would likely work fine.

I suspect what is happening is that “counter += amount” breaks down into several different instructions in the underlying virtual machine and occasionally it is being interrupted midway through those instructions.

The code you added to try to show the problem happening is only recording the state outside of the problem area: after the increment/decrement is done. To truly instrument the problem you would have to be able to inspect the value of “counter” during the “counter += amount” statement.

The reason I used two large ranges (once in the “data” variable, the second in the for-loop in change_counter() ) was to increase the likelihood of just such an interruption. If you drop those numbers down to lower values you might have to run the program a large number of times before you run into a problem – this is what makes race conditions so brutal to catch in real life.

hauntarl on Jan. 6, 2021

Thanks for the wonderful explanation, I think I finally get it.

If we try to breakdown the statement counter += amount in a 2 threaded system:

evaluation of expression counter + amount: (Read operation)

thread1: counter = 0, amount = 1, evaluation results in 1

thread2: counter = 0, amount = -1, evaluation results in -1

As there is no mutual exclusion, both the threads simultaneously read the same “counter” and are trying to add “amount” to it.

This here is a race condition as no thread is willing to wait for the other one to update the counter value after its evaluation, which results in one of them using the old counter value depending on which thread is able to perform write operation first.

assignment operation counter = evaluated value: (Write operation)

thread1: counter = 1

thread2: counter = -1

Let us assume that “thread1” is able to perform the write operation before “thread2”, the new “counter” value becomes 1, soon after “thread2” performs its own write operation and writes -1 to the “counter” Finally the counter value results in -1, instead of 0 :)

Correct me if I am wrong and feel free to provide more input.

Christopher Trudeau RP Team on Jan. 6, 2021

Hi hauntarl,

Yep, you’ve got the right idea. The specifics are slightly different, but conceptually you’ve got it.

Remember that Python is an interpreted language. Statements in Python get translated into a series of steps called byte code, it is the byte code that actually gets run by the interpreter. Why do I bring this up? Because even things that look like single statements in Python aren’t necessarily atomic.

You can see the bytes that are run by the interpreter using the “compile()” built-in and the “dis” library:

>>> counter = 0
>>> amount = 1
>>> code = compile("counter += amount", "<string>", "exec")
>>> code
<code object <module> at 0x7fe81635ec90, file "<string>", line 1>
>>> import dis
>>> dis.disco(code)
  1           0 LOAD_NAME                0 (counter)
              2 LOAD_NAME                1 (amount)
              4 INPLACE_ADD
              6 STORE_NAME               0 (counter)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

Notice here that “counter += amount” turns into 6 separate byte code operations, and the thread interruption can happen during any of these steps.

The problem is the same as you described, just worse – there are five different places that this thing that looks like a single statement can get interrupted. This only gets worse for any larger chunk of code.

Satish on April 13, 2021

“ The sum total of the multiplication factors is not the same as the synchronous addition, so you get a race condition. ” This doesn’t intuitively explain why the race condition occurs as already pointed out in earlier comments . Essentially :

Multiple threads could be reading the same state of a particular variable ( in this case counter ) .
Subsequently, out of the evaluations ( of the expression counter + amount ) being performed by multiple threads ( that would be active at that particular point in time ) , only one of the threads effectively ends up updating counter ( overwriting the value assigned by the other threads) . This in turn could lead to either a ‘deficit’ or ‘surplus’ value of counter depending on which threads ( the ones adding -1 v/s those adding +1 ) dominated the in ‘final’ assigment of counter in each group of currently active threads .

In the synchronous version, mathematical equilibrium is maintained because each evaluation of counter += amount forms the basis for the next evaluation of this statement . But , in the concurrent version , that doesn’t happen as multiple threads could be using the same ‘base’ value for starting their computation of counter + amount and as a result most of the iterations do not contribute towards maintaining the equilibrium of counter .

Of course , this is an oversimplified explanation and not accurate ( given the complexities already pointed out the author in the comments ) .

hwoarang09 on May 30, 2022

I think Python uses GIL so only one thread excutes at any one time. When GIL is applied, race condition should occur but it is… Why does a race condition occur? (Sorry for my English)

Bartosz Zaczyński RP Team on May 30, 2022

@hwoarang09 The Global Interpreter Lock (GIL) ensures that only one CPython instruction can work at any given time, effectively making it an atomic operation that can’t be interrupted by another thread. However, in many cases, you’ll want to perform a transaction comprised of more than one instruction, which should all run as an atomic operation. Without explicit locking and mutexes, you can still experience a race condition.

marcinszydlowski1984 on Aug. 3, 2022

@hwoarang09 The Global Interpreter Lock (GIL) ensures that only one CPython instruction can work at any given time, effectively making it an atomic operation that can’t be interrupted by another thread. However, in many cases, you’ll want to perform a transaction comprised of more than one instruction, which should all run as an atomic operation. Without explicit locking and mutexes, you can still experience a race condition.

However, the situation can be easily observed if the counter is stored or cached locally in the context of specific thread and update only at the end of computations. Also, the bigger values of delay is simulating, the bigger chance of encountering corrupted data.

Here are some modifications of the code

Example from this video just only with counters cached locally (usage of “local_counter”)

# io_bound/race.py
from concurrent.futures import ThreadPoolExecutor
counter = 0

def change_counter(amount):
    global counter
    local_counter = counter
    for _ in range(10000):
        local_counter += amount

    counter = local_counter

def race(num_threads):
    global counter
    counter = 0
    data = [-1 if x %2 else 1 for x in range(1000)]

    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        executor.map(change_counter, data)

    print(counter)

More advanced example for both “manual” and “ThreadPool-way” of creating and running threads; also simulating delay as a parameter

import argparse
import threading
import random
import time

from typing import List, Tuple
from concurrent.futures import ThreadPoolExecutor


counter = 0

def change_counter(data: Tuple[List[int], bool]):
    global counter

    for d in data[0]:
        local_counter = counter
        local_counter += d

        print(data[0], data[1])
        if data[1]:
            time.sleep(random.random())

        counter = local_counter

        print(f'{counter=} ({threading.current_thread().name})')


def prepare_data(num_threads: int) -> List[List[int]]:
    """
    Gets sample data for threads. Each thread has its own array of three integers multiplied by index + 1.
    Example:
        num_threads = 2, data = [[1, 2, 3], [2, 4, 6]]
        num_threads = 3, data = [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
    :param num_threads: number of threads
    :return: array of 3-number arrays for each thread
    """
    data = [[]] * num_threads

    for n in range(num_threads):
        data[n] = [(i + 1) * (n + 1) for i in range(3)]

    print(data)
    return data

def race_threads_manually(num_threads: int, simulate_delay: bool):
    data = prepare_data(num_threads)

    # Creating and running threads
    T = [None] * num_threads

    for it in range(len(T)):
        T[it] = threading.Thread(target=change_counter, args=([(data[it], simulate_delay)]), name=f'Thread {it + 1}')
        T[it].start()

    # Waiting for all to finish
    for it in range(len(T)):
        T[it].join()

    print(f'{counter=}')

def race_thread_pool(num_threads: int, simulate_delay: bool):
    data = prepare_data(num_threads)
    args = [(data[d], simulate_delay) for d in range(num_threads)]

    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        executor.map(change_counter, args)

    print(f'{counter=}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-n', '--threads_number', type=int, help='Number of threads.', default=1)
    parser.add_argument('-t', '--simulation_type', choices=['loop', 'pool'], help='The way of running threads: by thread pool or in a loop.', default='pool')
    parser.add_argument('-d', '--simulate_delay', action='store_true', help='Simulates delay in milliseconds.', default=False)
    args = parser.parse_args()

    if args.simulation_type == 'pool':
        func = race_thread_pool
    else:
        func = race_threads_manually

    func(args.threads_number, args.simulate_delay)

Evangelos Kostopoulos on May 21, 2024

with Python 3.11.5 I am always getting the expected result 0(zero) no matter how many threads I am using. Am I missing something - meaning has anything changed recently?

race(1)
0

race(1000)
0

race(500)  
0

Christopher Trudeau RP Team on May 21, 2024

Hi Evangelos,

Yes, I can confirm that it is no longer a race condition in Python 3.11. I dug around a bit and it looks like the problem existed in 3.9 and went away in 3.10.

I used the dis module to look at the bytecode created and 3.8 and 3.10 produce different bytecode. I also found a post on Reddit that talks about the fact that CPython introduced an optimization that changed which operations caused the GIL to be acquired and released.

For the sample code, the GIL isn’t being released as often, and as the GIL is held throughout the code, that ensures you no longer have a race condition.

Beyond the problem that this makes it a lousy example, unless you’re using Python 3.9 or earlier, there is another complication: the language spec does not require the compiler to behave this way. Some other optimization could possibly cause this code to have a race condition in it again in a future version of Python.

Furthermore, this is CPython’s way of doing it, if you switched to a different interpreter like PyPy, you might see different behaviour. This is one of the reasons race conditions are so tricky to find and debug.

The post on Reddit is quite detailed, in case you’re interested:

www.reddit.com/r/learnprogramming/comments/16mlz4h/comment/k198umz/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

And the original thread if you want to see what others said:

www.reddit.com/r/learnprogramming/comments/16mlz4h/race_condition_doesnt_happen_from_python_310/

Welcome to multi-threading, having fun yet? ;)

Michael Cheung on June 11, 2025

I used Python 3.12.6 and seems that race(n) always gives 0? Is that a problem of python version? If so, how can I reproduce it in my windows laptop?

Michael Cheung on June 11, 2025

Hi Christopher, I saw your previous reply right after my new post. Just would like to further ask how I can use a different version of python? I installed python by downloading it from www.python.org/downloads/

Christopher Trudeau RP Team on June 11, 2025

Hi Michael,

I’m not a Windows guy, so my instructions are going to be a bit vague. When you install a version of Python, you’ll get an interpreter named “python3” as well as one named with the whole version “python3.12”, for example. On Windows these may be in your path. If you run the fully qualified name you’ll get the different interpreter.

There are also tools out there to help you with this. If you’re using an IDE like VisualStudio, it should detect that there are multiple interpreters and let you select which one you want when you create a new project. You could also use tools like uv and pyenv to manage these things.

All that being said, the other thing you could do is look at the code in this tutorial and associated course:

realpython.com/python-thread-lock/

It creates a race condition that still happens in more recent versions of Python.

Hope that helps. …ct

Become a Member to join the conversation.