Demonstrating Race Conditions
00:01 In the previous lesson, I introduced you to the threading library in Python. In this lesson, I’m going to start a little side journey and show you what happens when race conditions complicate your concurrent program.
00:12 I have a little confession to make. I actually went to school for a lot of this work. My research in grad school was on concurrency, parallelism, and distributed computing.
00:21 It’s been many, many years since I did that, and I’ve forgotten a lot, but there’s one key thing I do remember. Concurrency is hard. You should always choose carefully whether or not it’s the right thing to do.
00:33 It introduces all sorts of complications. In an earlier lesson, I covered a list that included things like deadlocks and resource starvation that make coding concurrency more complicated.
00:44 Race conditions are related to deadlocks in that the behavior of the program changes depending on the order of execution. This is particularly noticeable when two threads share the same memory and objects.
00:56 The nature of threaded computation means you can’t predict what order the code is going to run in, and depending on how those shared objects are used, it can affect the outcome.
01:05
Locking mechanisms and threading.local in Python can help you, but you have to remember when to use them: when data needs to be kept local to a thread and when it doesn’t.
01:15
A surprising number of things that are common in the programmer’s toolbelt are not thread safe. You’ve already seen the example of the requests Session object. The requests library itself is not thread safe.
01:26
If you’re doing multi-threaded programming with the requests library, you need to make sure that you’ve got locks in place. Even something as fundamental as print() isn’t thread safe.
01:35
It doesn’t happen often, but it is possible for two different threads to print() at the same time. You can end up with a print buffer partially populated by the first thread, then the second thread, then the first thread again, which ends up being a mess on the screen.
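One common workaround is to funnel all printing through a single lock. This is just a sketch, and the safe_print helper name is my own, not from the lesson:

```python
import threading

# A module-level lock shared by everyone who wants to print.
print_lock = threading.Lock()

def safe_print(*args, **kwargs):
    # Serialize access to stdout so lines from different
    # threads can't interleave mid-line.
    with print_lock:
        print(*args, **kwargs)

def worker(n):
    safe_print(f"worker {n} done")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The order of the five lines still depends on scheduling, but each line comes out whole.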
01:51 If you’ve done any server-side web programming, you may have noticed that logging tends to be the way to go. Python’s logger is actually thread safe, and so multiple threads in a web server don’t cause this problem.
02:03 Unfortunately, it does this through locking, which slows things down, and this is the compromise you always have to make with concurrency: the potential speedup is there, but it often comes at the expense of thread safety.
02:16 The best thing to do is assume that any libraries you’re using are not thread safe unless they explicitly say that they are. The program I’m about to show you has a race condition in it.
02:26 It’s also what we in the business call a convoluted example. You’ll never see code that actually looks like this, but it does show off the fact that race conditions can happen.
02:35 Let’s go take a look at this overly simplified example.
02:40 Race conditions are heinous bugs. The kinds of bugs that often disappear when you’re looking at them. If things are timed correctly, the bug doesn’t happen.
02:48
If things aren’t timed correctly, the bug does happen. That’s why I’m importing sleep(). I want to make the bug more likely to happen, so you can see it.
02:57 Since the timing is related to how the operating system schedules your threads, and how long each thread is given to execute, race conditions can be brutal to try and track down.
03:07
The race condition that I’m introducing is around a shared resource. This is a pretty common cause. In this case, that shared resource is a simple counter. change_counter() is the entry point for each thread that I’m going to run.
03:21 It takes one argument, an amount to change the shared counter by. The shared resource is global, so it is shared. And here’s the convoluted part. Pretend that instead of just doing a local change to a single variable, that I was doing something more complicated like fetching a value from a server or a database.
03:40 Here I’ve fetched it and I’m keeping a local copy. Then I take a nap to simulate that this change process is slow. Then I update the local value and then send that new value up to the shared space.
03:54 This last call would be like me putting it back into the database or posting it to a server. These four lines are the heart of the race condition. Each thread grabs a local copy, edits it, then puts it back.
04:07 Say one of our threads is slow and another is fast. The slow thread grabs a copy, then goes to sleep. The fast thread grabs a copy, naps, updates its copy, and then puts it back.
04:19 When the slow thread finally catches up, it still has the original value it got from the server. It updates its local copy and sends it back, completely overwriting what the fast thread did.
04:31 This is the race: because these threads can run in any order, the result gets mucked up. Scheduling by the operating system removes the determinism from your code.
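The read-sleep-modify-write pattern just described can be sketched like this. It's a hypothetical reconstruction, not the lesson's exact code:

```python
import threading
import time

counter = 0  # the shared resource

def change_counter(amount):
    # Read-modify-write with a nap in the middle: the classic race.
    # The sleep stands in for a slow server or database round-trip.
    global counter
    local_counter = counter      # fetch the shared value
    time.sleep(0.01)             # simulate the slow remote step
    local_counter += amount      # update the local copy
    counter = local_counter      # push it back, possibly stale

slow = threading.Thread(target=change_counter, args=(1,))
fast = threading.Thread(target=change_counter, args=(-1,))
slow.start()
fast.start()
slow.join()
fast.join()
# Both threads usually read counter == 0 before either writes back,
# so the final value is typically 1 or -1 rather than the expected 0.
print(counter)
```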
04:45 The rest of this is just set up to get the threads going. This function takes one argument, which is how many threads to run. Here, I start out by initializing my shared resource.
04:56
I want the change_counter method to run 50 times. Half those times the change amount is plus one, the other half is minus one. The final result of our shared variable should be zero, as the total calculation nulls itself out.
05:10 This list comprehension contains the plus and minus ones that the workers will be invoked with.
05:16
Here I’m setting up a thread pool with the number of threads given by the function. Then mapping the plus and minus one data to those threads on the change_counter call.
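Putting the described pieces together, the whole program might look roughly like this. It's a sketch based on the narration; the names race and change_counter follow the lesson's description:

```python
import time
from concurrent.futures import ThreadPoolExecutor

counter = 0  # the shared resource

def change_counter(amount):
    global counter
    local_counter = counter    # fetch the shared value
    time.sleep(0.001)          # widen the race window
    local_counter += amount    # update the local copy
    counter = local_counter    # write back, possibly stale

def race(num_threads):
    global counter
    counter = 0
    # 25 workers add one, 25 subtract one; the total should be zero.
    data = [-1 if x % 2 else 1 for x in range(50)]
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        executor.map(change_counter, data)
    print(counter)

race(1)  # a single worker runs the calls sequentially: prints 0
```

With more workers the printed value is usually nonzero, because threads read stale copies of counter while others are sleeping.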
05:26 Then finally, I print out the result. Let’s open up the REPL and give this a go.
05:33 Importing the function, and now I’ll run it with a single thread.
05:39 Good. 25 plus and 25 minus gives us zero.
05:44 Now let’s get concurrent. Not good, not zero. The race has mucked with our end result. It’s even worse than that though. Let me do it again.
05:55 Not only is it wrong, it’s not consistent.
05:59
Because the scheduling is influencing the outcome, the race condition means not only incorrect, but inconsistently incorrect results. Using the sleep() in the code almost guarantees that the problem shows up, but let’s say the sleep() wasn’t there.
06:13 You still have a race condition, but sometimes it might give the right answer, which is why these kinds of bugs are so tricky. You might get correct results and then the race will fire and then you won’t.
06:25 But the GIL is a lock, so won’t it save me? Next up—nope.
Christopher Trudeau RP Team on Jan. 5, 2021
Hi hauntarl,
The race condition is caused by a combination of two things: the variable counter being global and shared across all threads, and the fact that the change_counter() method can be interrupted at any time.
The crux of the matter is the assumption that “counter += amount” can’t be interrupted. If you put a semaphore lock around “counter += amount” the code would likely work fine.
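For instance, guarding the update with a threading.Lock. This is a sketch of the suggested fix, not the course's code:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

counter = 0
counter_lock = threading.Lock()

def change_counter(amount):
    global counter
    # The lock makes the whole read-modify-write atomic with respect
    # to every other thread that acquires the same lock.
    with counter_lock:
        counter += amount

# 25 increments and 25 decrements, run across 8 worker threads.
data = [-1 if x % 2 else 1 for x in range(50)]
with ThreadPoolExecutor(max_workers=8) as executor:
    executor.map(change_counter, data)
print(counter)  # 0
```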
I suspect what is happening is that “counter += amount” breaks down into several different instructions in the underlying virtual machine and occasionally it is being interrupted midway through those instructions.
The code you added to try to show the problem happening is only recording the state outside of the problem area: after the increment/decrement is done. To truly instrument the problem you would have to be able to inspect the value of “counter” during the “counter += amount” statement.
The reason I used two large ranges (once in the “data” variable, the second in the for-loop in change_counter() ) was to increase the likelihood of just such an interruption. If you drop those numbers down to lower values you might have to run the program a large number of times before you run into a problem – this is what makes race conditions so brutal to catch in real life.
hauntarl on Jan. 6, 2021
Thanks for the wonderful explanation, I think I finally get it.
If we try to break down the statement counter += amount in a 2-threaded system:
- Evaluation of the expression counter + amount (read operation):
  thread1: counter = 0, amount = 1, evaluation results in 1
  thread2: counter = 0, amount = -1, evaluation results in -1
As there is no mutual exclusion, both the threads simultaneously read the same “counter” and are trying to add “amount” to it.
This is a race condition: neither thread waits for the other to update the counter value after its evaluation, so one of them ends up using the old counter value, depending on which thread is able to perform its write operation first.
- Assignment operation, counter = evaluated value (write operation):
  thread1: counter = 1
  thread2: counter = -1
Let us assume that “thread1” is able to perform the write operation before “thread2”: the new “counter” value becomes 1. Soon after, “thread2” performs its own write operation and writes -1 to the “counter”. Finally, the counter value results in -1, instead of 0 :)
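That interleaving can be replayed by hand in plain single-threaded code, simulating the schedule rather than using real threads:

```python
counter = 0

# Both threads read counter == 0 before either one writes back.
t1_value = counter + 1   # thread1 evaluates counter + amount -> 1
t2_value = counter - 1   # thread2 evaluates counter + amount -> -1

counter = t1_value       # thread1 writes first: counter == 1
counter = t2_value       # thread2 overwrites it: counter == -1

assert counter == -1     # not the expected 0
```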
Correct me if I am wrong and feel free to provide more input.
Christopher Trudeau RP Team on Jan. 6, 2021
Hi hauntarl,
Yep, you’ve got the right idea. The specifics are slightly different, but conceptually you’ve got it.
Remember that Python is an interpreted language. Statements in Python get translated into a series of steps called byte code, it is the byte code that actually gets run by the interpreter. Why do I bring this up? Because even things that look like single statements in Python aren’t necessarily atomic.
You can see the bytes that are run by the interpreter using the “compile()” built-in and the “dis” library:
>>> counter = 0
>>> amount = 1
>>> code = compile("counter += amount", "<string>", "exec")
>>> code
<code object <module> at 0x7fe81635ec90, file "<string>", line 1>
>>> import dis
>>> dis.disco(code)
1 0 LOAD_NAME 0 (counter)
2 LOAD_NAME 1 (amount)
4 INPLACE_ADD
6 STORE_NAME 0 (counter)
8 LOAD_CONST 0 (None)
10 RETURN_VALUE
Notice here that “counter += amount” turns into 6 separate byte code operations, and the thread interruption can happen during any of these steps.
The problem is the same as you described, just worse – there are five different places that this thing that looks like a single statement can get interrupted. This only gets worse for any larger chunk of code.
Satish on April 13, 2021
“The sum total of the multiplication factors is not the same as the synchronous addition, so you get a race condition.” This doesn’t intuitively explain why the race condition occurs, as already pointed out in earlier comments. Essentially:
- Multiple threads could be reading the same state of a particular variable (in this case, counter).
- Subsequently, out of the evaluations of the expression counter + amount being performed by the multiple threads active at that point in time, only one thread effectively ends up updating counter (overwriting the value assigned by the other threads). This in turn could lead to either a ‘deficit’ or a ‘surplus’ value of counter, depending on which threads (the ones adding -1 versus those adding +1) dominated the ‘final’ assignment of counter in each group of currently active threads.
In the synchronous version, mathematical equilibrium is maintained because each evaluation of counter += amount forms the basis for the next evaluation of that statement. But in the concurrent version that doesn’t happen: multiple threads could be using the same ‘base’ value to start their computation of counter + amount, and as a result most of the iterations do not contribute towards maintaining the equilibrium of counter.
Of course, this is an oversimplified explanation and not accurate (given the complexities already pointed out by the author in the comments).
hwoarang09 on May 30, 2022
I think Python uses the GIL, so only one thread executes at any one time. With the GIL applied, a race condition shouldn’t occur, but it does… Why does a race condition occur? (Sorry for my English)
Bartosz Zaczyński RP Team on May 30, 2022
@hwoarang09 The Global Interpreter Lock (GIL) ensures that only one CPython instruction can work at any given time, effectively making it an atomic operation that can’t be interrupted by another thread. However, in many cases, you’ll want to perform a transaction comprised of more than one instruction, which should all run as an atomic operation. Without explicit locking and mutexes, you can still experience a race condition.
marcinszydlowski1984 on Aug. 3, 2022
However, the situation can be easily observed if the counter is stored or cached locally in the context of a specific thread and updated only at the end of the computation. Also, the bigger the simulated delay, the bigger the chance of encountering corrupted data.
Here are some modifications of the code
The example from this video, only with the counter cached locally (using local_counter):
    # io_bound/race.py
    from concurrent.futures import ThreadPoolExecutor

    counter = 0

    def change_counter(amount):
        global counter
        local_counter = counter
        for _ in range(10000):
            local_counter += amount
        counter = local_counter

    def race(num_threads):
        global counter
        counter = 0
        data = [-1 if x % 2 else 1 for x in range(1000)]
        with ThreadPoolExecutor(max_workers=num_threads) as executor:
            executor.map(change_counter, data)
        print(counter)
A more advanced example for both the “manual” and the “ThreadPool” way of creating and running threads, also taking delay simulation as a parameter:
    import argparse
    import threading
    import random
    import time
    from typing import List, Tuple
    from concurrent.futures import ThreadPoolExecutor

    counter = 0

    def change_counter(data: Tuple[List[int], bool]):
        global counter
        for d in data[0]:
            local_counter = counter
            local_counter += d
            print(data[0], data[1])
            if data[1]:
                time.sleep(random.random())
            counter = local_counter
            print(f'{counter=} ({threading.current_thread().name})')

    def prepare_data(num_threads: int) -> List[List[int]]:
        """
        Gets sample data for threads. Each thread has its own array of
        three integers multiplied by index + 1.

        Example:
            num_threads = 2, data = [[1, 2, 3], [2, 4, 6]]
            num_threads = 3, data = [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

        :param num_threads: number of threads
        :return: array of 3-number arrays for each thread
        """
        data = [[]] * num_threads
        for n in range(num_threads):
            data[n] = [(i + 1) * (n + 1) for i in range(3)]
        print(data)
        return data

    def race_threads_manually(num_threads: int, simulate_delay: bool):
        data = prepare_data(num_threads)
        # Creating and running threads
        T = [None] * num_threads
        for it in range(len(T)):
            T[it] = threading.Thread(
                target=change_counter,
                args=[(data[it], simulate_delay)],
                name=f'Thread {it + 1}',
            )
            T[it].start()
        # Waiting for all to finish
        for it in range(len(T)):
            T[it].join()
        print(f'{counter=}')

    def race_thread_pool(num_threads: int, simulate_delay: bool):
        data = prepare_data(num_threads)
        args = [(data[d], simulate_delay) for d in range(num_threads)]
        with ThreadPoolExecutor(max_workers=num_threads) as executor:
            executor.map(change_counter, args)
        print(f'{counter=}')

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('-n', '--threads_number', type=int, help='Number of threads.', default=1)
        parser.add_argument('-t', '--simulation_type', choices=['loop', 'pool'], help='The way of running threads: by thread pool or in a loop.', default='pool')
        parser.add_argument('-d', '--simulate_delay', action='store_true', help='Simulates delay in milliseconds.', default=False)
        args = parser.parse_args()
        if args.simulation_type == 'pool':
            func = race_thread_pool
        else:
            func = race_threads_manually
        func(args.threads_number, args.simulate_delay)
Evangelos Kostopoulos on May 21, 2024
With Python 3.11.5 I am always getting the expected result, 0 (zero), no matter how many threads I am using. Am I missing something? Meaning, has anything changed recently?
>>> race(1)
0
>>> race(1000)
0
>>> race(500)
0
Christopher Trudeau RP Team on May 21, 2024
Hi Evangelos,
Yes, I can confirm that it is no longer a race condition in Python 3.11. I dug around a bit and it looks like the problem existed in 3.9 and went away in 3.10.
I used the dis module to look at the bytecode created and 3.8 and 3.10 produce different bytecode. I also found a post on Reddit that talks about the fact that CPython introduced an optimization that changed which operations caused the GIL to be acquired and released.
For the sample code, the GIL isn’t being released as often, and as the GIL is held throughout the code, that ensures you no longer have a race condition.
Beyond the problem that this makes it a lousy example unless you’re using Python 3.9 or earlier, there is another complication: the language spec does not require the compiler to behave this way. Some other optimization could cause this code to have a race condition in it again in a future version of Python.
Furthermore, this is CPython’s way of doing it; if you switched to a different interpreter like PyPy, you might see different behaviour. This is one of the reasons race conditions are so tricky to find and debug.
The post on Reddit is quite detailed, in case you’re interested:
And the original thread if you want to see what others said:
www.reddit.com/r/learnprogramming/comments/16mlz4h/race_condition_doesnt_happen_from_python_310/
Welcome to multi-threading, having fun yet? ;)
Michael Cheung on June 11, 2025
I used Python 3.12.6 and it seems that race(n) always gives 0. Is that because of the Python version? If so, how can I reproduce it on my Windows laptop?
Michael Cheung on June 11, 2025
Hi Christopher, I saw your previous reply right after my new post. Just would like to further ask how I can use a different version of python? I installed python by downloading it from www.python.org/downloads/
Christopher Trudeau RP Team on June 11, 2025
Hi Michael,
I’m not a Windows guy, so my instructions are going to be a bit vague. When you install a version of Python, you’ll get an interpreter named “python3” as well as one named with the whole version “python3.12”, for example. On Windows these may be in your path. If you run the fully qualified name you’ll get the different interpreter.
There are also tools out there to help you with this. If you’re using an IDE like VisualStudio, it should detect that there are multiple interpreters and let you select which one you want when you create a new project. You could also use tools like uv and pyenv to manage these things.
All that being said, the other thing you could do is look at the code in this tutorial and associated course:
realpython.com/python-thread-lock/
It creates a race condition that still happens in more recent versions of Python.
Hope that helps. …ct
hauntarl on Jan. 5, 2021
I tried to keep track of each thread, what amount it is using and how many iterations it is doing with that amount.
From the above output, it is clearly visible that each thread is doing a different number of iterations for different amount values, but if we take the delta between the iterations done for amount = 1 and amount = -1, it comes out as zero. Yet the counter value comes out as some arbitrary number instead of 0.
I am not able to make sense of this output :(
Also I am finding it difficult to walk through the program and understand exactly how the race condition is occurring.