CPython's Memory Management

How Python Manages Memory Austin Cepalia 04:19


00:00 You’ve learned why memory management is important and what process is responsible for doing it. You’ve also seen how the GIL acts as a safeguard to ensure memory is not mishandled due to multithreaded race conditions.

00:15 The rest of this course will focus on how CPython utilizes memory when our program needs it. Before we do that, some vocabulary. I’ve used some of this throughout the course, but I want to make it very clear what I’m talking about here. To allocate memory is to make the memory accessible.

00:33 Your operating system might allocate more memory to a Python process if it needs to store a lot of variables. To free memory is to mark it as no longer in use, so the data stored there is discarded and the space can be reused.

00:44 Some programs might give the memory back to the operating system to manage, but CPython tends to hold onto it, as you’ll see later. You may also hear the term release memory.

00:56 CPython never sees your actual computer memory. Instead, your operating system creates what is called virtual memory, which is used by processes running on your computer.

01:08 This is an abstraction layer that allows the OS to decide how much memory each process should get. Here’s what that looks like for a currently running Python program.

01:18 This entire block of memory is the virtual memory carved out by the operating system. The CPython process running the program was allocated this amount of memory right here.

01:30 Everything else is not accessible to CPython. Part of CPython’s allocated memory is used for internal purposes that have nothing to do with your code specifically. The other part is object-specific memory.

01:44 Python knows how to allocate memory for common types like ints or dictionaries.

01:50 However, collection types like the list or the dictionary can contain objects of any type. For that, CPython has an object allocator that is responsible for allocating memory within the object memory area.

02:06 This object allocator gets called every time a new object needs space allocated. Because most memory allocation involves dealing with small amounts of data—for example, changing a string variable or adding a new integer to a list—it’s tuned to work well with small chunks of data.

02:25 It tries not to allocate more memory unless it’s absolutely required.
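To get a feel for how small typical objects are, the sketch below uses sys.getsizeof to print the size of a few everyday objects. The 512-byte limit is an assumption taken from CPython’s source rather than from this lesson, and the exact sizes vary by Python version and platform.

```python
import sys

# Assumption: CPython's small-object allocator handles requests of up to
# 512 bytes, and larger requests go to the system allocator. This limit is
# not stated in the lesson and may differ between builds.
SMALL_REQUEST_THRESHOLD = 512

samples = [1, 3.14, "hello", [], {}, (1, 2, 3)]

# Size in bytes of each object itself (not of the objects it refers to).
sizes = {repr(obj): sys.getsizeof(obj) for obj in samples}

for name, size in sizes.items():
    kind = "small" if size <= SMALL_REQUEST_THRESHOLD else "large"
    print(f"{name:>10}: {size:>3} bytes ({kind})")
```

On a typical CPython build, every one of these objects falls well under the assumed limit.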

02:31 CPython’s memory allocation strategy involves three main parts: arenas, pools, and blocks. I’ll use the book analogy to explain these. At the highest level, we have the arena.

02:45 Arenas are the largest fixed-size chunks of memory, and each one is aligned on a page boundary. A page boundary is the edge of a fixed-length contiguous chunk of memory that the OS uses. Each arena is 256 kilobytes, which is a whole multiple of the 4-kilobyte page size that CPython assumes. The arena is like the page you see when you open the book.

03:11 Arenas are made up of pools, where each pool is 4 kilobytes in size. Inside a pool are blocks, which can be allocated to store data. Pools each have a size class, which states how big each block inside can be. For example, this pool on the left has eight blocks, each of which stores roughly 500 bytes of memory.

03:35 The pool on the right is of a higher size class, meaning it stores fewer blocks but of a larger size.
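The arithmetic behind these numbers can be sketched in a few lines. This is a simplified model that uses only the sizes mentioned in the lesson. It assumes a pool is divided evenly into blocks and ignores any bookkeeping space CPython reserves inside a pool, so real pools may hold slightly fewer blocks. The function names are made up for illustration.

```python
ARENA_SIZE = 256 * 1024  # 256 KB per arena, as stated in the lesson
POOL_SIZE = 4 * 1024     # 4 KB per pool, as stated in the lesson


def pools_per_arena() -> int:
    """Number of fixed-size pools that fit in one arena."""
    return ARENA_SIZE // POOL_SIZE


def blocks_per_pool(block_size: int) -> int:
    """Upper bound on the blocks of a given size class in one pool.

    Simplification: any space used for pool bookkeeping is ignored.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return POOL_SIZE // block_size


print(pools_per_arena())     # 64
print(blocks_per_pool(512))  # 8
print(blocks_per_pool(8))    # 512
```

A higher size class means a larger divisor, which is why the pool on the right holds fewer blocks.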

03:44 Which pool is used for allocation depends on how big the memory request is. This chart sums it up nicely. You can see there are 8-byte pools, 16-byte pools, 24-byte pools, and so on.

03:58 If a request for between 1 and 8 bytes of memory comes in, a block in the 8-byte pool will be allocated. If a 36-byte request comes in, a 40-byte block will be allocated. Next, we’ll take a look at how each of these components works in more detail, starting with pools.
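The rounding rule described above can be sketched as a small function. It assumes size classes step by 8 bytes, as in the chart from the lesson. Some CPython builds may use a larger step, and the name block_size_for is made up for illustration.

```python
ALIGNMENT = 8  # assumption: size classes are multiples of 8 bytes


def block_size_for(request: int) -> int:
    """Round a request in bytes up to the nearest size class."""
    if request < 1:
        raise ValueError("request must be at least 1 byte")
    # Integer ceiling division, then scale back up to bytes.
    return (request + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


for request in (1, 8, 9, 36):
    print(f"{request:>2} bytes requested -> {block_size_for(request)}-byte block")
```

A request that is already a multiple of 8 maps to a block of exactly that size, and anything else is rounded up.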

utkarshsteve on Dec. 2, 2021

At 02:06: “This object allocator gets called every time a new object needs space allocated. Because most memory allocation involves dealing with small amounts of data (for example, changing a string variable or adding a new integer to a list), it’s tuned to work well with small chunks of data.” Does “changing a string variable” mean performing some operation on the string that results in a new string being created in memory? For example:

user = "Utkarsh"
user_1 = "steve"
user = user + user_1

So the memory location for

user = "Utkarsh"

and

user = user + user_1

will be different?

Bartosz Zaczyński RP Team on Dec. 3, 2021

@utkarshsteve Strings are immutable in Python, which means that whenever you try to modify one, you’re in fact creating a brand new string. Concatenation is an example of such an operation: it produces a new string object. You can confirm this by checking the identities of the objects referenced by your variables, which in CPython correspond to their memory addresses:

>>> user is user_1
False

>>> id(user)
140487553583216

>>> id(user_1)
140487554952944
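The session above compares two different variables. A more direct check, sketched here, compares the object bound to user before and after the concatenation. The extra name original is hypothetical (it isn’t in the question) and is only there to keep the first string alive so the two objects can be compared:

```python
user = "Utkarsh"
user_1 = "steve"

original = user          # hypothetical extra name; keeps the first string alive
user = user + user_1     # builds a new string object and rebinds the name

print(user)              # Utkarshsteve
print(original)          # Utkarsh
print(user is original)  # False
```

The name user now points at a different object, while the original string is unchanged.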
