Accessing Shared Memory
00:00 In the previous lesson, I showed you how to manipulate the bytes inside of a memory-mapped block. In this lesson, you’ll learn how to share memory between two operating system processes.
00:11 By default, each of your processes is given a private chunk of memory for its own use by the operating system. If you want to share memory between processes, you can make a call to the OS to get this done.
00:24 mmap is one of the ways to do this. The call to create the block is similar to what you’ve seen so far, but this time you use -1 for the file number, and you’ll get an anonymous block that isn’t associated with a file.
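As a minimal sketch of that call (not the code shown on screen in the lesson), the snippet below creates an anonymous block by passing -1 as the file number. The 100-byte length and the names `block`, `first_five`, and `size` are illustrative assumptions.

```python
import mmap

# Passing -1 as the file number asks for an anonymous block with no
# backing file. The 100-byte length is an illustrative choice.
block = mmap.mmap(-1, 100)

# The block supports slice assignment, as long as the length is unchanged.
block[:5] = b"hello"

first_five = block[:5]   # slicing an mmap object returns bytes
size = len(block)

print(first_five, size)
block.close()
```

Nothing here touches the file system: the block lives only as long as a process holds it open.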
00:38 Let’s go see this in practice. In the top window here, I have some code that creates two processes. The key to this is the call to the os.fork() function on line 8.
00:51 When the code gets to this point, the operating system creates an exact copy of the running process, and both copies continue to execute from that point in the code.
01:02 That’d be where line 8 returns. os.fork() returns 0 in the child process, while the parent process gets back the child’s process ID, which is assigned by the operating system. The if … else block starting on line 10 then changes the behavior of the program based on that return value.
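Here is a small sketch of branching on the value that os.fork() hands back. It is not the lesson's on-screen code, and it assumes a Unix-like system, since os.fork() isn't available on Windows. The helper name `fork_roles` is hypothetical. In the child the call returns 0, and in the parent it returns the child's process ID.

```python
import os

def fork_roles():
    """Fork once and hand back what the parent learns about its child."""
    pid = os.fork()
    if pid == 0:
        # Child branch: os.fork() returned 0 here.
        # os._exit() leaves immediately without running the parent's cleanup.
        os._exit(0)
    # Parent branch: pid holds the child's process ID.
    finished_pid, status = os.waitpid(pid, 0)  # wait for the child to exit
    return pid, finished_pid, status

child_pid, finished_pid, status = fork_roles()
print(child_pid, finished_pid)
```

Waiting with os.waitpid() in the parent keeps the two processes from racing each other, which is the same job the sleep does in the lesson's code.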
01:21 The parent will only run lines 11 through 13, while the child will only run lines 15 through 17. As I have multiple CPUs, it is possible that these two processes will run at the exact same time.
01:35 This means you can get into all sorts of trouble with a shared resource like the screen. As this isn’t a course on race conditions, I’ve put a two-second delay on line 15 to ensure that the child process doesn’t do anything before the parent process is done. Okay, so you’ve got two processes.
01:54 The part I skipped at the top on line 6 is how you get a shared block of memory. The mmap object created on line 6 is a shared memory block a hundred bytes long that can be written to. As the child process gets a complete copy of the parent’s environment, both the parent and the child will have a reference to this shared block of memory. The parent process will print out the contents of the shared block, then write 100 "a"s to the block.
02:23 The child process prints out what it sees in the block.
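The sketch below is a variation on that idea rather than a copy of the lesson's code. Here the child does the writing and the parent waits for it with os.waitpid() instead of sleeping, which keeps the example deterministic. It assumes a Unix-like system, where an anonymous mmap block is shared across a fork by default. The helper name `share_block` is hypothetical.

```python
import mmap
import os

def share_block(size=100):
    """Let a forked child write into an anonymous block the parent reads."""
    block = mmap.mmap(-1, size)   # anonymous block, created before the fork
    before = block[:]             # snapshot of the contents before forking

    pid = os.fork()
    if pid == 0:
        # Child: fill the block, then exit without running parent cleanup.
        block[:] = b"a" * size
        os._exit(0)

    os.waitpid(pid, 0)            # parent: wait until the child has exited
    after = block[:]              # read what the child left behind
    block.close()
    return before, after

before, after = share_block()
print(before[:8], after[:8])
```

The parent never writes to the block itself, so anything it reads back after the fork must have come from the child.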
02:28 You ready for this? Let’s run it.
02:33 There’s the message from the parent process. And there is the child process. Note that the block started out initialized with zeros. In olden times, depending on the OS, this wasn’t guaranteed, and that could cause some security issues, as you might be getting a peek at someone else’s data.
02:50 I’m pretty sure this is no longer an issue, but you might want to double check on your own system before making the assumption.
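If you want to do that double check, a quick sketch like this one compares a fresh anonymous block against the same number of zero bytes. The 100-byte size and the name `all_zero` are illustrative assumptions, and a single run on one machine is an observation rather than a guarantee.

```python
import mmap

# Create a fresh anonymous block and compare it with 100 zero bytes.
block = mmap.mmap(-1, 100)
all_zero = block[:] == bytes(100)   # bytes(100) is 100 zero bytes
block.close()

print("fresh block is all zeros:", all_zero)
```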
02:57 After printing the one hundred bytes of 0, the parent process writes the "a"s. After its two-second nap, the child process wakes up, announces that it is process 60127, and then prints out the current contents of the block: aaaaaaa!
03:14 Suddenly I’m expecting a tongue depressor.
03:20 There’s more than one way of getting shared memory in Python. Let’s discuss the pros and cons of using mmap. Starting with the bad news: you pretty much have to use the os.fork() function that I just showed you if you want to use mmap to share memory.
03:34 There are higher-level libraries that make process management easier, but they don’t work with mmap, so you’re stuck with this one. On the other hand, those same higher-level libraries come with their own handcuffs.
03:46 The multiprocessing library is much more Pythonic than fork, but it also expects that everything you are sharing is serializable using Python’s pickle methods. Depending on your data, that might be a problem.
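To get a feel for that pickle restriction, here is a rough sketch that simply tries to pickle a few objects. The helper name `can_pickle` and the sample objects are illustrative assumptions, not part of the multiprocessing API.

```python
import mmap
import pickle

def can_pickle(obj):
    """Report whether pickle.dumps() accepts the object."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True

block = mmap.mmap(-1, 10)

dict_ok = can_pickle({"count": 3})          # plain data
lambda_ok = can_pickle(lambda x: x + 1)     # anonymous function
mmap_ok = can_pickle(block)                 # the mmap object itself
block.close()

print(dict_ok, lambda_ok, mmap_ok)
```

Running a check like this on your own data is a cheap way to find out whether the pickle requirement is going to get in your way.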
03:59 So you can always go the old style: use os.fork() and mmap to share if that restriction is problematic.
04:09 Well, those are the highlights of mmap. Join me in the last lesson, where I summarize the course and point you at some places where you can learn some more.