Choose Your Data Alignment
00:00 Choose Your Data Alignment. Although each border pattern can be defined by as little as four bits, because there are sixteen different combinations of border sides, you’ll need to use a whole eight-bit byte to represent a border.
00:14 That’s because a single byte is the smallest unit of data that you can read from or write to a file. It’s the same story with your square’s role, which would require only three bits to describe the seven unique values.
00:26 In total, you really only need seven bits per square to faithfully represent all possible combinations of borders and roles in your maze. Now, it’s entirely up to you how you want to align data comprising the square’s border and role. If you don’t mind being a little wasteful, then one way would be to keep those two pieces of information as a binary word made up of two consecutive bytes. However, that would effectively allocate more than twice the storage space necessary.
00:55 A far better option is to fit both numbers, which correspond to borders and roles, on a single byte. You can do this by shifting the bits of one number a few places to the left and computing its bitwise union (bitwise OR) with the other number.
01:23 Here, the role number’s three bits are followed by the four bits of the border number.
01:34 When you look at the resulting bit pattern from its right edge, you see that the first four bits are the same as the bits of the border. The next three bits to the left come from the role. Finally, the most significant bit remains unused, so its value will always be zero.
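The packing step described above can be sketched with plain integers. The two sample values below are arbitrary and chosen only for illustration; they are not necessarily the ones shown on-screen.

```python
# Arbitrary sample values, chosen only for illustration.
border = 0b1010  # four bits, one per border side
role = 0b011     # three bits, enough for seven distinct roles

# Shift the role four places to the left to make room for the border,
# then merge the two numbers with the bitwise OR operator.
square = (role << 4) | border

print(f"{square:08b}")  # 00111010
```

Reading the printed pattern from the right, the first four bits are the border, the next three are the role, and the leftmost bit stays at zero.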
01:55 Despite using plain integers in this demonstration, you’ll get an identical result when you replace them with your Border and Role enumeration members.
02:32 One crucial difference is that you’ll have to explicitly refer to the .value attribute of the border or convert the border to a plain integer with int().
02:42 Otherwise, Python would interpret this last line of code as an attempt to make a compound Border member instead of an integer number. The underlying enum.IntFlag data type overrides the bitwise OR operator to mean something other than its usual definition.
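Here is a minimal sketch of that difference. The Border and Role classes below are hypothetical stand-ins for the ones built earlier in the course: the member names and values are assumptions, and Role is assumed to be an enum.IntEnum.

```python
from enum import IntEnum, IntFlag

# Hypothetical stand-ins; member names and values are assumptions.
class Border(IntFlag):
    EMPTY = 0
    TOP = 1
    BOTTOM = 2
    LEFT = 4
    RIGHT = 8

class Role(IntEnum):
    NONE = 0
    ENEMY = 1
    ENTRANCE = 2
    EXIT = 3
    EXTERIOR = 4
    REWARD = 5
    WALL = 6

border = Border.TOP | Border.RIGHT
role = Role.EXIT

# Without .value, IntFlag's version of | wraps the result in Border.
compound = (role << 4) | border
print(isinstance(compound, Border))  # True

# With .value (or int(border)), both operands are plain integers.
square = (role << 4) | border.value
print(type(square) is int, square)  # True 57
```

Both expressions carry the same numeric value, but only the second one gives you a plain int.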
03:02 The combined border and role is your square value, which you’ll keep in the array of numbers in the file body. You can decipher the border by superimposing a bitmask on top of the square value to isolate specific bits.
03:17 To get the role back, you’ll use the bitwise right-shift operator (>>), which brings the bits shifted earlier back to their original position.
03:32 Here the bitmask lets you extract the four least significant bits from a number while disregarding the bits to the left.
03:52 Note that you can express the same bitmask in different number systems. It’s customary to use the hexadecimal system because it’s usually the most compact. You can see the binary, hexadecimal, and decimal equivalents of the mask on-screen.
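The decoding steps can be sketched as follows, again with plain integers. The sample square value is arbitrary and serves only as an illustration.

```python
# Arbitrary sample: role 0b011 stored to the left of border 0b1010.
square = 0b00111010

# The same four-bit mask written in binary, hexadecimal, and decimal.
assert 0b1111 == 0xF == 15

border = square & 0xF  # keep only the four least significant bits
role = square >> 4     # move the role bits back to the right edge

print(f"{border:04b} {role:03b}")  # 1010 011
```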
04:09 Finally, you can pass the extracted numeric values to the relevant class constructors of your enumeration types.
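As a rough sketch of that last step, the extracted numbers can be handed straight to the enumeration classes. The Border and Role definitions below are hypothetical stand-ins with assumed member names and values, and the square value is an arbitrary sample.

```python
from enum import IntEnum, IntFlag

# Hypothetical stand-ins; member names and values are assumptions.
class Border(IntFlag):
    EMPTY = 0
    TOP = 1
    BOTTOM = 2
    LEFT = 4
    RIGHT = 8

class Role(IntEnum):
    NONE = 0
    ENEMY = 1
    ENTRANCE = 2
    EXIT = 3
    EXTERIOR = 4
    REWARD = 5
    WALL = 6

square = 57  # arbitrary sample value read back from a file body

# Calling an enum class with a number looks up the matching member.
border = Border(square & 0xF)
role = Role(square >> 4)

print(border == Border.TOP | Border.RIGHT)  # True
print(role is Role.EXIT)                    # True
```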
04:25 You’ve just defined a custom binary file format that will allow you to share your mazes with friends or keep them on disk for yourself. As you can see, there’s a lot to consider when designing such a format.
04:37 Fortunately, all of that is behind you now, and it’s time to start digging into the implementation. Make a new Python package called persistence/ with the two files seen on-screen inside it.
04:50 The file_format module will contain your file header and its body, while the serializer module will provide the loading and saving routines.
05:01 Next, you’ll need to define Python classes to represent the file header and body, and that’s what you’ll be doing in the next part of the course.