Binary Arrays

Dictionaries and Arrays: Selecting the Ideal Data Structure Christopher Trudeau 04:47

00:00 In the previous lesson, I spoke about typed arrays and strings. In this lesson, I’ll be showing you how to deal with binary data using binary arrays. Python provides two array types for dealing with binary data, the first of which is called bytes.

00:17 A bytes object is an immutable sequence of integers in the range of 0 to 255. That’s the capacity of a single byte. The mutable counterpart of bytes is called bytearray.

00:30 This is similar to the concept of using strings and lists of strings shown in the previous lesson. The immutable bytes type is like a string, and the bytearray is like the list.

00:42 You can convert back and forth between the bytearray and bytes types, but it can be a slow process for large objects because you’re actually copying each individual byte stored within the object.
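The relationship between the two types can be sketched as follows. This is a minimal illustration, and the variable names are made up for the example rather than taken from the lesson.

```python
# Illustrative names; none of these appear in the lesson.
frozen = bytes((72, 105))        # immutable sequence of ints in 0..255
editable = bytearray(frozen)     # mutable copy holding the same byte values

editable[0] = 104                # change the copy only
print(frozen)                    # b'Hi' (the original is untouched)
print(editable)                  # bytearray(b'hi')

round_trip = bytes(editable)     # converting back copies the bytes again
print(round_trip == editable)    # True: same contents
print(round_trip is frozen)      # False: a separate object
```

Because each conversion produces an independent copy, changes made to the bytearray never show up in a bytes object created from it earlier.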

00:54 Let me start by creating a bytes object.

01:03 This represents the first 10 bytes of a GIF image (or “JIF”, depending on your pronunciation). Notice that I’ve passed in a tuple to the bytes object, specifying the values.

01:16 I’ve specified the values in hex, because that’s a common format to use when you’re dealing with binary data, but anything that translates into an integer between 0 and 255 would work.

01:27 I can use subscripts to get at a specific byte within the bytes object. Here’s the byte at position 3, which is the fourth item, 0x38. When printed out, it’s shown in decimal as 56.
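The creation and subscripting steps might look roughly like this. The first six values spell GIF89a as described in the transcript; the last four bytes are an assumption for illustration (a 1x1 logical screen size), since the transcript does not list them.

```python
# First six values spell "GIF89a". The last four are assumed for this
# sketch: a 1x1 logical screen size (width, then height, little-endian).
gif_header = bytes((0x47, 0x49, 0x46, 0x38, 0x39, 0x61,
                    0x01, 0x00, 0x01, 0x00))

print(len(gif_header))      # 10
print(gif_header[3])        # 56, the decimal form of 0x38
print(hex(gif_header[3]))   # 0x38
print(chr(gif_header[3]))   # 8, the ASCII character with code 56
```

Subscripting a bytes object returns a plain int, which is why the REPL shows 56 rather than 0x38.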

01:43 The gif_header object implements the .__repr__() method, so you can see a representation of it inside of the REPL. The b in front of the quote tells you that this is a bytes literal. GIF89a appears because the first six bytes are within the printable ASCII range, so Python prints those as ASCII characters rather than as hexadecimal.

02:07 The remaining bytes are not printable characters in ASCII (00 and 01 aren’t printable ASCII), so Python shows them as hexadecimal. As I mentioned before, a byte inside of the bytes object has to be in the range of 0 to 255.
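A short sketch of how the representation mixes ASCII and hex escapes. As before, the four trailing byte values are assumed for illustration.

```python
# Trailing four bytes are assumed values, not taken from the transcript.
gif_header = bytes((0x47, 0x49, 0x46, 0x38, 0x39, 0x61,
                    0x01, 0x00, 0x01, 0x00))

print(repr(gif_header))   # b'GIF89a\x01\x00\x01\x00'
print(gif_header[:6])     # b'GIF89a' (printable ASCII only)
print(gif_header.hex())   # 47494638396101000100
```

The .hex() method is handy when you would rather see every byte in hexadecimal, including the printable ones.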

02:25 If you try to use a value larger than that, you’ll get a ValueError. The error message here says range(0, 256). That’s because Python’s range() excludes its upper bound, so range(0, 256) covers 0 through 255, and 256 is outside the range. bytes objects are immutable, so if you try to assign to one of their items, you’ll get a TypeError.
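Both failure modes can be reproduced in a few lines. This is a sketch with illustrative names (header, caught); item deletion is included as well because it fails the same way as assignment.

```python
caught = []  # names of the exceptions raised, collected for inspection

try:
    bytes((0, 255, 256))          # 256 does not fit in a single byte
except ValueError as error:
    caught.append(type(error).__name__)
    print(error)                  # the message mentions range(0, 256)

header = bytes((0x47, 0x49, 0x46))
try:
    header[0] = 0x4A              # item assignment on an immutable object
except TypeError as error:
    caught.append(type(error).__name__)
    print(error)

try:
    del header[0]                 # item deletion is rejected as well
except TypeError as error:
    caught.append(type(error).__name__)
    print(error)

print(caught)                     # ['ValueError', 'TypeError', 'TypeError']
```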

02:53 You’ll likewise get an error if you attempt to delete an item. The bytearray object is the mutable version of a byte representation. Let me create a new bytearray.

03:11 And this time, at least there’s no argument over how to pronounce it. This is the JPG header, the first 11 bytes of the binary representation of a JPG image.

03:22 bytearray also implements .__repr__(), so you can look at the contents. Just like the bytes object, the ASCII values are printed in ASCII and everything else is shown in hex.
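A sketch of what creating and inspecting the bytearray could look like. The byte values are an assumption based on the common JFIF-style start of a JPG file; they are consistent with the transcript (11 bytes, \x10 just before the J, a null byte at the end), but the transcript does not list them in full.

```python
# Assumed values: the common JFIF-style opening bytes of a JPG file.
jpg_header = bytearray((0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10,
                        0x4A, 0x46, 0x49, 0x46, 0x00))

print(len(jpg_header))  # 11
print(jpg_header)       # bytearray(b'\xff\xd8\xff\xe0\x00\x10JFIF\x00')
```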

03:35 Subscripting works. And because it’s a bytearray and it’s mutable, you can actually change these values.

03:48 The position just before the capital J has changed from \x10 to \x1f.
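The in-place change might look like this. Index 5 is inferred from the description (the byte just before the capital J), and the starting values are the assumed JFIF-style header.

```python
# Assumed starting values; index 5 is the byte just before b"J".
jpg_header = bytearray(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00")

print(jpg_header[5])    # 16, the decimal form of 0x10
jpg_header[5] = 0x1F    # allowed, because bytearray is mutable
print(jpg_header)       # bytearray(b'\xff\xd8\xff\xe0\x00\x1fJFIF\x00')
```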

03:57 I can delete an item.

04:04 And I can also append. I’ve put that null character back on the end of this bytearray. Just like the bytes object, there’s a limit.
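Deleting and appending could be sketched as below. It assumes the deleted item was the trailing null byte, which is what the remark about putting it back suggests; the header values are the same assumed ones.

```python
# Assumed starting values, as in a JFIF-style JPG header.
jpg_header = bytearray(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00")

del jpg_header[-1]        # remove the trailing null byte
print(jpg_header)         # bytearray(b'\xff\xd8\xff\xe0\x00\x10JFIF')
print(len(jpg_header))    # 10

jpg_header.append(0x00)   # put the null byte back on the end
print(jpg_header)         # bytearray(b'\xff\xd8\xff\xe0\x00\x10JFIF\x00')
print(len(jpg_header))    # 11
```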

04:14 The value of each byte has to be in the range of 0 to 255.

04:22 Trying to append 256 results in a ValueError. I can convert my bytearray object to a bytes object by simply constructing a bytes object and passing in the bytearray.
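Both steps, the out-of-range append and the conversion, in one small sketch. The name frozen_header is illustrative, and the header values are the assumed JFIF-style bytes.

```python
# Assumed starting values; frozen_header is an illustrative name.
jpg_header = bytearray(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00")

try:
    jpg_header.append(256)             # too large for a single byte
except ValueError as error:
    print(error)                       # the message mentions range(0, 256)

print(len(jpg_header))                 # 11, the failed append changed nothing

frozen_header = bytes(jpg_header)      # immutable copy of the current contents
print(type(frozen_header).__name__)    # bytes
print(frozen_header == jpg_header)     # True
```

Going the other way works the same: passing a bytes object to bytearray() gives you a mutable copy.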

04:38 In the next lesson, I’ll wrap up and give you a summary of the array structures and how to select from the choices for your own code.
