Typed Arrays and Strings
Python provides a typed array in the standard library’s array module, called array.array. Because it’s typed, you have to specify what kind of data the array stores when you construct it. The list type, although it’s an array, isn’t very densely packed, because any item in the list can be of a different type and therefore a different size. The array.array is typed, so its sizing is very predictable and memory efficient.
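As a minimal sketch of what that looks like in practice, the snippet below builds an array.array of signed integers and shows that a value of the wrong type is rejected. The variable names and the 1,000-element size comparison are only illustrative, and the list figure is a rough estimate rather than an exact measurement.

```python
import sys
from array import array

# 'i' is the type code for a signed integer.
numbers = array("i", [1, 2, 3, 4])
numbers.append(5)            # accepted: 5 is an int

try:
    numbers.append(1.5)      # rejected: 1.5 is not an integer
except TypeError as error:
    print("rejected:", error)

# Rough size comparison for 1,000 integers. The list figure adds the size of
# each int object to the list's own pointer storage, so treat it as an estimate.
values = list(range(1000))
packed = array("i", values)
list_estimate = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
print(sys.getsizeof(packed), "bytes as an array")
print(list_estimate, "bytes as a list (estimate)")
```

The exact byte counts depend on your platform and Python version, so no expected output is shown for the sizes.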
'u' is a Unicode character. Notice that the size column here is the minimum size in bytes. This is because, depending on what platform you’re running on, the number of bytes needed to represent certain types may be larger. 32-bit and 64-bit platforms will have different sizes for some of these types.
'h' and 'H' are for signed and unsigned shorts, 'i' and 'I' for signed and unsigned integers, 'l' and 'L' for signed and unsigned longs, 'q' and 'Q' for signed and unsigned long longs, 'f' for float, and 'd' for double. With the exception of the Unicode character and the floats, everything maps to Python’s integer type.
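If you want to check the sizes on your own machine, a short sketch like the following prints the itemsize for each numeric type code and confirms which Python type comes back out. The minimum values in the dictionary are the ones listed in the array module documentation. The names sizes and minimums are just for illustration.

```python
from array import array

# Numeric type codes covered above ('u' is left out of this check).
codes = "hHiIlLqQfd"

# Actual size in bytes of one element on this platform.
sizes = {code: array(code).itemsize for code in codes}

# Documented minimum sizes in bytes for the same codes.
minimums = {"h": 2, "H": 2, "i": 2, "I": 2, "l": 4,
            "L": 4, "q": 8, "Q": 8, "f": 4, "d": 8}

for code in codes:
    print(code, "minimum:", minimums[code], "actual:", sizes[code])

shorts = array("h", [1, 2, 3])
doubles = array("d", [1.0, 2.5])
print(type(shorts[0]))    # <class 'int'>
print(type(doubles[0]))   # <class 'float'>
```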
03:23 The reason there are so many type codes has to do with how they map to C types. If you’re reading in a stream of content from another program, you may need to know what the C type is in order to pack it into the array.
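Here is a small sketch of that situation. There is no real external program here, so struct.pack() stands in for one by producing the bytes of three C unsigned shorts. The values and variable names are made up for illustration.

```python
import struct
from array import array

# Stand-in for bytes written by another program: three C unsigned shorts
# in this machine's native byte order.
raw = struct.pack("3H", 10, 20, 30)

# The type code has to match the C type the producer used.
readings = array("H")
readings.frombytes(raw)
print(readings.tolist())   # [10, 20, 30]

# Picking the wrong type code still "works", but splits the same bytes
# into a different number of elements with different values.
misread = array("B")
misread.frombytes(raw)
print(len(misread) == len(raw))   # True: one element per byte
```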
03:35 Knowing the C type is also important if you’re working with a Python extension written in C. Strings themselves are actually arrays. They’re contiguous memory representations of characters. Prior to Python 3, the string was based on ASCII.
03:53 Starting with Python 3 and moving forward, it’s based on Unicode. Strings are actually immutable, meaning they can’t be changed. Now you might be thinking to yourself, “But I do things to strings all the time!” Well, that’s actually an illusion. Anytime you make a change to a string, a newly created string replaces the original. Interestingly enough, there’s no concept of a character in Python. There are only strings of length 1.
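You can see the illusion for yourself with a short sketch like this one. The variable names are only for illustration.

```python
greeting = "hello"

try:
    greeting[0] = "j"             # strings don't support item assignment
except TypeError as error:
    print("not allowed:", error)

original = greeting               # keep a second name for the same string
greeting = greeting + " world"    # builds a new string and rebinds the name

print(original)                   # hello
print(greeting)                   # hello world
print(greeting is original)       # False
```

The name greeting now points at a brand-new string, while the original "hello" was never modified.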
04:22 A string that is a word is composed of an array of strings of length 1. It’s a recursive definition. Because the size of each character is well known, this kind of array is very tightly packed.
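The recursive definition is easy to check. In this sketch, indexing into a string gives back another string, and indexing into that one gives the same thing again. The word used is arbitrary.

```python
word = "python"
first = word[0]

print(type(first))         # <class 'str'>
print(len(first))          # 1
print(first[0] == first)   # True
print(word[0][0][0])       # p
```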
Converting the string to a list with list() iterates over each of the characters in the string and puts it into the list. The variable letters is a list containing each one of the characters from the original string.
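A minimal sketch of that conversion, assuming "hello" as the original string since the example text isn't part of this transcript:

```python
word = "hello"          # assumed example string
letters = list(word)    # one list item per character

print(letters)          # ['h', 'e', 'l', 'l', 'o']
print(word)             # hello (the string itself is untouched)
```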
Because this is now a list, you can assign using subscripts. Everything in Python is an object, including strings. One of the methods on a string is .join(). By calling .join() on the empty string, you can convert your list back into a string.
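Putting the whole round trip together might look like the sketch below. The starting string and the replacement letter are assumptions chosen for illustration.

```python
letters = list("hello")     # assumed example string

letters[0] = "j"            # allowed, because a list is mutable
result = "".join(letters)   # the empty string is the separator

print(letters)              # ['j', 'e', 'l', 'l', 'o']
print(result)               # jello
```

The string you call .join() on is placed between the items, so using a different separator such as "-" would give you "j-e-l-l-o" instead.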