Choosing an Array
There are a number of built-in data structures you can choose from when it comes to implementing arrays in Python. In this section, you’ve focused on core language features and data structures included in the standard library.
If you’re willing to go beyond the Python standard library, then third-party packages like NumPy and pandas offer a wide range of fast array implementations for scientific computing and data science.
In this section you learned:
- To store arbitrary objects, potentially with mixed data types, use a list or a tuple
- When you need mutability, choose a list
- For numeric data where memory and performance are important, select array.array
- For textual data represented as Unicode characters, use the built-in str
- For a mutable string-like data structure, use a list of characters
- For storing a contiguous block of bytes, use the immutable bytes type or a bytearray
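As a quick illustration of the choices in this list, here is a minimal sketch that uses only the standard library. The variable names and sample values are made up for the example and don't come from the course.

```python
from array import array

# list: mutable and able to hold mixed types
mixed = [1, "two", 3.0]
mixed.append(None)

# tuple: immutable, can also hold mixed types
point = (1, "two", 3.0)

# array.array: typed, compact storage for numbers ("d" is a C double)
readings = array("d", [1.5, 2.5, 3.5])
readings.append(4.5)

# str: immutable Unicode text
word = "array"

# list of characters: a mutable stand-in for a string
chars = list(word)
chars[0] = "A"
edited = "".join(chars)

# bytes: immutable block of bytes; bytearray: its mutable counterpart
frozen = bytes([0, 1, 2, 3])
buffer = bytearray(frozen)
buffer[0] = 255
```

Trying to assign to point[0], word[0], or frozen[0] raises a TypeError, which is the practical difference between the immutable and mutable options.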
In most cases, you should start out with a simple list. You’ll only need to specialize later on if performance or storage space becomes an issue. Most of the time, using a general-purpose array data structure like list gives you the fastest development speed and the most programming convenience.
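If you suspect that storage space is becoming an issue, one rough way to check is to compare the footprint of a list with that of an array.array holding the same numbers. The sketch below assumes CPython, where sys.getsizeof() reports only the container itself, so the float objects a list points to are added separately. Exact byte counts vary by platform and Python version.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]

# A list stores references; each float is a separate object,
# so add the size of every element to the size of the list.
list_total = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array stores the raw C doubles inline in one buffer.
packed = array("d", values)
array_total = sys.getsizeof(packed)

print(f"list + float objects: {list_total} bytes")
print(f"array('d'):           {array_total} bytes")
```

For small collections the difference rarely matters, which is why starting with a list is usually the right call.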
Here are resources for further documentation about arrays:
- Built-in Types: Lists | Python Documentation
- Built-in Types: Tuples | Python Documentation
- collections – Container Datatypes: namedtuple() | Python Documentation
- array – Efficient arrays of numeric values | Python Documentation
- Built-in Types: Bytes Objects | Python Documentation
- Built-in Types: Bytearray Objects | Python Documentation
Here are additional Real Python resources about arrays:
- Look Ma, No For-Loops: Array Programming With NumPy - Real Python Article
- The Pandas DataFrame: Make Working With Data Delightful - Real Python Article
- Unicode in Python: Working With Character Encodings - Real Python Course
- Lists and Tuples in Python - Real Python Course
- Strings and Character Data in Python - Real Python Course
Congratulations, you made it to the end of the course! What’s your #1 takeaway or favorite thing you learned? How are you going to put your newfound skills to use? Leave a comment in the discussion section and let us know.
00:00 In the previous lesson, I described the binary array mechanisms, bytes and bytearray objects. In this lesson, I’ll help you choose between the array technologies and give you some further points for investigation.
00:14 If you’re going to be storing arbitrary objects that aren’t of the same type, then you need to use either a list or a tuple inside of Python.
00:22 If it’s going to be numbers and you need memory efficiency, then the array library array object is a good choice. For texts, you’re going to be using str (string) and lists of str, which as of Python 3 is all Unicode-based.
00:37 And if you’re dealing with immutable binary data, the bytes object is your go-to, or bytearray if you need mutable data. Outside of the standard library, two very useful packages are NumPy and pandas.
00:50 They both specialize in dealing with large amounts of data, so if you’re doing a bigger project that has to be able to crunch a lot of numbers, then NumPy or pandas have array types that may be of use to you.
01:02 If you’re not sure how to choose or what to do, then really you should just start with a list. Stick with that until you know you need some sort of performance change.
01:11 The list will get you there for most of your cases.
01:15 For further information on the built-in types and the standard library, you can always look at the Python documentation. Here’s the link for lists, tuples, the namedtuple factory, the array library, bytes, and the bytearray.
01:33 If you’d like to learn more about Unicode and how those character encodings work, there’s a course available for you at this link. If you’re hungry for dealing with larger amounts of data, then this article on how to deal with pandas is a good first step.
01:47 Or if you’d prefer to try NumPy instead, there is a good link here as well.
01:54 Thanks for continuing to use Real Python! I appreciate your attention. I hope the course was useful for you.
Robert Curtiss on Feb. 26, 2021
Excellent, to the point, and I learned a lot, thank you
Cheers :)
ricardoaparicio92 on Feb. 28, 2021
Great video. Maybe some extra information on the time complexity of the data structures would have been nice. I believe it was only mentioned once.
Christopher Trudeau RP Team on March 1, 2021
Hi Ricardo,
Thanks for the feedback. It is always a balancing act with courses like this, since some of our audience isn’t familiar with O-notation. The course (and its coming sequels) is based on the following article:
realpython.com/python-data-structures/
You may find a bit more about the time complexity data you’re interested in there.
Bartosz Zaczyński RP Team on March 2, 2021
@ricardoaparicio92 To add my two cents, you might be interested in reading through the time-space complexity analysis of the Binary Search Algorithm. While it doesn’t specifically talk about data structures, it’ll let you understand the concept so that you can estimate one yourself.
Adrian on March 25, 2021
Great balance between the code you should know and why things are the way they are.
mikehillsnc on July 26, 2022
I’m a newbie and found this very helpful. One point that wasn’t answered is whether sorting and lookup are case insensitive.
Bartosz Zaczyński RP Team on July 26, 2022
@mikehillsnc Regardless of which sequence type you use in Python, sorting and lookup are case-sensitive operations.
RobyB on Feb. 24, 2021
Thank you!