Choosing an Array

There are a number of built-in data structures you can choose from when it comes to implementing arrays in Python. In this section, you’ve focused on core language features and data structures included in the standard library.

If you’re willing to go beyond the Python standard library, then third-party packages like NumPy and pandas offer a wide range of fast array implementations for scientific computing and data science.

In this section, you learned:

  • To store arbitrary objects, potentially with mixed data types, use a list or a tuple
  • When you need mutability, choose a list
  • For numeric data where memory and performance are important, select array.array
  • For textual data represented as Unicode characters, use the built-in str
  • For a mutable string-like data structure, use a list of characters
  • For storing a contiguous block of bytes, use the immutable bytes type, or a bytearray if you need mutability
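The choices in the list above can be sketched in a few lines. This is only an illustration: the variable names and sample values are made up for the example and don't come from the course materials.

```python
from array import array

# list: mutable, can hold arbitrary objects of mixed types
mixed = [1, "two", 3.0]
mixed.append(None)

# tuple: the immutable counterpart of the list
point = (1, "two", 3.0)

# array.array: mutable, but every item must match the typecode ("d" = C double)
readings = array("d", [1.5, 2.5, 3.5])
readings.append(4.5)

# str: immutable sequence of Unicode characters
word = "array"

# list of characters: a mutable stand-in for a string
chars = list(word)
chars[0] = "A"
capitalized = "".join(chars)

# bytes: immutable block of integers in range(256)
raw = bytes([72, 105])

# bytearray: the mutable version of bytes
buffer = bytearray(raw)
buffer[1] = 111  # ord("o")

print(mixed, point, readings, capitalized, raw, buffer)
```

Trying to assign to an item of the tuple, the str, or the bytes object would raise a TypeError, which is the practical meaning of "immutable" here.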

In most cases, you should start out with a simple list. You’ll only need to specialize later on if performance or storage space becomes an issue. Most of the time, using a general-purpose array data structure like list gives you the fastest development speed and the most programming convenience.
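If storage space does become a concern, a rough measurement can help you decide whether specializing is worth it. The sketch below assumes CPython and compares a list of integers with an array.array holding the same values. The names numbers and packed are hypothetical, and the exact byte counts depend on your platform and Python version.

```python
import sys
from array import array

numbers = list(range(1000))
packed = array("q", numbers)  # "q" = signed 64-bit integer

# getsizeof() on a list counts only the container and its references,
# so add the size of each int object it points to.
list_bytes = sys.getsizeof(numbers) + sum(sys.getsizeof(n) for n in numbers)

# An array stores the raw values inline, so its own size covers the data.
array_bytes = sys.getsizeof(packed)

print(f"list:  {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
```

The list figure is an upper estimate, because CPython shares small integer objects between references. Even so, the gap is usually large enough to show why array.array is the suggested choice for big numeric sequences.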

00:00 In the previous lesson, I described the binary array mechanisms, bytes and bytearray objects. In this lesson, I’ll help you choose between the array technologies and give you some further points for investigation.

00:14 If you’re going to be storing arbitrary objects that aren’t of the same type, then you need to use either a list or a tuple inside of Python.

00:22 If it’s going to be numbers and you need memory efficiency, then the array object from the array library is a good choice. For text, you’re going to be using str (string) and lists of str, which as of Python 3 are all Unicode-based.

00:37 And if you’re dealing with immutable binary data, the bytes object is your go-to, or bytearray if you need mutable data. Outside of the standard library, two very useful packages are NumPy and pandas.

00:50 They both specialize in dealing with large amounts of data, so if you’re doing a bigger project that has to be able to crunch a lot of numbers, then NumPy or pandas have array types that may be of use to you.

01:02 If you’re not sure how to choose or what to do, then really you should just start with a list. Stick with that until you know you need some sort of performance change.

01:11 The list will get you there for most of your cases.

01:15 For further information on the built-in types and the standard library, you can always look at the Python documentation. Here are the links for lists, tuples, the namedtuple factory, the array library, bytes, and bytearray.

01:33 If you’d like to learn more about Unicode and how those character encodings work, there’s a course available for you at this link. If you’re hungry for dealing with larger amounts of data, then this article on how to deal with pandas is a good first step.

01:47 Or if you’d prefer to try NumPy instead, there is a good link here as well.

01:54 Thanks for continuing to use Real Python! I appreciate your attention. I hope the course was useful for you.

RobyB on Feb. 24, 2021

Thank you!

Robert Curtiss on Feb. 26, 2021

Excellent, to the point, and I learned a lot, thank you

Cheers :)

ricardoaparicio92 on Feb. 28, 2021

Great video. Maybe some extra information on the time complexity of the data structures would have been nice. I believe it was only mentioned once.

Christopher Trudeau RP Team on March 1, 2021

Hi Ricardo,

Thanks for the feedback. It is always a balancing act with courses like this, since some of our audience isn’t familiar with O-notation. The course (and its coming sequels) is based on the following article:

realpython.com/python-data-structures/

You may find a bit more about the time complexity data you’re interested in there.

Bartosz Zaczyński RP Team on March 2, 2021

@ricardoaparicio92 To add my two cents, you might be interested in reading through the time-space complexity analysis of the Binary Search Algorithm. While it doesn’t specifically talk about data structures, it’ll let you understand the concept so that you can estimate one yourself.

Adrian on March 25, 2021

Great balance between the code you should know and why things are the way they are.

mikehillsnc on July 26, 2022

I’m a newbie and found this very helpful. One point that wasn’t answered is whether sorting and lookup are case insensitive.

Bartosz Zaczyński RP Team on July 26, 2022

@mikehillsnc Regardless of which sequence type you use in Python, sorting and lookup are case-sensitive operations.
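
A small sketch of what that looks like in practice, and one common way to opt into case-insensitive behavior. The names list is made up for illustration, and str.casefold is just one reasonable key function, not the only option.

```python
names = ["bob", "Alice", "carol", "Dave"]

# Default ordering compares Unicode code points, so uppercase sorts first
default_order = sorted(names)

# Passing str.casefold as the key ignores case during the comparison
folded_order = sorted(names, key=str.casefold)

# Membership tests are case-sensitive as well
exact_match = "alice" in names
folded_match = "alice" in (name.casefold() for name in names)

print(default_order)  # ['Alice', 'Dave', 'bob', 'carol']
print(folded_order)   # ['Alice', 'bob', 'carol', 'Dave']
print(exact_match, folded_match)  # False True
```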
