Using the len() Function in Python

by Stephen Gruppetta Oct 20, 2021 basics python

In many situations, you’ll need to find the number of items stored in a data structure. Python’s built-in function len() is the tool that will help you with this task.

There are some cases in which the use of len() is straightforward. However, there are other times when you’ll need to understand how this function works in more detail and how to apply it to different data types.

In this tutorial, you’ll learn how to:

  • Find the length of built-in data types using len()
  • Use len() with third-party data types
  • Provide support for len() with user-defined classes

By the end of this article, you’ll know when to use the len() Python function and how to use it effectively. You’ll know which built-in data types are valid arguments for len() and which ones you can’t use. You’ll also understand how to use len() with third-party types, such as ndarray in NumPy and DataFrame in pandas, and with your own classes.

Getting Started With Python’s len()

The function len() is one of Python’s built-in functions. It returns the length of an object. For example, it can return the number of items in a list. You can use the function with many different data types. However, not all data types are valid arguments for len().

You can start by looking at the help for this function:

>>> help(len)
Help on built-in function len in module builtins:
len(obj, /)
    Return the number of items in a container.

The function takes an object as an argument and returns the length of that object. The documentation for len() goes a bit further:

Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set). (Source)

When you use built-in data types and many third-party types with len(), the function doesn’t need to iterate through the data structure. The length of a container object is stored as an attribute of the object. The value of this attribute is modified each time items are added to or removed from the data structure, and len() returns the value of the length attribute. This ensures that len() works efficiently.
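As a rough illustration of this bookkeeping, the sketch below records the value reported by len() as items are added to and removed from a list. The variable names are chosen only for this example:

```python
# Track how the length reported by len() changes as a list is modified
items = []
lengths = [len(items)]  # The list starts out empty

for value in ("a", "b", "c"):
    items.append(value)
    lengths.append(len(items))

items.pop()
lengths.append(len(items))

print(lengths)  # [0, 1, 2, 3, 2]
```

Each call to len() simply reports the current length. No counting loop is needed in your own code.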

In the following sections, you’ll learn about how to use len() with sequences and collections. You’ll also learn about some data types that you cannot use as arguments for the len() Python function.

Using len() With Built-in Sequences

A sequence is a container with ordered items. Lists, tuples, and strings are three of the basic built-in sequences in Python. You can find the length of a sequence by calling len():

>>> greeting = "Good Day!"
>>> len(greeting)
9

>>> office_days = ["Tuesday", "Thursday", "Friday"]
>>> len(office_days)
3

>>> london_coordinates = (51.50722, -0.1275)
>>> len(london_coordinates)
2

When finding the length of the string greeting, the list office_days, and the tuple london_coordinates, you use len() in the same manner. All three data types are valid arguments for len().

The function len() always returns an integer as it’s counting the number of items in the object that you pass to it. The function returns 0 if the argument is an empty sequence:

>>> len("")
0
>>> len([])
0
>>> len(())
0

In the examples above, you find the length of an empty string, an empty list, and an empty tuple. The function returns 0 in each case.

A range object is also a sequence that you can create using range(). A range object doesn’t store all the values but generates them when they’re needed. However, you can still find the length of a range object using len():

>>> len(range(1, 20, 2))
10

This range of numbers includes the integers from 1 to 19 with increments of 2. The length of a range object can be determined from the start, stop, and step values.
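To make the arithmetic concrete, here’s a minimal sketch of how such a length could be worked out from the three values alone. The function range_length() is a hypothetical helper written for this illustration and isn’t how Python implements range internally:

```python
def range_length(start, stop, step):
    """Compute the number of items in range(start, stop, step) arithmetically."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        span = stop - start
    else:
        # Mirror a descending range so the same formula applies
        span, step = start - stop, -step
    # Ceiling division of span by step, clamped at zero for empty ranges
    return max(0, (span + step - 1) // step)


print(range_length(1, 20, 2))   # 10
print(range_length(10, 0, -3))  # 4
```

No values are generated along the way, which is why finding the length of even a very large range object is quick.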

In this section, you’ve used the len() Python function with strings, lists, tuples, and range objects. However, you can also use the function with any other built-in sequence.
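For instance, bytes and bytearray objects are built-in sequences too. The short sketch below, which uses an example string chosen for this illustration, shows that len() counts characters in a string but bytes in its encoded form:

```python
text = "caf\u00e9"  # The word "café", written with an escape for the accented letter
encoded = text.encode("utf-8")  # A bytes object

print(len(text))                # 4 characters
print(len(encoded))             # 5 bytes, as the accented letter needs two bytes in UTF-8
print(len(bytearray(encoded)))  # 5
```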

Using len() With Built-in Collections

At some point, you may need to find the number of unique items in a list or another sequence. You can use sets and len() to achieve this:

>>> import random

>>> numbers = [random.randint(1, 20) for _ in range(20)]

>>> numbers
[3, 8, 19, 1, 17, 14, 6, 19, 14, 7, 6, 1, 17, 10, 8, 14, 17, 10, 2, 5]

>>> unique_numbers = set(numbers)
>>> unique_numbers
{1, 2, 3, 5, 6, 7, 8, 10, 14, 17, 19}

>>> len(unique_numbers)
11

You generate the list numbers using a list comprehension, and it contains twenty random numbers ranging between 1 and 20. The output will be different each time the code runs since you’re generating random numbers. In this particular run, there are eleven unique numbers in the list of twenty randomly generated numbers.
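The same idea gives a quick way to check whether a sequence contains duplicates. The sketch below uses a fixed list so that the output is reproducible. The helper has_duplicates() is a hypothetical name, and it assumes that the items are hashable:

```python
def has_duplicates(items):
    """Return True if any item appears more than once (items must be hashable)."""
    return len(set(items)) != len(items)


numbers = [3, 8, 19, 1, 17, 14, 6, 19]

print(len(numbers))             # 8
print(len(set(numbers)))        # 7
print(has_duplicates(numbers))  # True
```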

Another built-in data type that you’ll use often is the dictionary. In a dictionary, each item consists of a key-value pair. When you use a dictionary as an argument for len(), the function returns the number of items in the dictionary:

>>> len({"James": 10, "Mary": 12, "Robert": 11})
3

>>> len({})
0

The output from the first example shows that there are three key-value pairs in this dictionary. As was the case with sequences, len() will return 0 when the argument is either an empty dictionary or an empty set. This leads to empty dictionaries and empty sets being falsy.
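The following sketch puts these points together. It reuses the dictionary from the example above under the illustrative name marks:

```python
marks = {"James": 10, "Mary": 12, "Robert": 11}

# The keys, values, and items views report the same length as the dictionary
view_lengths = (len(marks.keys()), len(marks.values()), len(marks.items()))
print(len(marks), view_lengths)  # 3 (3, 3, 3)

# Empty collections have a length of 0 and are therefore falsy
for empty in ({}, set(), frozenset()):
    print(type(empty).__name__, len(empty), bool(empty))
```

The loop prints a length of 0 and a truth value of False for the empty dictionary, set, and frozen set.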

Exploring len() With Other Built-in Data Types

You can’t use all built-in data types as arguments for len(). For data types that don’t store more than one item within them, the concept of length isn’t relevant. This is the case with numbers and Boolean types:

>>> len(5)
Traceback (most recent call last):
    ...
TypeError: object of type 'int' has no len()

>>> len(5.5)
Traceback (most recent call last):
     ...
TypeError: object of type 'float' has no len()

>>> len(True)
Traceback (most recent call last):
     ...
TypeError: object of type 'bool' has no len()

>>> len(5 + 2j)
Traceback (most recent call last):
     ...
TypeError: object of type 'complex' has no len()

The integer, float, Boolean, and complex types are examples of built-in data types that you can’t use with len(). The function raises a TypeError when the argument is an object of a data type that doesn’t have a length.
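If you’re not sure whether an object has a length, one possible approach is to catch the TypeError. The helper safe_len() below is a hypothetical function written for this sketch, not part of Python:

```python
def safe_len(obj, default=None):
    """Return len(obj), or default if obj doesn't have a length."""
    try:
        return len(obj)
    except TypeError:
        return default


print(safe_len([1, 2, 3]))       # 3
print(safe_len(5))               # None
print(safe_len(5.5, default=0))  # 0
```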

You can also explore whether it’s possible to use iterators and generators as arguments for len():

>>> import random

>>> numbers = [random.randint(1, 20) for _ in range(20)]
>>> len(numbers)
20

>>> numbers_iterator = iter(numbers)
>>> len(numbers_iterator)
Traceback (most recent call last):
     ...
TypeError: object of type 'list_iterator' has no len()

>>> numbers_generator = (random.randint(1, 20) for _ in range(20))
>>> len(numbers_generator)
Traceback (most recent call last):
     ...
TypeError: object of type 'generator' has no len()

You’ve already seen that a list has a length, meaning you can use it as an argument in len(). You create an iterator from the list using the built-in function iter(). In an iterator, each item is fetched whenever it’s required, such as when the function next() is used or in a loop. However, you can’t use an iterator in len().

You get a TypeError when you try to use an iterator as an argument for len(). As the iterator fetches each item as and when it’s needed, the only way to measure its length is to exhaust the iterator. An iterator can also be infinite, such as the iterator returned by itertools.cycle(), and therefore its length can’t be defined.

You can’t use generators with len() for the same reason. The length of these objects can’t be measured without using them up.
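If you do need to know how many items a finite generator yields, you can count them as you consume them. The sketch below uses a small generator of squares as an example and shows the cost of this approach:

```python
squares = (n * n for n in range(5))

# Counting the items consumes every one of them
count = sum(1 for _ in squares)
print(count)  # 5

# The generator is now exhausted, so there's nothing left to use
remaining = list(squares)
print(remaining)  # []
```

If you need both the items and their count, then converting the generator to a list first and calling len() on the list is usually the simpler option. Don’t try either approach on an infinite iterator, as the code will never finish.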

Exploring len() Further With Some Examples

In this section, you’ll learn about some common use cases for len(). These examples will help you understand better when to use this function and how to use it effectively. In some of the examples, you’ll also see cases where len() is a possible solution but there may be more Pythonic ways of achieving the same output.

Verifying the Length of a User Input

A common use case of len() is to verify the length of a sequence input by a user:

# username.py

username = input("Choose a username: [4-10 characters] ")

if 4 <= len(username) <= 10:
    print(f"Thank you. The username {username} is valid")
else:
    print("The username must be between 4 and 10 characters long")

In this example, you use an if statement to check if the integer returned by len() is greater than or equal to 4 and less than or equal to 10. You can run this script and you’ll get an output similar to the one below:

$ python username.py
Choose a username: [4-10 characters] stephen_g
Thank you. The username stephen_g is valid

The username is nine characters long in this case, so the condition in the if statement evaluates to True. You can run the script again and input an invalid username:

$ python username.py
Choose a username: [4-10 characters] sg
The username must be between 4 and 10 characters long

In this case, len(username) returns 2, and the condition in the if statement evaluates to False.
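If you want to reuse this check or test it without typing input each time, you could wrap it in a function. The function name and its default limits below are assumptions made for this sketch:

```python
def is_valid_username(username, min_length=4, max_length=10):
    """Return True if the username length is within the allowed limits."""
    return min_length <= len(username) <= max_length


print(is_valid_username("stephen_g"))  # True
print(is_valid_username("sg"))         # False
```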

Ending a Loop Based on the Length of an Object

You’ll use len() if you need to check when the length of a mutable sequence, such as a list, reaches a specific number. In the following example, you ask the user to enter three username options, which you store in a list:

# username.py

usernames = []

print("Enter three options for your username")

while len(usernames) < 3:
    username = input("Choose a username: [4-10 characters] ")
    if 4 <= len(username) <= 10:
        print(f"Thank you. The username {username} is valid")
        usernames.append(username)
    else:
        print("The username must be between 4 and 10 characters long")

print(usernames)

You’re now using the result from len() in the while statement. If the user enters an invalid username, you don’t keep the input. When the user enters a valid string, you append it to the list usernames. The loop repeats until there are three items in the list.

You could even use len() to check when a sequence is empty:

>>> colors = ["red", "green", "blue", "yellow", "pink"]

>>> while len(colors) > 0:
...     print(f"The next color is {colors.pop(0)}")
...
The next color is red
The next color is green
The next color is blue
The next color is yellow
The next color is pink

You use the list method .pop() with the argument 0 to remove the first item from the list in each iteration until the list is empty. Removing the first item of a list means that all the remaining items need to shift along by one position. If you’re using this method on large lists, you should remove items from the end of the list as this is more efficient. You can also use the deque data type from the collections module in the standard library, which allows you to pop from the left efficiently.
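Here’s a short sketch of the same loop using a deque. The list removed is only there to keep a record of the order in which the items leave the deque:

```python
from collections import deque

colors = deque(["red", "green", "blue", "yellow", "pink"])
removed = []

while len(colors) > 0:
    color = colors.popleft()  # Removes and returns the leftmost item
    removed.append(color)
    print(f"The next color is {color}")

print(len(colors))  # 0
```

The printed lines match the output of the list version, and len() works with a deque in the same way it does with a list.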

There’s a more Pythonic way of achieving the same output by using the truthiness of sequences:

>>> colors = ["red", "green", "blue", "yellow", "pink"]

>>> while colors:
...     print(f"The next color is {colors.pop(0)}")
...
The next color is red
The next color is green
The next color is blue
The next color is yellow
The next color is pink

An empty list is falsy. This means that the while statement interprets an empty list as False. A non-empty list is truthy, and the while statement treats it as True. The value returned by len() determines the truthiness of a sequence. A sequence is truthy when len() returns any non-zero integer and falsy when len() returns 0.

Finding the Index of the Last Item of a Sequence

Imagine you want to generate a sequence of random numbers in the range 1 to 10 and you’d like to keep adding numbers to the sequence until the sum of all the numbers exceeds 21. The following code creates an empty list and uses a while loop to populate the list:

>>> import random

>>> numbers = []
>>> while sum(numbers) <= 21:
...     numbers.append(random.randint(1, 10))
...

>>> numbers
[3, 10, 4, 7]

>>> numbers[len(numbers) - 1]
7

>>> numbers[-1]  # A more Pythonic way to retrieve the last item
7

>>> numbers.pop(len(numbers) - 1)  # You can use numbers.pop(-1)
7

>>> numbers
[3, 10, 4]

You append random numbers to the list until the sum exceeds 21. The output you’ll get will vary as you’re generating random numbers. To display the last number in the list, you use len(numbers) and subtract 1 from it since the first index of the list is 0. Indexing in Python allows you to use the index -1 to obtain the last item in a list. Therefore, although you can use len() in this case, you don’t need to.

You want to remove the last number in the list so that the sum of all numbers in the list doesn’t exceed 21. You use len() again to work out the index of the last item in the list, which you use as an argument for the list method .pop(). Even in this instance, you could use -1 as an argument for .pop() to remove the last item from the list and return it.

Splitting a List Into Two Halves

If you need to split a sequence into two halves, you’ll need to use the index that represents the midpoint of the sequence. You can use len() to find this value. In the following example, you’ll create a list of random numbers and then split it into two smaller lists:

>>> import random

>>> numbers = [random.randint(1, 10) for _ in range(10)]
>>> numbers
[9, 1, 1, 2, 8, 10, 8, 6, 8, 5]

>>> first_half = numbers[: len(numbers) // 2]
>>> second_half = numbers[len(numbers) // 2 :]

>>> first_half
[9, 1, 1, 2, 8]
>>> second_half
[10, 8, 6, 8, 5]

In the assignment statement where you define first_half, you use the slice that represents the items from the beginning of numbers up to the midpoint. You can work out what the slice represents by breaking down the steps you use in the slice expression:

  1. First, len(numbers) returns the integer 10.
  2. Next, 10 // 2 returns the integer 5 as you use the integer division operator.
  3. Finally, 0:5 is a slice that represents the first five items, which have indices 0 to 4. Note that the endpoint is excluded.

In the next assignment, where you define second_half, you use the same expression in the slice. However, in this case, the integer 5 represents the start of the range. The slice is now 5: to represent the items from index 5 up to the end of the list.

If your original list contains an odd number of items, then half of its length will no longer be a whole number. When you use integer division, you obtain the floor of the number. The list first_half will now contain one fewer item than second_half.

You can try this out by creating an initial list of eleven numbers instead of ten. The resulting lists will no longer be halves, but they’ll represent the closest alternative to splitting an odd sequence.
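The sketch below shows this with a fixed list of eleven numbers rather than random ones, so the output is the same on every run. The name midpoint is introduced here for readability:

```python
numbers = list(range(1, 12))  # Eleven items: 1 to 11
midpoint = len(numbers) // 2  # 11 // 2 gives 5

first_half = numbers[:midpoint]
second_half = numbers[midpoint:]

print(first_half)   # [1, 2, 3, 4, 5]
print(second_half)  # [6, 7, 8, 9, 10, 11]
print(len(first_half), len(second_half))  # 5 6
```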

Using the len() Function With Third-Party Libraries

You can also use Python’s len() with several custom data types from third-party libraries. In the last section of this tutorial, you’ll learn how the behavior of len() depends on the class definition. In this section, you’ll look at examples of using len() with data types from two popular third-party libraries.

NumPy’s ndarray

The NumPy library is a cornerstone of many quantitative applications of programming in Python. The library introduces the numpy.ndarray data type. This data type, along with functions within NumPy, is ideally suited for numerical computations and is the building block for data types in other libraries.

Before you can start using NumPy, you’ll need to install the library. You can use Python’s standard package manager, pip, and run the following command in the console:

$ python -m pip install numpy

You’ve installed NumPy, and now you can create a NumPy array from a list and use len() on the array:

>>> import numpy as np

>>> numbers = np.array([4, 7, 9, 23, 10, 6])
>>> type(numbers)
<class 'numpy.ndarray'>

>>> len(numbers)
6

The NumPy function np.array() creates an object of type numpy.ndarray from the list you pass as an argument.

However, NumPy arrays can have more than one dimension. You can create a two-dimensional array by converting a list of lists into an array:

>>> import numpy as np

>>> numbers = [
...     [11, 1, 10, 10, 15],
...     [14, 9, 16, 4, 4],
... ]

>>> numbers_array = np.array(numbers)
>>> numbers_array
array([[11,  1, 10, 10, 15],
       [14,  9, 16,  4,  4]])

>>> len(numbers_array)
2

>>> numbers_array.shape
(2, 5)

>>> len(numbers_array.shape)
2

>>> numbers_array.ndim
2

The list numbers consists of two lists, each containing five integers. When you use this list of lists to create a NumPy array, the result is an array with two rows and five columns. The function returns the number of rows in the array when you pass this two-dimensional array as an argument in len().

To get the size of both dimensions, you use the property .shape, which is a tuple showing the number of rows and columns. You obtain the number of dimensions of a NumPy array either by using .shape and len() or by using the property .ndim.

In general, when you have an array with any number of dimensions, len() returns the size of the first dimension:

>>> import numpy as np

>>> array_3d = np.random.randint(1, 20, [2, 3, 4])
>>> array_3d
array([[[14,  9, 15, 14],
        [17, 11, 10,  5],
        [18,  1,  3, 12]],
       [[ 1,  5,  6, 10],
        [ 6,  3,  1, 12],
        [ 1,  4,  4, 17]]])

>>> array_3d.shape
(2, 3, 4)

>>> len(array_3d)
2

In this example, you create a three-dimensional array with the shape (2, 3, 4) where each element is a random integer from 1 to 19, since the upper bound of 20 is excluded. You use the function np.random.randint() to create an array this time. The function len() returns 2, which is the size of the first dimension.
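You can see the same first-dimension behavior with plain nested lists, without installing anything. The sketch below is a pure-Python analogy and not how NumPy works internally. The helper nested_shape() is hypothetical and assumes that all sublists at the same level have equal lengths:

```python
def nested_shape(data):
    """Return the shape of regularly nested lists as a tuple."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))  # Size of the current dimension
        if not data:
            break
        data = data[0]  # Step into the first item to inspect the next dimension
    return tuple(shape)


# Two blocks, each with three rows of four zeros
grid = [[[0] * 4 for _ in range(3)] for _ in range(2)]

print(len(grid))           # 2
print(nested_shape(grid))  # (2, 3, 4)
```

Calling len() on the outer list only counts the items at the top level, which mirrors how len() reports the size of the first dimension of an array.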

Check out NumPy Tutorial: Your First Steps Into Data Science in Python to learn more about using NumPy arrays.

Pandas’ DataFrame

The DataFrame type in the pandas library is another data type that is used extensively in many applications.

Before you can use pandas, you’ll need to install it by using the following command in the console:

$ python -m pip install pandas

You’ve installed the pandas package, and now you can create a DataFrame from a dictionary:

>>> import pandas as pd

>>> marks = {
...     "Robert": [60, 75, 90],
...     "Mary": [78, 55, 87],
...     "Kate": [47, 96, 85],
...     "John": [68, 88, 69],
... }

>>> marks_df = pd.DataFrame(marks, index=["Physics", "Math", "English"])

>>> marks_df
         Robert  Mary  Kate  John
Physics      60    78    47    68
Math         75    55    96    88
English      90    87    85    69

>>> len(marks_df)
3

>>> marks_df.shape
(3, 4)

The dictionary’s keys are strings representing the names of students in a class. The value of each key is a list with the marks for three subjects. When you create a DataFrame from this dictionary, you define the index using a list containing the subject names.

The DataFrame has three rows and four columns. The function len() returns the number of rows in the DataFrame. The DataFrame type also has a .shape property, which you can use to show that the first dimension of a DataFrame represents the number of rows.

You’ve seen how len() works with a number of built-in data types and also with some data types from third-party modules. In the following section, you’ll learn how to define any class so that it’s usable as an argument for the len() Python function.

You can explore the pandas module further in The Pandas DataFrame: Make Working With Data Delightful.

Using len() on User-Defined Classes

When you define a class, one of the special methods you can define is .__len__(). These special methods are called dunder methods as they have double underscores at the beginning and end of the method names. Python’s built-in len() function calls its argument’s .__len__() method.

In the previous section, you’ve seen how len() behaves when the argument is a pandas DataFrame object. This behavior is determined by the .__len__() method for the DataFrame class, which you can see in the module’s source code in pandas.core.frame:

class DataFrame(NDFrame, OpsMixin):
    # ...
    def __len__(self) -> int:
        """
        Returns length of info axis, but here we use the index.
        """
        return len(self.index)

This method returns the length of the DataFrame’s .index property using len(). This dunder method defines the length of a DataFrame to be equal to the number of rows in the DataFrame as represented by .index.
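The same delegation pattern works in any class. The sketch below defines a hypothetical Scoreboard class, which isn’t part of pandas, whose length is taken from its row labels in a similar way:

```python
class Scoreboard:
    """Hypothetical container whose length is defined by its row labels."""

    def __init__(self, index, columns):
        self.index = list(index)      # Row labels
        self.columns = dict(columns)  # Column name -> list of values

    def __len__(self):
        # Delegate to the length of the row labels
        return len(self.index)


board = Scoreboard(
    index=["Physics", "Math", "English"],
    columns={"Robert": [60, 75, 90], "Mary": [78, 55, 87]},
)

print(len(board))  # 3
```

The class decides what its length means. Here, it’s the number of rows rather than the number of columns or the total number of values.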

You can explore the .__len__() dunder method further with the following toy example. You’ll define a class named YString. This data type is based on the built-in string class, but objects of type YString give the letter Y more importance than all the other letters:

# ystring.py

class YString(str):
    def __init__(self, text):
        super().__init__()

    def __str__(self):
        """Display string as lowercase except for Ys that are uppercase"""
        return self.lower().replace("y", "Y")

    def __len__(self):
        """Returns the number of Ys in the string"""
        return self.lower().count("y")

The .__init__() method of YString initializes the object using the .__init__() method of the parent str class. You achieve this using the function super(). The .__str__() method defines the way the object is displayed. The functions str(), print(), and format() all call this method. For this class, you represent the object as an all-lowercase string with the exception of the letter Y, which you display as uppercase.

For this toy class, you define the object’s length as the number of occurrences of the letter Y in the string. Therefore, the .__len__() method returns the count of the letter Y.

You can create an object of class YString and find its length. The module name used for the example above is ystring.py:

>>> from ystring import YString

>>> message = YString("Real Python? Yes! Start reading today to learn Python")

>>> print(message)
real pYthon? Yes! start reading todaY to learn pYthon

>>> len(message)  # Returns number of Ys in message
4

You create an object of type YString from an object of type str and show the representation of the object using print(). You then use the object message as an argument for len(). This calls the class’s .__len__() method, and the result is the number of occurrences of the letter Y in message. In this case, the letter Y appears four times.

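The YString class is not a very useful one, but it helps illustrate how you can customize the behavior of len() to suit your needs. The .__len__() method must return a non-negative integer. Otherwise, len() raises an error: a ValueError if the returned integer is negative, or a TypeError if the returned value can’t be interpreted as an integer.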
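The sketch below shows what happens when .__len__() breaks this rule. Both classes are deliberately broken examples written for this illustration:

```python
class NegativeLength:
    def __len__(self):
        return -1  # Invalid: a length can't be negative


class FloatLength:
    def __len__(self):
        return 1.5  # Invalid: a length must be an integer


errors = []
for obj in (NegativeLength(), FloatLength()):
    try:
        len(obj)
    except (ValueError, TypeError) as error:
        errors.append(type(error).__name__)

print(errors)  # ['ValueError', 'TypeError']
```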

Another special method is the .__bool__() method, which determines how an object can be converted to a Boolean. The .__bool__() dunder method is not normally defined for sequences and collections. In these cases, the .__len__() method determines the truthiness of an object:

>>> from ystring import YString

>>> first_test = "tomorrow"
>>> second_test = "today"

>>> bool(first_test)
True

>>> bool(YString(first_test))
False

>>> bool(second_test)
True

>>> bool(YString(second_test))
True

The variable first_test doesn’t have a Y in it. As shown by the output from bool(), the string is truthy as it’s non-empty. However, when you create an object of type YString from this string, the new object is falsy as there are no Y letters in the string. Therefore, len() returns 0. In contrast, the variable second_test does include the letter Y, and so both the string and the object of type YString are truthy.
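When a class defines both methods, .__bool__() takes precedence over .__len__() when Python works out the truthiness of an object. The two classes in the sketch below are hypothetical and exist only to show this:

```python
class Basket:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class AlwaysTruthyBasket(Basket):
    def __bool__(self):
        # Takes precedence over .__len__() in Boolean contexts
        return True


print(bool(Basket([])))              # False, as .__len__() returns 0
print(bool(AlwaysTruthyBasket([])))  # True, as .__bool__() is used instead
print(len(AlwaysTruthyBasket([])))   # 0
```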

You can read more about using object-oriented programming and defining classes in Object-Oriented Programming (OOP) in Python 3.

Conclusion

You’ve explored how to use len() to determine the number of items in sequences, collections, and other data types that hold several items at a time, such as NumPy arrays and pandas DataFrames.

The len() Python function is a key tool in many programs. Some of its uses are straightforward, but there’s a lot more to this function than its most basic use cases, as you’ve seen in this tutorial. Knowing when you can use this function and how to use it effectively will help you write neater code.

In this tutorial, you’ve learned how to:

  • Find the length of built-in data types using len()
  • Use len() with third-party data types
  • Provide support for len() with user-defined classes

You now have a good foundation for understanding the len() function. Learning more about len() helps you understand the differences between data types better. You’re ready to use len() in your algorithms and to improve the functionality of some of your class definitions by enhancing them with the .__len__() method.

About Stephen Gruppetta

Stephen worked as a research physicist in the past, developing imaging systems to detect eye disease. He now teaches coding in Python to kids and adults. And he's almost finished writing his first Python coding book for beginners.
