collections.Counter

In this lesson, you’ll learn about the collections.Counter class. This class is useful for counting how often each element appears in an iterable. It’s a subclass of dict, so you can use it like a regular dictionary. Here’s an example:

>>> from collections import Counter
>>> count = Counter("hello")
>>> count
Counter({'l': 2, 'h': 1, 'e': 1, 'o': 1})
>>> count['l']
2

You can read more about Counter in the documentation for the Python collections module.

00:00 Now that you’ve learned the defaultdict, let’s learn some more useful data structures that are all found in the collections module. Let’s start with the Counter class.

00:09 So, let’s first try to solve a question without using the Counter class, and then see how Counter simplifies it drastically. Here’s a function top_three_letters(), which takes in a string and finds the three most frequent letters. This function should return a list of tuples, where each tuple contains the character and its count. So, here’s our doctest.

00:29 For top_three_letters("abbccc"), 'c' appears the most, and it appears 3 times, 'b' appears the second most with 2, and 'a', 1.

00:38 top_three_letters("aabbccd") will return a list of [('a', 2), ('b', 2), ('c', 2)], because those are the top three most frequent letters.

00:48 So, the main idea is going to be # loop through the string and store the count for each letter. So, it’s actually useful in an interview to write down your thought process in comments, just so when you’re coding, you can reference the comment without having to remember it all in your head.

01:05 We’re going to first # loop through the string and store the count for each letter. Then, # sort the dictionary by the count and find the top three

01:14 most frequent letters, then, # return a formatted list to match the output. Cool. Let’s do counter = a dictionary, and then for c in string: counter[c] += 1. Well, this is not going to quite work because if c does not already exist in counter, this will error. So, let’s actually use a default dictionary…

01:44 and then set counter to a defaultdict with int, which will default to 0, because int is a function that when called with no arguments will return 0, so our default will be 0.
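The counting step described here can be sketched as follows. This is a minimal illustration, not the exact code from the video: the helper name count_letters and the variable plain_failed are hypothetical, introduced only to show why a plain dict fails and how defaultdict(int) avoids the error.

```python
from collections import defaultdict

def count_letters(string):
    """Count each character; missing keys start at int() == 0."""
    counter = defaultdict(int)
    for c in string:
        counter[c] += 1
    return counter

# A plain dict raises KeyError the first time an unseen key is incremented.
plain = {}
try:
    plain["a"] += 1
except KeyError:
    plain_failed = True
else:
    plain_failed = False

print(plain_failed)                    # True
print(dict(count_letters("aabbccd")))  # {'a': 2, 'b': 2, 'c': 2, 'd': 1}
```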

01:55 Let’s just print out what counter is to make sure that this worked correctly. Run our doctest.

02:04 Go like this, wrap the comments, and see Got: defaultdict […] {'a': 2, 'b': 2, 'c': 2, 'd': 1}. So, this code worked correctly.

02:13 Obviously, it didn’t match the output, but that’s okay. We did step one. Then, let’s sort the dictionary by the count and find the top three. You can use the sorted() function: sorted(counter), which is going to sort the counter, and then we have to pass in a key because we want to sort by the count.

02:32 sorted(), by default, will sort it by the keys. The count is the value, so we need to define our own lambda. The lambda will take in the key and then return the counter[k], which will be the count.

02:47 So now, this is going to sort by the count. Let’s print this out.

02:54 Got: ['d', 'a', 'b', 'c']. So, it looked like it actually sorted in the reverse order of what we want. So, sorted() takes in reverse=True. Save, print.

03:06 Okay, so now it’s ['a', 'b', 'c', 'd'], and ['c', 'b', 'a'] for the first test, which is what we want in the final output. So basically, we can save this in a variable, top_three = sorted(), and then slice only the top three.
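A small sketch of the sorting step, using the second doctest string as input. The variable names ascending and descending are illustrative only. Ties keep their original insertion order because Python's sort is stable, including with reverse=True.

```python
from collections import defaultdict

counter = defaultdict(int)
for c in "aabbccd":
    counter[c] += 1

# Iterating a dict yields its keys, so sorted() returns a list of letters.
# The key function tells sorted() to compare letters by their counts.
ascending = sorted(counter, key=lambda k: counter[k])
descending = sorted(counter, key=lambda k: counter[k], reverse=True)
top_three = descending[:3]

print(ascending)   # ['d', 'a', 'b', 'c']
print(descending)  # ['a', 'b', 'c', 'd']
print(top_three)   # ['a', 'b', 'c']
```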

03:20 I think there’s an extra parenthesis there. Now, we did step one, step two. Now, # return a formatted list to match the output. So let’s do result = a list. for letter in top_three: result.append() a tuple with the letter and the count, which is counter[letter]. Return result.
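Putting the three steps together, the function might look roughly like this. It is a reconstruction from the narration, so details such as exact layout may differ from what appears on screen.

```python
from collections import defaultdict

def top_three_letters(string):
    """
    >>> top_three_letters("abbccc")
    [('c', 3), ('b', 2), ('a', 1)]
    >>> top_three_letters("aabbccd")
    [('a', 2), ('b', 2), ('c', 2)]
    """
    # loop through the string and store the count for each letter
    counter = defaultdict(int)
    for c in string:
        counter[c] += 1

    # sort the dictionary by the count and find the top three most frequent letters
    top_three = sorted(counter, key=lambda k: counter[k], reverse=True)[:3]

    # return a formatted list to match the output
    result = []
    for letter in top_three:
        result.append((letter, counter[letter]))
    return result
```

If this is saved in a file, running python -m doctest on that file should check both examples in the docstring.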

03:47 Cool! It passed. So, I’m just going to drag this so you can see it all on one line, make this a little bit cleaner.

03:58 Maybe something like this: sorted(counter, key, reverse=True), slice to 3. Everything passes, and then you can actually make this cleaner with a list comprehension. So, this becomes (letter, counter[letter]) for letter in top_three.

04:19 And it passes. So, that’s pretty nice. We have sort of a three-step process and used a lot of the tools that we’ve learned so far: defaultdict, sorted(), and list comprehensions.
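The cleaned-up version with a list comprehension could look like the sketch below. It assumes the same counting and sorting steps as before and only replaces the explicit result loop.

```python
from collections import defaultdict

def top_three_letters(string):
    # Step 1: count each letter.
    counter = defaultdict(int)
    for c in string:
        counter[c] += 1
    # Step 2: letters sorted by count, highest first, keeping only three.
    top_three = sorted(counter, key=lambda k: counter[k], reverse=True)[:3]
    # Step 3: build the (letter, count) tuples in a single expression.
    return [(letter, counter[letter]) for letter in top_three]

print(top_three_letters("abbccc"))   # [('c', 3), ('b', 2), ('a', 1)]
print(top_three_letters("aabbccd"))  # [('a', 2), ('b', 2), ('c', 2)]
```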

04:31 Let’s do this all in one line using the built-in Counter class: from collections import Counter. The Counter class is a dict subclass for counting hashable items, so it inherits from the dict class.

04:44 You pass in some iterable with hashable items, and it will count how many times each hashable item appears, and it will be stored in a dictionary. So, if you do something like print(Counter(string))

05:02 drag this a little bit, and you get a Counter object, which is like a dictionary, and it shows 'a' count 2, 'b' count 2, 'c' count 2, 'd' count 1.
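As a quick illustration of that output, here is what Counter produces for the second doctest string. The variable name counts is arbitrary. Unlike a plain dict, a Counter returns 0 for a missing key rather than raising KeyError.

```python
from collections import Counter

counts = Counter("aabbccd")

print(counts)                    # Counter({'a': 2, 'b': 2, 'c': 2, 'd': 1})
print(counts["a"])               # 2
print(counts["z"])               # 0, and 'z' is not added to the Counter
print(isinstance(counts, dict))  # True
```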

05:10 It’s very similar to these three lines of code, here. But there are also some cool built-in methods in the Counter class, like .most_common(3).

05:22 This worked because .most_common(3) actually outputs something in exactly this format. Obviously, you won’t always see a problem where one line using Counter solves everything, but it’s nice to show that this part of the code will do this, and this part will do this.
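The one-line version might look like this sketch. It relies on the documented behavior (Python 3.7 and later) that most_common() orders elements with equal counts by first appearance, which is what makes the second doctest pass.

```python
from collections import Counter

def top_three_letters(string):
    """
    >>> top_three_letters("abbccc")
    [('c', 3), ('b', 2), ('a', 1)]
    >>> top_three_letters("aabbccd")
    [('a', 2), ('b', 2), ('c', 2)]
    """
    # Counter counts the letters; most_common(3) returns the three
    # highest-count (letter, count) tuples, highest first.
    return Counter(string).most_common(3)
```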

05:40 I’ll link some documentation for the Counter class down below. In the next video, you’ll learn about another useful data structure called the deque.

efimius on April 26, 2020

never include links the links under the tutorials.

the other issue I see there is no CC, so I can’t recommend you courses to my wife who is a hard-of-hearing person.

Besides these two issues you and RealPython do a perfect job. I like you guys a lot and always share links on your courses/articles with my colleagues.

James Uejio RP Team on April 27, 2020

Hi @efimius,

Sorry about that, we were having issues uploading the links. In the meantime I will just add a comment with the links.

I am not sure what the best way is to add CC, but I have sent your comment to Dan!

Thank you, I am glad they are helpful!

James Uejio RP Team on April 27, 2020

Here is the Python documentation on Counter: Python collections module

Dan Bader RP Team on April 28, 2020

@efimius Thanks for the feedback. We’re working on bringing subtitles/captions to the platform this year, stay tuned :)

Ricardo J Lima on May 10, 2020

At this point, something is intriguing to me. I don’t consider myself at an advanced level in Python (maybe somewhere intermediate), but one of the first things I learned after the beginner level was Counter. And now, it seems that it is presented as something new to someone who is going to be in an interview… That’s the intriguing part. I would assume that at this point, this someone should not only know Counter but also how to work with its possibilities. The remarks on Counter serve for Sets and DefaultDict. Then what am I missing? What seems new to me (and very much valuable) is the workings behind these functions, which also leads me to more admiration for Python and for this course.

James Uejio RP Team on May 15, 2020

Hi Ricardo! Thanks for the comment. You would be surprised how many people don’t know about the Counter class (I didn’t know until I went through some practice interview solutions). But it is a great segue into Sets/DefaultDict. I’m glad you learned something from the course even though you already knew the Counter class.

Ricardo J Lima on May 16, 2020

Yes, I reached the end of the course and it was extremely valuable! Thanks a lot for your work.

michaljakubiak on Aug. 17, 2020

Without usage of Counter:

def most_common_three(letters: str):
    return sorted({l: letters.count(l) for l in letters}.items(),
                  key=lambda x: x[1], reverse=True)[:3]

print(most_common_three("afrgrtngiqepufhpi2uhfp3o"))
Output:
[('f', 3), ('p', 3), ('r', 2)]
