# Sets

In this lesson, you’ll learn about sets. Sets are unordered collections that allow adding, removing, and checking membership in constant time on average. They do not allow duplicates. Here’s an example:

``````
>>> s = set()
>>> s.add("h")
>>> s.add("i")
>>> s
{'h', 'i'} # Could appear in any order
>>> set("hello")
{'e', 'h', 'l', 'o'}
``````
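The operations described above (adding, removing, checking membership, and dropping duplicates) can be sketched in a short script. This is a minimal illustration, not code from the lesson; the variable names `letters`, `had_h`, and `size_after_duplicate_add` are chosen here for the example.

```python
# Build a set from a string; duplicate characters collapse into one element.
letters = set("hello")

# Membership checks take constant time on average.
had_h = "h" in letters

# Adding an element that is already present leaves the set unchanged.
letters.add("l")
size_after_duplicate_add = len(letters)

# Adding a new element grows the set by one.
letters.add("y")

# discard() removes an element if it is present and does nothing otherwise;
# remove() would raise KeyError for a missing element.
letters.discard("e")
letters.discard("z")

# Sets have no defined order, so sort before printing for a stable display.
print(sorted(letters))
```

Sorting is only for display here; the set itself makes no promise about iteration order.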

You also heard about Big O analysis, which is a way to analyze the speed and memory usage of a function or block of code. See the Wikipedia article on time complexity for more information.
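As a rough illustration of what Big O describes, the sketch below counts the comparisons a worst-case search through a list needs as the list grows. The function name `list_lookup_steps` is hypothetical and chosen for this example; it is a simplified model of what `in` does on a list, not the lesson’s code.

```python
def list_lookup_steps(n):
    """Count comparisons for a worst-case linear search over n items."""
    items = list(range(n))
    target = -1  # Not in the list, so every element gets checked
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


# The step count grows in proportion to n, which is what O(n) means.
# A set lookup, by contrast, hashes the target and needs a roughly
# constant number of steps on average, regardless of the set's size.
for n in (10, 100, 1000):
    print(n, list_lookup_steps(n))
```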

James Uejio RP Team

If you want to learn more, here is a Real Python walkthrough video on sets: Sets in Python

I also talk about Big-O analysis, which is a way to analyze the speed and memory usage of a function or block of code. See the Wikipedia article on time complexity for more information.

drawdoowmij

Great information on Sets James – thanks!

Abhishek

Nice lesson James.

I have a question. Wouldn’t `seen_c` have an upper bound at 26, meaning all characters are already encountered at least once and the runtime would then be `26 * n` which is essentially O(n)?

Thanks for your efforts, learning a lot!

James Uejio RP Team

Hi @Abhishek, you’re absolutely right, I can’t believe I missed that. I’m so used to dealing with numbers that it slipped my mind. If you had a list of numbers, for example, then it would be O(n^2), but with only letters, you’re right, it would be O(n).
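The point in this exchange can be checked with a small sketch that counts the comparisons made by a list-based unique count. The helper `count_unique_with_steps` and the inputs below are hypothetical, written for this illustration, and they assume input restricted to the 26 lowercase letters.

```python
def count_unique_with_steps(items):
    """List-based unique count that also tallies the comparisons
    a membership check against the list would perform."""
    seen = []
    comparisons = 0
    for item in items:
        found = False
        for candidate in seen:
            comparisons += 1
            if candidate == item:
                found = True
                break
        if not found:
            seen.append(item)
    return len(seen), comparisons


letters = "abcdefghijklmnopqrstuvwxyz" * 50  # 1,300 items, 26 distinct
numbers = list(range(1300))                   # 1,300 items, all distinct

unique_letters, letter_steps = count_unique_with_steps(letters)
unique_numbers, number_steps = count_unique_with_steps(numbers)

# With letters, `seen` never holds more than 26 entries, so the work is
# bounded by 26 * n. With distinct numbers, `seen` keeps growing and the
# work is n * (n - 1) / 2, which is quadratic.
print(unique_letters, letter_steps)
print(unique_numbers, number_steps)
```

Both inputs have the same length, so any difference in the comparison counts comes only from how large `seen` is allowed to grow.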

Gino Mempin

It might be good to show timeit outputs for simple examples:

``````
In [4]: def count_unique_1(s):
...:     seen_c = []
...:     for c in s:
...:         if c not in seen_c:
...:             seen_c.append(c)
...:     return len(seen_c)

In [5]: def count_unique_2(s):
...:     seen_c = set()
...:     for c in s:
...:         if c not in seen_c:
...:             seen_c.add(c)
...:     return len(seen_c)

In [6]: def count_unique_3(s):
...:     return len({c for c in s})

In [7]: %%timeit
...: count_unique_1("abcdef"*1000)
...:
448 µs ± 1.34 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [8]: %%timeit
...: count_unique_2("abcdef"*1000)
...:
184 µs ± 1.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [9]: %%timeit
...: count_unique_3("abcdef"*1000)
...:
139 µs ± 458 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
``````

James Uejio RP Team

Hi Gino, thank you for posting that great example! Hopefully people will see it.

nhojman

Thanks for a great tutorial James!

A short note, wouldn’t it be an even cleaner approach to write the below instead of the set comprehension?

``````
def count_unique(s):
    return len(set(s))
``````

nhojman

Please excuse my previous comment - I didn’t watch the video to the end. My bad!

James Uejio RP Team

No problem Nick! Glad you’re thinking ahead. Thanks for the kind words :).

avinashk2

Why did the `timeit` comment above give a slower time for `set` than for `list`? I mean, `count_unique_1` is faster than `count_unique_2`.

James Uejio RP Team

@avinashk2 I believe count_unique_1 is 450 µs per loop and count_unique_2 is 184 µs per loop so count_unique_1 is slower.
