Calling .split(): Default Behavior
00:00
Now let’s see what happens when you call the .split() method in its most basic form, without giving it any arguments for separator or maxsplit.
00:09
When you call the .split() method with no arguments, it splits on any run of whitespace. It’s not just looking for single spaces. It looks for any contiguous block of whitespace characters.
00:20 This includes regular spaces, tabs, newlines, and carriage returns. It treats multiple consecutive whitespace characters as a single delimiter. Whitespace at the very beginning and whitespace or newlines at the very end don’t create empty strings at the start or end of your words list.
00:37 It returns a list of the non-empty substrings. The words list you get back contains “Hello”, “World”, and “Python” in this example. This default behavior is very useful when you want to quickly break a sentence into words, or process user input where people might have typed extra spaces.
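As a minimal sketch of the behavior just described: the input string below is an assumption, made up to produce the three words named in the lesson, and the variable names are illustrative only.

```python
# Hypothetical input: leading spaces, a tab, blank lines, and a trailing
# carriage return plus newline, chosen to exercise every whitespace case.
text = "  Hello \t World\n\n Python \r\n"

# No arguments: split on runs of whitespace, dropping leading/trailing runs.
words = text.split()

print(words)  # ['Hello', 'World', 'Python']
```

Every run of whitespace counts as one delimiter, so no empty strings appear in the result.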
00:54 Now let’s take a look at an example.
00:59
If we have a sentence or a paragraph that looks like this, with inconsistent whitespace between the words and some whitespace at the beginning and at the end, you can use the default behavior of the .split() method to extract the words from it.
01:24
Let’s print the wordlist and see the words that it has.
01:34
So you can see how clean and easy that was. The default .split() method handled everything for you. It ignored the leading and trailing whitespace, treated each newline and each clump of spaces between the words as a single separator, and gave you exactly what you wanted: a clean list of words.
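The on-screen code is not part of this transcript, so the following is only a sketch of what such an example might look like. The paragraph text and the name word_list are assumptions.

```python
# Hypothetical messy paragraph: leading and trailing whitespace, uneven
# gaps between words, a tab, and an empty line in the middle.
paragraph = "   Python   makes\ntext \t processing\n\n   straightforward.   \n"

# Default split: no separator and no maxsplit.
word_list = paragraph.split()

print(word_list)
# ['Python', 'makes', 'text', 'processing', 'straightforward.']
```

Note that punctuation stays attached to its word, because only whitespace is treated as a delimiter.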
01:54
So the key takeaway here is: use the default .split() when your goal is to get words from natural or messy text. Its algorithm is designed to handle runs of any kind of whitespace.
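To see why the default is the better fit for messy text, it can help to compare it with passing a single space as the separator. This is a small sketch with a made-up input string, not code from the lesson.

```python
# Hypothetical input with doubled spaces and a trailing space.
messy = "  a  b "

# Default: runs of whitespace collapse into one delimiter.
default_result = messy.split()
print(default_result)   # ['a', 'b']

# Explicit " ": every single space is a delimiter, so empty strings appear.
explicit_result = messy.split(" ")
print(explicit_result)  # ['', '', 'a', '', 'b', '']
```

With an explicit separator, each individual space produces a split, which is usually not what you want for text typed by people.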
02:05
This default behavior makes .split() an amazing tool for cleaning up and parsing text from almost any source.