Splitting With a Delimiter

00:00 While splitting by whitespace is super useful, there will be many times when your data isn’t separated by whitespace. It might be separated by a very specific character or even a sequence of characters.

00:12 That separator is called a delimiter. Think of CSV files where everything is separated by commas, or maybe log files where different pieces of information are separated by pipe symbols or tabs.

00:23 Next, you are going to learn how to tell the .split() method exactly what character or characters to use as its cutting points.

00:32 What exactly is a delimiter? It’s the special character, or even a group of characters, that signals a boundary or a separation point between different pieces of information you care about within a larger string.

00:45 Besides the common delimiters, like commas, spaces, tabs, or newlines, you can use any character as a delimiter, like a pipe symbol, semicolon, or colon, or even a multi-character sequence.

00:57 The point is, the .split() method is a very versatile method and you can use any character or sequence of characters as its delimiter. Now, you absolutely must know or figure out by looking at your data how the pieces of information are separated.
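As a minimal sketch of that versatility, the snippet below splits three strings on three different delimiters. The sample strings and variable names are made up for illustration and don't come from the lesson.

```python
# Hypothetical sample strings, each using a different delimiter.
log_line = "2024-01-15|ERROR|disk full"
settings = "debug;verbose;color"
row = "name\tage\tcity"

log_fields = log_line.split("|")   # pipe-delimited
flags = settings.split(";")        # semicolon-delimited
columns = row.split("\t")          # tab-delimited

print(log_fields)  # ['2024-01-15', 'ERROR', 'disk full']
print(flags)       # ['debug', 'verbose', 'color']
print(columns)     # ['name', 'age', 'city']
```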

01:12 If you tell the .split() method to use the wrong delimiter, the results you get will be incorrect or not what you expect. For example, if your data is separated by tabs and you try to split by commas, you’ll likely just get your original string back as a single item in a list. If the data happens to contain commas as well, you’ll get pieces cut at the commas instead of at the tabs.
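Here is a small sketch of what the wrong delimiter does, assuming a made-up tab-separated string. Splitting on a comma that never appears leaves the whole string as the only item in the list.

```python
# Hypothetical tab-separated record (not from the lesson).
record = "apple\tbanana\tcherry"

wrong = record.split(",")   # the string contains no commas
right = record.split("\t")  # the actual delimiter

print(wrong)  # ['apple\tbanana\tcherry']
print(right)  # ['apple', 'banana', 'cherry']
print(len(wrong), len(right))  # 1 3
```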

01:34 Now let’s see how we can use this delimiter in the .split() method.

01:39 You pass the delimiter, such as a comma, semicolon, tab, or any other character, as the separator argument (named sep) of the .split() method to specify how the string should be divided into parts.

01:51 Let’s look at some examples.

01:54 One of the most common tasks you’ll face is processing CSV-like data, where CSV stands for comma-separated values. If you have a string of fruit names separated by commas, just pass the comma as the separator argument, and you’ll have a list of fruit names. By passing a comma as the argument to .split(), you’ve explicitly told Python to use commas as the cutting points.

02:15 It goes through the string and breaks it apart every time it finds a comma.
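The fruit example might look something like the sketch below. The exact string used on screen isn't part of this transcript, so the fruit names here are an assumption.

```python
# Assumed sample data; the lesson's exact string isn't shown in the transcript.
fruits = "apple,banana,cherry,mango"

fruit_list = fruits.split(",")
print(fruit_list)  # ['apple', 'banana', 'cherry', 'mango']

# .split(",") cuts at the comma only, so a space after each comma
# stays attached to the following item.
spaced = "apple, banana, cherry"
spaced_list = spaced.split(",")
print(spaced_list)  # ['apple', ' banana', ' cherry']
```

If your data has a space after every comma, splitting on the two-character sequence ", " is one way to avoid the leading spaces.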

02:25 Handling file paths is another great example. File paths use forward slashes on macOS and Linux and backslashes on Windows, so just make sure you use the correct slash as the delimiter when splitting them.

02:42 In this example, you use the forward slash as the delimiter to break the path into its individual components. This can be very useful for getting just the filename or navigating directory structures.
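A rough sketch of the path example, using a hypothetical POSIX-style path since the one shown in the video isn't in the transcript. Note that a path starting with a slash produces an empty string as the first item.

```python
# Hypothetical path, chosen only for illustration.
path = "/home/user/documents/report.txt"

parts = path.split("/")
print(parts)  # ['', 'home', 'user', 'documents', 'report.txt']

# The last component is the filename.
filename = parts[-1]
print(filename)  # report.txt
```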

02:57 Using the .split() method to handle CSV data or file paths is a quick way to do it. However, for any serious work with CSV files or file paths, you should use Python’s dedicated modules. For CSV files, the csv module from the standard library is much more robust, and for file paths, the pathlib module is the modern, powerful, operating-system-aware way to handle them.
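To give a feel for the difference, here is a short sketch with made-up data. It uses a quoted CSV field that contains a comma, which a plain .split(",") cuts in the wrong place, and PurePosixPath so the output is the same on every operating system (in real code you would usually reach for pathlib.Path).

```python
import csv
import io
from pathlib import PurePosixPath

# Hypothetical CSV line with a quoted field that contains a comma.
line = 'apple,"banana, ripe",cherry'

naive = line.split(",")
print(naive)   # ['apple', '"banana', ' ripe"', 'cherry']

# csv.reader understands the quoting and keeps the field together.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed)  # ['apple', 'banana, ripe', 'cherry']

# Hypothetical path; PurePosixPath only parses the text and never touches the disk.
path = PurePosixPath("/home/user/documents/report.txt")
print(path.name)    # report.txt
print(path.suffix)  # .txt
print(path.parts)   # ('/', 'home', 'user', 'documents', 'report.txt')
```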

03:19 You will find dedicated tutorials for the csv module and the pathlib module in the additional resources section.

03:28 So far, you’ve seen how you can use the .split() method in its default form and how you can use delimiters to split on any character that you want.

03:36 That’s great when your data is clean and always uses the same single-character delimiter. But what if the items in your string are separated by a sequence of several characters instead?

03:48 Not to worry. The .split() method accepts multi-character delimiters, too. Keep in mind that it always matches the whole separator as one unit, so a single call can’t switch between a comma, a semicolon, and a pipe symbol. Let’s see how it’s done.

03:56 Now, let’s say you have a string like this, where the required items are separated by multi-character delimiters. So far, you’ve used the .split() method with single-character delimiters, like a comma or a space, or in its default form without any separator argument.

04:12 But here the items are separated by multi-character delimiters. No problem. You can still get the list easily.

04:23 You can just call fruits.split() and pass the multi-character delimiter as the argument.

04:31 Let’s print the fruit_list,

04:40 and now you have the list of fruits. So yes, Python’s .split() method works with multi-character delimiters, too. Now that you know how to split using multi-character delimiters, let’s move on and learn how to control how many splits the .split() method should make.
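For reference, the multi-character split described above could look like this sketch. The fruits and fruit_list names come from the lesson, but the string and the "--" delimiter are assumptions, since the on-screen code isn't in the transcript.

```python
# Assumed sample string and delimiter; only the variable names come from the lesson.
fruits = "apple--banana--cherry--mango"

fruit_list = fruits.split("--")
print(fruit_list)  # ['apple', 'banana', 'cherry', 'mango']

# The separator is matched as a whole, so a lone "-" is not a cutting point.
mixed = "apple-banana--cherry".split("--")
print(mixed)  # ['apple-banana', 'cherry']
```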
