Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Splitting, Concatenating, and Joining Python Strings
There are few guarantees in life: death, taxes, and programmers needing to deal with strings. Strings can come in many forms. They could be unstructured text, usernames, product descriptions, database column names, or really anything else that we describe using language.
With the near-ubiquity of string data, it’s important to master the tools of the trade when it comes to strings. Luckily, Python makes string manipulation very simple, especially when compared to other languages and even older versions of Python.
In this article, you will learn some of the most fundamental string operations: splitting, concatenating, and joining. Not only will you learn how to use these tools, but you will walk away with a deeper understanding of how they work under the hood.
Splitting Strings
In Python, strings are represented as str objects, which are immutable: this means that the object as represented in memory cannot be directly altered. These two facts can help you learn (and then remember) how to use .split().
Have you guessed how those two features of strings relate to splitting functionality in Python? If you guessed that .split() is an instance method because strings are a special type, you would be correct! In some other languages (like Perl), the original string serves as an input to a standalone split() function rather than a method called on the string itself.
Note: Ways to Call String Methods
String methods like .split() are mainly shown here as instance methods that are called on strings. They can also be called through the str class, with the string passed as the first argument, but this isn’t ideal because it’s more “wordy.” For the sake of completeness, here’s an example:
# Avoid this:
str.split('a,b,c', ',')
This is bulky and unwieldy when you compare it to the preferred usage:
# Do this instead:
'a,b,c'.split(',')
For more on instance, class, and static methods in Python, check out our in-depth tutorial.
What about string immutability? This should remind you that string methods are not in-place operations, but they return a new object in memory.
Note: In-Place Operations
In-place operations are operations that directly change the object on which they are called. A common example is the .append() method that is used on lists: when you call .append() on a list, that list is directly changed by adding the input to .append() to the same list.
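As a small sketch of the difference (the variable names are illustrative and not part of the tutorial's examples), compare a list method that works in place with a string method that returns a new object:

```python
# list.append() changes the list in place and returns None.
numbers = [1, 2]
append_result = numbers.append(3)

# str.upper() leaves the original string untouched and returns a new string.
greeting = "hello"
shouted = greeting.upper()

print(numbers)        # [1, 2, 3]
print(append_result)  # None
print(greeting)       # hello
print(shouted)        # HELLO
```

The list changed without any assignment, while the string only "changed" in the sense that a second object was created and bound to a new name.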
Splitting Without Parameters
Before going deeper, let’s look at a simple example:
>>> 'this is my string'.split()
['this', 'is', 'my', 'string']
This is actually a special case of a .split() call, which I chose for its simplicity. Without any separator specified, .split() will count any whitespace as a separator.
Another feature of the bare call to .split() is that it automatically cuts out leading and trailing whitespace, as well as consecutive whitespace. Compare calling .split() on the following string without a separator parameter and with ' ' as the separator parameter:
>>> s = ' this   is  my string '
>>> s.split()
['this', 'is', 'my', 'string']
>>> s.split(' ')
['', 'this', '', '', 'is', '', 'my', 'string', '']
The first thing to notice is that this showcases the immutability of strings in Python: subsequent calls to .split() work on the original string, not on the list result of the first call to .split().
The second, and main, thing you should see is that the bare .split() call extracts the words in the sentence and discards any whitespace.
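"Any whitespace" covers more than the space character. As a quick sketch (the sample string is made up for illustration), tabs and newlines are treated the same way by a bare .split():

```python
# Tabs, newlines, and runs of spaces all count as whitespace for a bare .split().
messy = "  this\tis \n my   string\n"
words = messy.split()

print(words)  # ['this', 'is', 'my', 'string']
```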
Specifying Separators
.split(' '), on the other hand, is much more literal. When there are leading or trailing separators, you’ll get an empty string, which you can see in the first and last elements of the resulting list.
Where there are multiple consecutive separators (such as between “this” and “is” and between “is” and “my”), the first one will be used as the separator, and the subsequent ones will find their way into your result list as empty strings.
Note: Separators in Calls to .split()
While the above example uses a single space character as a separator input to .split(), you aren’t limited in the types of characters or length of strings you use as separators. The only requirement is that your separator be a string. You could use anything from "..." to even "separator".
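To illustrate, here is a minimal sketch using multi-character separators. The sample strings are invented for this example. It also shows one limit worth knowing: an empty string is rejected as a separator with a ValueError.

```python
# A separator can be longer than one character.
dotted = "a...b...c"
dotted_parts = dotted.split("...")

worded = "2020separator2021separator2022"
worded_parts = worded.split("separator")

print(dotted_parts)  # ['a', 'b', 'c']
print(worded_parts)  # ['2020', '2021', '2022']

# An empty string is not accepted as a separator.
try:
    dotted.split("")
    empty_separator_allowed = True
except ValueError:
    empty_separator_allowed = False

print(empty_separator_allowed)  # False
```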
Limiting Splits With Maxsplit
.split() has another optional parameter called maxsplit. By default, .split() will make all possible splits when called. When you give a value to maxsplit, however, only the given number of splits will be made. Using our previous example string, we can see maxsplit in action:
>>> s = "this is my string"
>>> s.split(maxsplit=1)
['this', 'is my string']
As you see above, if you set maxsplit to 1, the first whitespace region is used as the separator, and the rest are ignored. Let’s do some exercises to test out everything we’ve learned so far.
What happens when you give a negative number as the maxsplit parameter?
.split() will split your string on all available separators, which is also the default behavior when maxsplit isn’t set.
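You can check this answer yourself. The sketch below reuses the earlier example string and compares the default call with a negative maxsplit, plus one positive value for contrast:

```python
s = "this is my string"

default_split = s.split()
negative_split = s.split(maxsplit=-1)  # A negative value means "no limit".
two_splits = s.split(maxsplit=2)       # At most two splits, so three elements.

print(default_split)   # ['this', 'is', 'my', 'string']
print(negative_split)  # ['this', 'is', 'my', 'string']
print(two_splits)      # ['this', 'is', 'my string']
```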
You were recently handed a comma-separated value (CSV) file that was horribly formatted. Your job is to extract each row into a list, with each element of that list representing one column of that file. What makes it badly formatted? The “address” field includes multiple commas but needs to be represented in the list as a single element!
Assume that your file has been loaded into memory as the following multiline string:
Name,Phone,Address
Mike Smith,15554218841,123 Nice St, Roy, NM, USA
Anita Hernandez,15557789941,425 Sunny St, New York, NY, USA
Guido van Rossum,315558730,Science Park 123, 1098 XG Amsterdam, NL
Your output should be a list of lists:
[
    ['Mike Smith', '15554218841', '123 Nice St, Roy, NM, USA'],
    ['Anita Hernandez', '15557789941', '425 Sunny St, New York, NY, USA'],
    ['Guido van Rossum', '315558730', 'Science Park 123, 1098 XG Amsterdam, NL']
]
Each inner list represents one row of the CSV that we’re interested in, while the outer list holds it all together.
Here’s my solution. There are a few ways to attack this. The important thing is that you used .split() with all its optional parameters and got the expected output:
input_string = """Name,Phone,Address
Mike Smith,15554218841,123 Nice St, Roy, NM, USA
Anita Hernandez,15557789941,425 Sunny St, New York, NY, USA
Guido van Rossum,315558730,Science Park 123, 1098 XG Amsterdam, NL"""

def string_split_ex(unsplit):
    results = []
    # Bonus points for using splitlines() here instead,
    # which will be more readable
    for line in unsplit.split('\n')[1:]:
        results.append(line.split(',', maxsplit=2))
    return results

print(string_split_ex(input_string))
We call .split() twice here. The first usage can look intimidating, but don’t worry! We’ll step through it, and you’ll get comfortable with expressions like these. Let’s take another look at the first .split() call: unsplit.split('\n')[1:].
The first element is unsplit, which is just the variable that points to your input string. Then we have our .split() call: .split('\n'). Here, we are splitting on a special character called the newline character.
What does \n do? As the name implies, it tells whatever is reading the string that every character after it should be shown on the next line. In a multiline string like our input_string, there is a hidden \n at the end of every line except the last.
The final part might be new: [1:]. The statement so far gives us a new list in memory, and [1:] looks like a list index notation, and it is, kind of! This extended index notation gives us a list slice. In this case, we take the element at index 1 and everything after it, discarding the element at index 0.
In all, we iterate through a list of strings, where each element represents one line in the multiline input string except for the very first line.
At each string, we call .split() again using , as the split character, but this time we are using maxsplit to only split on the first two commas, leaving the address intact. We then append the result of that call to the aptly named results list and return it to the caller.
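The comment in the solution mentions .splitlines() as a more readable option. Here is one possible sketch of that variant. The function name string_splitlines_ex is hypothetical, and the list comprehension is a stylistic choice rather than a requirement:

```python
input_string = """Name,Phone,Address
Mike Smith,15554218841,123 Nice St, Roy, NM, USA
Anita Hernandez,15557789941,425 Sunny St, New York, NY, USA
Guido van Rossum,315558730,Science Park 123, 1098 XG Amsterdam, NL"""


def string_splitlines_ex(unsplit):
    # .splitlines() breaks the text into lines; [1:] skips the header row.
    # maxsplit=2 splits on the first two commas only, keeping the address whole.
    return [line.split(",", maxsplit=2) for line in unsplit.splitlines()[1:]]


rows = string_splitlines_ex(input_string)
for row in rows:
    print(row)
```

Unlike .split('\n'), .splitlines() also copes with other line endings such as \r\n, which can matter when the text comes from a file written on another platform.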
Concatenating and Joining Strings
The other fundamental string operation is the opposite of splitting strings: string concatenation. If you haven’t seen this word, don’t worry. It’s just a fancy way of saying “gluing together.”
Concatenating With the + Operator
There are a few ways of doing this, depending on what you’re trying to achieve. The simplest and most common method is to use the plus symbol (+) to add multiple strings together. Simply place a + between as many strings as you want to join together:
>>> 'a' + 'b' + 'c'
'abc'
In keeping with the math theme, you can also multiply a string to repeat it:
>>> 'do' * 2
'dodo'
Remember, strings are immutable! If you concatenate or repeat a string stored in a variable, you will have to assign the new string to a variable in order to keep it:
>>> orig_string = 'Hello'
>>> orig_string + ', world'
'Hello, world'
>>> orig_string
'Hello'
>>> full_sentence = orig_string + ', world'
>>> full_sentence
'Hello, world'
If we didn’t have immutable strings, full_sentence would instead output 'Hello, world, world'.
Another note is that Python does not do implicit string conversion. If you try to concatenate a string with a non-string type, Python will raise a TypeError:
>>> 'Hello' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not int
This is because you can only concatenate strings with other strings, which may be new behavior for you if you’re coming from a language like JavaScript, which attempts to do implicit type conversion. The exact wording of the error message varies between Python versions.
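If you do want the number in the string, you have to convert it yourself. A minimal sketch of two common ways to do that (the variable names are illustrative):

```python
count = 2

# Convert explicitly with str() before concatenating.
with_str = "Hello" + str(count)

# Or let an f-string do the conversion as part of formatting.
with_fstring = f"Hello{count}"

print(with_str)      # Hello2
print(with_fstring)  # Hello2
```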
Going From a List to a String in Python With .join()
There is another, more powerful, way to join strings together. You can go from a list to a string in Python with the .join() method.
The common use case here is when you have an iterable, like a list, made up of strings, and you want to combine those strings into a single string. Like .split(), .join() is a string instance method. If all of your strings are in an iterable, which one do you call .join() on?
This is a bit of a trick question. Remember that when you use .split(), you pass in the string you want to split on as the separator. With the opposite operation, .join(), the separator plays the leading role: you call .join() on the string you want to use to join your iterable of strings together:
>>> strings = ['do', 're', 'mi']
>>> ','.join(strings)
'do,re,mi'
Here, we join each element of the strings list with a comma (,) and call .join() on the comma rather than on the strings list.
How could you make the output text more readable?
One thing you could do is add spacing:
>>> strings = ['do', 're', 'mi']
>>> ', '.join(strings)
'do, re, mi'
By doing nothing more than adding a space to our join string, we’ve vastly improved the readability of our output. This is something you should always keep in mind when joining strings for human readability.
.join() is smart in that it inserts your “joiner” in between the strings in the iterable you want to join, rather than just adding your joiner at the end of every string in the iterable. This means that if you pass an iterable of size 1, you won’t see your joiner:
>>> 'b'.join(['a'])
'a'
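A few more cases round out the picture. The sketch below uses made-up sample values. It shows that an empty iterable gives an empty string, that any iterable of strings works (not only lists), and that non-string items need converting first because .join() raises a TypeError for them:

```python
# No elements means nothing to join, so the result is an empty string.
empty_result = ", ".join([])

# Any iterable of strings is accepted, such as a tuple.
from_tuple = "-".join(("do", "re", "mi"))

# Items must be strings; convert other types before joining.
numbers = [1, 2, 3]
try:
    ", ".join(numbers)
    join_accepts_ints = True
except TypeError:
    join_accepts_ints = False

joined_numbers = ", ".join(str(n) for n in numbers)

print(repr(empty_result))  # ''
print(from_tuple)          # do-re-mi
print(join_accepts_ints)   # False
print(joined_numbers)      # 1, 2, 3
```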
Using our web scraping tutorial, you’ve built a great weather scraper. However, it loads string information in a list of lists, each holding a unique row of information you want to write out to a CSV file:
[
    ['Boston', 'MA', '76F', '65% Precip', '0.15 in'],
    ['San Francisco', 'CA', '62F', '20% Precip', '0.00 in'],
    ['Washington', 'DC', '82F', '80% Precip', '0.19 in'],
    ['Miami', 'FL', '79F', '50% Precip', '0.70 in']
]
Your output should be a single string that looks like this:
"""
Boston,MA,76F,65% Precip,0.15 in
San Francisco,CA,62F,20% Precip,0.00 in
Washington,DC,82F,80% Precip,0.19 in
Miami,FL,79F,50% Precip,0.70 in
"""
For this solution, I used a list comprehension, which is a powerful feature of Python that allows you to rapidly build lists. If you want to learn more about them, check out this great article that covers all the comprehensions available in Python.
Below is my solution, starting with a list of lists and ending with a single string:
input_list = [
    ['Boston', 'MA', '76F', '65% Precip', '0.15 in'],
    ['San Francisco', 'CA', '62F', '20% Precip', '0.00 in'],
    ['Washington', 'DC', '82F', '80% Precip', '0.19 in'],
    ['Miami', 'FL', '79F', '50% Precip', '0.70 in']
]
# We start with joining each inner list into a single string
joined = [','.join(row) for row in input_list]
# Now we transform the list of strings into a single string
output = '\n'.join(joined)
print(output)
Here we use .join() not once, but twice. First, we use it in the list comprehension, which does the work of combining all the strings in each inner list into a single string. Next, we join each of these strings with the newline character \n that we saw earlier. Finally, we simply print the result so we can verify that it is as we expected.
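Since .split() and .join() are opposites, you can run the result back through splitting to recover the original rows. This is a sketch of that round trip, not part of the exercise itself. It relies on the assumption that no field contains a comma or a newline, which holds for this sample data:

```python
input_list = [
    ['Boston', 'MA', '76F', '65% Precip', '0.15 in'],
    ['San Francisco', 'CA', '62F', '20% Precip', '0.00 in'],
    ['Washington', 'DC', '82F', '80% Precip', '0.19 in'],
    ['Miami', 'FL', '79F', '50% Precip', '0.70 in']
]

# Join: columns with commas, then rows with newlines.
output = '\n'.join(','.join(row) for row in input_list)

# Split: undo the joins in reverse order, rows first and then columns.
round_trip = [line.split(',') for line in output.splitlines()]

print(round_trip == input_list)  # True
```

If a field could contain a comma, as the address did in the earlier exercise, this simple round trip would no longer give back the original rows.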
Tying It All Together
While this concludes this overview of the most basic string operations in Python (splitting, concatenating, and joining), there is still a whole universe of string methods that can make your experiences with manipulating strings much easier.
Once you have mastered these basic string operations, you may want to learn more. Luckily, we have a number of great tutorials to help you complete your mastery of Python’s features that enable smart string manipulation:
- Python’s F-String for String Interpolation and Formatting
- Python String Formatting Best Practices
- Strings and Character Data in Python