String Slicing
In the previous lesson, you saw how you could access individual characters in a string using indexing. In this lesson, you’ll learn how to expand that syntax to extract substrings from a string. This technique is known as string slicing. You’ll practice with the standard syntax, and learn how omitting the first or last index extends the slice. You’ll also learn how to specify a stride in a string slice by using a third index.
Here are some string slicing examples:
>>> s = 'mybacon'
>>> s[2:5]
'bac'
>>> s[2:7]
'bacon'
>>> s[0:2]
'my'
You can omit the first or last index:
>>> s = 'mybacon'
>>> s[:2]
'my'
>>> s[:5]
'mybac'
>>> s[2:]
'bacon'
>>> s[2:len(s)]
'bacon'
>>> s[:2] + s[2:]
'mybacon'
>>> s[:]
'mybacon'
>>> t = s[:]
>>> t
'mybacon'
>>> id(s)
4380975712
>>> id(t)
4380975712
>>> s == t
True
>>> s is t
True
>>> s[2:2]
''
>>> s[4:2]
''
Here’s some negative index slicing:
>>> s = 'mybacon'
>>> s[-5:-1]
'baco'
>>> s[2:6]
'baco'
Here’s how to slice with a stride:
>>> s = 'mybacon'
>>> s[0:7:2]
'mbcn'
>>> s[1:7:2]
'yao'
>>> s = '12345' * 5
>>> s
'1234512345123451234512345'
>>> s[::5]
'11111'
>>> s[4::5]
'55555'
>>> s[::-5]
'55555'
>>> s[::-1]
'5432154321543215432154321'
>>> s = 'tacocat'
>>> s == s[::-1]
True
>>> s[::-1]
'tacocat'
00:00
Python also allows a form of indexing syntax that extracts substrings from a string. It’s known as string slicing. The syntax that you use looks really similar to indexing. Instead of just one value being put in the square brackets, you put two with a colon (:) in between the two. So in this example, s is the string and m and n are the two values.
00:21
This will return a portion of s starting with position m (basically, index m) and up to but not including position n.
00:32 Let me show you what that looks like.
00:35
So using the same string you set up earlier where s = 'mybacon', let me have you try out string slicing. If you were to select from index 2 up to but not including index 5, what would you get? 'bac'.
00:49
So it starts at the third character, index 2. Counting again: 0, 1, 2, 3, 4, but not including 5, just up to 5. So if you want to include just 'bacon', it’d be 2 to the end. In this case, 2 to 7. If you wanted just 'my', you’d start from 0 and go to 3. Oops. I bet you can see my mistake.
01:08
You’d go from 0 and go to 2. There you go. So this’ll be your first index, and then the second that you’re putting in there, it will select all the way up to but not include that.
01:21
So what happens if you omit the first and/or last index? If you omit the first index, in this case s[:n] with n being the end point, it’s going to start the slice at the very beginning of the string and go up to but not include the index value n. Omitting the last index, it’s going to extend the slice from the first index m all the way to the end of the string.
01:47
Omitting both indexes is just going to return the entire original string. Note, it’s not a copy. It’s a reference to the original string. Let’s look at these in practice. If you omit the first index, what will you get? Well, to recreate what we just did, again if you start at the beginning and go up to 2, that will return 'my' also.
02:09
If you went up to 5, you get all the way up to index 5, but not including it. If you want to go from index 2 all the way to the end, you could do that as s[2:], without anything as your second index after the colon.
02:26
That would give you just 'bacon'. Yay! The more cumbersome way to do that would be to put in the value of your starting index and then use the len() function as your end index.
02:41
But really, omitting that second index is much easier. One way that you could return the entire string would be to combine two slices, like in this case [:2] and [2:].
02:55
If these were concatenated in the right order, starting at the beginning and going up to 2, and then adding the slice that starts at 2 and goes to the end, you would get back the original string.
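A quick way to convince yourself of this is to try many split points at once. The following is only a sketch; the range of split points from -10 to 10 is an arbitrary choice and deliberately includes negative and out-of-range values.

```python
s = 'mybacon'

# Rebuild the string from its two halves for a range of split points,
# including negative values and values past the end of the string.
rebuilt = [s[:n] + s[n:] for n in range(-10, 11)]
all_match = all(piece == s for piece in rebuilt)

print(s[:2] + s[2:])  # mybacon
print(all_match)      # True
```

Out-of-range slice bounds are clamped rather than raising an error, which is why split points beyond the string length still work here.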
03:08
And that works with any value in there. If you had s[:4] and s[4:],
03:15 you’d get the same thing. Here’s another interesting thing. If you were to put no values for either index, only the colon in the middle, it will start from the beginning and go to the end, returning the entire string.
03:30
It’s not a copy. It’s a reference to the original string. How can you check that? So again, here’s s, make a new text object t. t is equal to s[:], again, nothing in front of the colon, nothing after the colon. So what’s t? Well, it’s the same thing as far as the string value, but in actuality, there’s another function that you can use here called id() that’s going to return the identity of the object.
03:53
If you use the function id() with s as the object, it returns an integer which is unique and constant throughout the lifetime of that object.
04:02
So in this case, what is t? It’s identical. So t isn’t a new object. It’s actually simply a reference to s, not a copy.
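Here is a small sketch of that check. The identity result for a full slice reflects how CPython behaves for plain str objects; it is an implementation detail rather than a language guarantee, so other interpreters may differ. The names u and v are added here for illustration.

```python
s = 'mybacon'
t = s[:]  # full slice: no start, no stop

print(s == t)  # True: same characters
print(s is t)  # True on CPython, which returns the original str object

# A partial slice has to build a new string each time it is evaluated.
u = s[2:5]
v = s[2:5]
print(u == v)  # True: equal values
print(u is v)  # usually False on CPython, since u and v are separate objects
```

Because of this, == is the right operator for comparing the contents of two slices, while is only tells you whether two names point at the same object.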
04:12
So, that’s kind of interesting. And there are a couple other operators you can use to test the relationship. Is s equal to t? Yes.
04:24
And s actually is t. What if you were to do a string slice that had the same value in both indexes? So you’re starting at index 2, and you’re capturing the values up to but not including 2. Well, that’s going to create an empty string.
04:42 It’s the same thing as if you were to start with a high value for your first index and a lower value for your second index. It’s just going to create an empty string. Negative indexing works as well.
04:56
If you were to start with -5 and go up to -1, you would include all of these characters out of the string. You can also do a combination of positive and negative indexes.
05:09
Using the same string object, s, if you were to start with index -5 and go up to index -1, what would you get? You’d get 'baco'. So again, it’s starting from here, counting back from the end (1, 2, 3, 4, 5), and it’s going up to here but not including it. So just these four.
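One way to read a negative index is as an offset from len(s). The sketch below uses a made-up helper, to_positive, to do that conversion; it ignores the clamping Python applies to out-of-range values.

```python
s = 'mybacon'

def to_positive(index, length):
    """Hypothetical helper: map a negative index to its non-negative equivalent."""
    return index + length if index < 0 else index

start = to_positive(-5, len(s))
stop = to_positive(-1, len(s))

print(start, stop)    # 2 6
print(s[-5:-1])       # baco
print(s[start:stop])  # baco
print(s[-5:6])        # baco, mixing a negative start with a positive stop
```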
05:35
It’s like making the same slice that we did earlier from index 2 to 6. When doing string slicing, you can also specify a stride in your string slice.
05:46
You add an additional colon (:), and a third index designates what the stride will be. It’s also called a step. So in this case, if our slice was from the beginning, 0, to the end of our string here, 7, followed by a step of 2, it would start with the very first item, which is index 0, and then skip ahead two to grab index 2, and then skip ahead two and grab index 4, and then go to the end here, index 6. For another example, let’s say you used a slice of [1:7:2].
06:22
Well, that would start at the second character, index 1 instead of 0, and again, take steps of two. So it would grab 'y', 'a', and 'o'.
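For positive strides with in-range bounds, the indices a slice visits match what range() produces for the same three numbers. The helper strided below is a hypothetical illustration of that correspondence, not how Python implements slicing.

```python
s = 'mybacon'

def strided(text, start, stop, step):
    """Hypothetical helper: pick characters at range(start, stop, step)."""
    return ''.join(text[i] for i in range(start, stop, step))

print(list(range(0, 7, 2)))  # [0, 2, 4, 6]
print(strided(s, 0, 7, 2))   # mbcn
print(s[0:7:2])              # mbcn

print(list(range(1, 7, 2)))  # [1, 3, 5]
print(strided(s, 1, 7, 2))   # yao
print(s[1:7:2])              # yao
```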
06:33
So, here you are with your string. Let’s say you want to start at the beginning and go up to the end, but you want to grab every other character, a stride of 2. So again, you’d get every other character. In a similar way, if you went from 1 to 7, you’d get the other ones.
06:53
And as with the other slicing, the first and second indices can be omitted, meaning that you would start at the beginning and go to the end. Say you had a string of five numbers, '12345', and you used the * operator we used earlier to repeat it 5 times, concatenating the copies together. So in this case, what happens if you were to put colon colon (::)? Again, the empty first index would mean go from the very beginning, and the empty second index would mean go to the very end. You want a stride of 5.
07:25
So you’re going to start at the very beginning and then skip ahead five each time, and it’s going to grab those 1s each time. Something similar happens if you were to say, “Oh, start at index 4, go to the end, use a stride of 5.”
07:41 You can also use a negative stride. So, what does that look like?
07:50 In this case, it’s going to start at the end and select every fifth character, moving backward.
07:57
What happens if you put a value of -1?
08:00
That’s going to reverse your string. A common test question is to see if something’s a palindrome. So let’s say you had a string of 'tacocat', and you wanted to test to see if s, your string, is the same as s[::-1]. Yeah, actually s[::-1] is the same backwards.
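Wrapped in a function, the check might look like the sketch below. The name is_palindrome is an assumption for illustration, and the comparison is exact, so it treats uppercase letters, spaces, and punctuation as significant.

```python
def is_palindrome(text):
    """Return True if text reads the same forward and backward."""
    return text == text[::-1]

print(is_palindrome('tacocat'))  # True
print(is_palindrome('mybacon'))  # False
```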
08:27 So you can test if something was a palindrome. Next up, I’ll show you interpolating variables into a string.
DoubleA on Jan. 24, 2021
In the code snippet above:
print(str[0:5] is str[-12:7])
the index “7” in the second slice was, of course, supposed to be “-7”.
Bartosz Zaczyński RP Team on Jan. 25, 2021
@DoubleA Because strings are immutable, slicing them makes unique copies:
>>> text = "Metropolitan"
>>> id(text[0:5])
140398448746992
>>> id(text[0:5])
140398448776688
DoubleA on Jan. 25, 2021
Hi Bartosz, thanks for the prompt response. Unfortunately, on my machine I can’t reproduce the behaviour you showed above:
>>> print(id(str[0:5]), id(str[-12:-7]), id(str[0:5]))
2056998029168 2056998029168 2056998029168
>>>
Tried both in VS code and in cmd on my Windows machine.
Cheers.
Bartosz Zaczyński RP Team on Jan. 25, 2021
@DoubleA It’s an optimization quirk of the CPython interpreter. When you put the same value multiple times on a single line, then the interpreter will cache it:
>>> text = "Metropolitan"
>>> print(id(text[0:5]), id(text[-12:-7]), id(text[0:5]))
139887855158960 139887855158960 139887855158960
However, there’s no such optimization when the expressions are placed on separate lines:
>>> text = "Metropolitan"
>>> id(text[0:5])
139887854719024
>>> id(text[-12:-7])
139887855187248
>>> id(text[0:5])
139887855515440
DoubleA on Jan. 24, 2021
Hello Chris! Thanks for sharing your knowledge. Here’s my code snippet:
When I run the above code I get True. However, when I modify the above code slightly and replace the equals == operator by the is object identity operator, the output I get is False:
My understanding is that the operator is returns True if and only if the id of the two objects is the same (the source). In the above example, the call of the id() function returns two identical integers. What is the reason for the above behaviour? Aren’t the two string slices above “the same object”?