Operations on bytes Objects
In this lesson, you’ll explore the common sequence operations that bytes
objects support. You’ll take a closer look at:
- The in and not in operators
- Concatenation (+) and replication (*) operators
- Indexing and slicing
- Built-in functions len(), min(), and max()
- Methods for bytes objects
- bytes.fromhex(<s>) and b.hex()
Here’s how to use the in and not in operators:
>>> a = b'abcde'
>>> a
b'abcde'
>>> b'cd' in a
True
>>> b'spam' in a
False
>>> b'spam' not in a
True
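Because a bytes object is a sequence of integers, the membership operators also accept a single integer byte value, not only a bytes subsequence. A minimal sketch of both forms (the name data is only for illustration):

```python
data = b'abcde'

# A bytes operand is matched as a contiguous subsequence.
has_cd = b'cd' in data      # b'c' and b'd' are adjacent in data
has_ce = b'ce' in data      # b'c' and b'e' are not adjacent

# An integer operand is matched against individual byte values.
has_c = 99 in data          # 99 is the byte value of 'c'
has_z = ord('z') in data    # 122 does not occur in data

print(has_cd, has_ce, has_c, has_z)
```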
Here’s how to use the concatenation (+) and replication (*) operators:
>>> a = b'abcde'
>>> a
b'abcde'
>>> b = b'fghij'
>>> b
b'fghij'
>>> a + b
b'abcdefghij'
>>> b * 3
b'fghijfghijfghij'
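Both operands of + have to be bytes-like. The sketch below, using the same two objects, shows what happens when a str is mixed in and one way around it, assuming the text is plain ASCII:

```python
a = b'abcde'
b = b'fghij'

joined = a + b       # concatenation builds a new bytes object
repeated = b * 3     # replication repeats b three times

# Mixing bytes and str raises TypeError rather than converting silently.
mixed_failed = False
try:
    a + 'fghij'
except TypeError as exc:
    mixed_failed = True
    print('TypeError:', exc)

# Encoding the text first makes the concatenation valid.
fixed = a + 'fghij'.encode('ascii')
print(joined, repeated, fixed)
```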
Here’s how to do indexing and slicing:
>>> a = b'abcde'
>>> a[2]
99
>>> a[1]
98
>>> a[2:4]
b'cd'
>>> a[1:5]
b'bcde'
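Indexing returns an int while slicing returns bytes, so a one-element slice is a handy way to get a single byte back as a bytes object. The sketch below also illustrates that a bytes object is immutable and only holds values from 0 to 255:

```python
a = b'abcde'

as_int = a[2]               # indexing gives the integer 99
as_bytes = a[2:3]           # a one-element slice gives b'c'
rebuilt = bytes([a[2]])     # wrapping the integer in a list also gives b'c'

# Item assignment is not allowed because bytes objects are immutable.
immutable = False
try:
    a[0] = 65
except TypeError:
    immutable = True

# Values outside the range 0 to 255 are rejected.
out_of_range = False
try:
    bytes([256])
except ValueError:
    out_of_range = True

print(as_int, as_bytes, rebuilt, immutable, out_of_range)
```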
Here’s how to use the built-in functions len(), min(), and max():
>>> a = b'abcde'
>>> len(a)
5
>>> max(a)
101
>>> chr(101)
'e'
>>> min(a)
97
>>> chr(97)
'a'
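Since min() and max() return integers, you may want to turn the result back into a character or a one-byte bytes object. A short sketch, which also shows the default argument for the empty case:

```python
a = b'abcde'

largest = max(a)                 # an int, here 101
as_text = chr(largest)           # the character 'e'
as_bytes = bytes([largest])      # the one-byte object b'e'

# max() and min() raise ValueError on an empty sequence unless
# a default is supplied.
empty_max = max(b'', default=None)

print(len(a), largest, as_text, as_bytes, empty_max)
```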
Here’s how to use the methods for bytes objects:
>>> a = b'spam,egg,spam,bacon,spam,lobster'
>>> a
b'spam,egg,spam,bacon,spam,lobster'
>>> a.count(b'spam')
3
>>> a.count('spam')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    a.count('spam')
TypeError: argument should be integer or bytes-like object, not 'str'
>>> a.endswith(b'ster')
True
>>> a.find(b'bacon')
14
>>> a
b'spam,egg,spam,bacon,spam,lobster'
>>> a.split(sep=b',')
[b'spam', b'egg', b'spam', b'bacon', b'spam', b'lobster']
>>> a.center(40, b'-')
b'----spam,egg,spam,bacon,spam,lobster----'
>>> a.center(40, ' ')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    a.center(40, ' ')
TypeError: center() argument 2 must be a byte string of length 1, not 'str'
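When the search term or fill character starts out as a str, one option is to encode it before passing it to the method. This sketch assumes the text is plain ASCII, and the names needle and words are only illustrative:

```python
a = b'spam,egg,spam,bacon,spam,lobster'

# Encode a str argument so that it becomes a bytes object.
needle = 'spam'
count = a.count(needle.encode('ascii'))

# .count() also accepts an integer byte value; 44 is the comma.
commas = a.count(44)

# Split on a bytes separator, then decode each piece back to str.
words = [part.decode('ascii') for part in a.split(b',')]

# The fill character must be a bytes object of length 1.
centered = a.center(40, b' ')

print(count, commas, words, centered)
```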
Here’s how to use bytes.fromhex(<s>) and b.hex():
>>> a = b'spam,egg,spam,bacon,spam,lobster'
>>> a[1]
112
>>> a[3]
109
>>> hex(a[0])
'0x73'
>>> a[0]
115
>>> list(a)
[115, 112, 97, 109, 44, 101, 103, 103, 44, 115, 112, 97, 109, 44, 98, 97, 99, 111, 110, 44, 115, 112, 97, 109, 44, 108, 111, 98, 115, 116, 101, 114]
>>> b = bytes.fromhex(' aa 68 32 af ')
>>> b
b'\xaah2\xaf'
>>> list(b)
[170, 104, 50, 175]
>>> b
b'\xaah2\xaf'
>>> b.hex()
'aa6832af'
>>> type(b.hex())
<class 'str'>
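The two methods are inverses of each other, and the mixed-looking repr comes from Python showing printable ASCII bytes as characters. A small sketch of the round trip (the separator argument to .hex() assumes Python 3.8 or later):

```python
raw = bytes.fromhex(' aa 68 32 af ')   # spaces between digit pairs are ignored

hex_text = raw.hex()                   # back to a str of hex digits
round_trip = bytes.fromhex(hex_text)   # and back to the same bytes

# Bytes in the printable ASCII range are displayed as characters in the
# repr: 0x68 is 104 is 'h', and 0x32 is 50 is '2'.
printable = {f'{byte:02x}': chr(byte) for byte in raw if 32 <= byte < 127}

# Python 3.8+ only: insert a separator between the bytes in the output.
spaced = raw.hex(' ')

print(raw, hex_text, printable, spaced)
```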
00:00
This video is about operations on bytes objects. bytes objects support the common sequence operations that you’ve used up to this point: the in and not in operators, and the concatenation and replication operators.
00:18
You can do indexing and slicing. And then the built-in Python functions len(), min(), and max() can be used on bytes objects also, along with type().
00:29
And many of the methods for string objects are valid for bytes objects also. And last, here are a couple of unique methods: .fromhex() and .hex().
00:41
For these examples, create a bytes object a.
00:46
Let’s use a few alphabetic characters. Here’s a. So can you say,
00:56
“Is this bytes object with b'cd' in it in a?” Yes, that’s True. Now check to see if b'spam' is in a. And similarly, not only can you ask whether something is in, but not in works also.
01:14
For concatenation and replication (here’s a), let’s create another object b. So, here’s b. If you were to take a and concatenate it to b, what do you get? A much larger bytes object. And similarly, if you were to take b and use replication, put a value of 3 in it, it will replicate and concatenate three times.
01:37
If you were to take something like a (and you’ve practiced this earlier), do slicing and indexing work? Yes. What’s a little different, though, is that indexing shows the byte value. In this case, the letter is 'c', but the index shows the ASCII value, 99. Index 1 for 'b' would be 98.
01:54
But what’s kind of interesting is if you were to create a slice, say from 2 to 4, it does show the letters and returns a byte string. And just for practice, make another slice that goes from 1 up to 5.
02:10
As far as built-in Python functions, if you look again at the bytes object that you’ve created, len() returns the number of items in a container. len(), if I put the letter a in there for our object, and there we go.
02:24
It’s 5 bytes long. And a couple of interesting ones: you could use something like max(), which will look through all the bytes there and return the largest one.
02:34
So, what’s the max(a)? Well, it’s returning 101. 101 is an integer value for that. In fact, if you needed to check, chr(101) is going to return the Unicode string for that character.
02:47
So 101 is the letter 'e', which would be the maximum out of that list of 'abcde'. And min(), if done to the same object, returns the lowest.
02:58
So here you’re reusing some of the things you learned earlier. Pretty neat! So, these built-in functions from Python (len(), min(), and max()) all work also. What about some of the methods that work on strings? Would they work on bytes objects also? Yes. Create a new object a, and in it,
03:21
I’m going to repeat the word b'spam' several times to practice something here. So, here’s a, and you can see if I press the . after that object, it now shows quite a few of the ones that we’ve already learned as far as methods that are available. Let’s try .count(). So, .count() looks for a substring.
03:39
Here’s something that’s interesting though: you need to use a bytes object. So if you’re looking for b'spam' inside of this existing one here, you can’t only place the text string of 'spam'. You need it to be a bytes object, and you can see that it occurs 3 times inside there. In fact, if I try this with only the text, it says, argument should be integer or bytes-like object, not a string. You can use .endswith(). In this case, we could say b'ster'. True, it does end with b'ster'.
04:17
Where does b'bacon' begin? It begins at index 14. Mm, bacon.
04:24
What about .split()? Again, a looks like this. So in this case, .split() requires its separating delimiter. In that case, it does need to be a bytes object. There we go!
04:38
It’s returned all of those in a list. Even something like .center(),
04:47
you could say 40, and again, it does need to be a bytes object for that fillchar also.
05:01
So again, remember that when these methods are applied to a bytes object, any operand or argument that would normally be a string in that position needs to be a bytes object.
05:13
So I can’t just simply put a space. Again, argument 2 must be a byte string, not a string. Although a bytes object definition and representation is based upon ASCII text, it actually behaves like an immutable sequence of small integers in the range of 0 to 255.
05:32
That is why any single element from the bytes object is displayed as an integer.
05:40
You can use the function hex() to check what the hexadecimal value would be for any individual character. In this case, using index 0, the hexadecimal value is '0x73' for that byte, versus the integer value, 115.
06:02
Also, you can convert bytes objects into a list of integers with the built-in list() function. So if you were to take list() and apply that to a, it’s going to go ahead and break this out into a list of all the integers that make up the bytes for the bytes object a. I briefly showed you hex(), which returns a hexadecimal representation for that integer.
06:27
There’s another one called .fromhex(). Create a new object b, and what you’re going to do is use this method .fromhex().
06:37
It’s going to create a bytes object from a string, in this case, of hexadecimal numbers. And spaces are allowed between there, so you could say ' aa 68 32 af '.
06:59
That created a bytes object, and you can see the hexadecimal values that have been entered into it here. In this case, aa, the letter h, the number 2, and then another hexadecimal byte value of af.
07:18
What do the integers look like for that? 170 (again, above 128), 104, 50, and 175. If you have hexadecimal values, hexadecimal digit pairs, they can be converted into a bytes object also.
07:35
So that same object b, if you were to look at the .hex() of it, which is another method, it’s going to output a string. It’s going to basically do the reverse. And there you can see it put back together. It did remove the spaces though.
07:48
Those are all the hexadecimal values that make up those four bytes. And again, you might’ve noticed that as it said this method, it’s going to return a string. Yep.
07:59
The class is str; that is what’s going to be returned from .hex(). Next up, bytearray objects.
Veda on Sept. 13, 2020
How does the console on VS Code look so colorful? Which plugin are you using?
Ricky White RP Team on Sept. 13, 2020
I believe Chris uses bpython as his Python REPL. That is what you are seeing inside his VS Code terminal.
Veda on Sept. 13, 2020
Thanks for your reply. I tried installing it on Windows, but I’m getting a dependency error when invoking it (module fcntl not found). I guess I have to stick with the usual default REPL.
Veda on Sept. 13, 2020
Got it fixed. On Windows we need to use python -m bpython.cli instead of python to get to this CLI.
Chris Bailey RP Team on Sept. 14, 2020
Hi @Veda, I’m glad you found a solution! bpython on Windows is a bit of work sometimes. I have found that it is often due to the support of some of the underlying packages (curses, etc.).
In the future, if you would like to look at some possible alternatives with similar features: ptpython and ipython.
Alain Rouleau on July 29, 2020
That bytes.fromhex(' aa 68 32 af ') was quite the head-scratcher! And as to why the output has both an 'h' and a '2' in b'\xaah2\xaf'? You know, there’s no 'h' in hexadecimal and why '2'?
But what appears to be happening is that hexadecimal 68 equals decimal 104, which in turn is ASCII for 'h'. Plus hexadecimal 32 is decimal 50, which in turn is ASCII for '2'. All pretty crazy!