The bytes data type is an immutable sequence of unsigned bytes used for handling binary data in Python. You can create a bytes object using the literal syntax, the bytes() function, or the bytes.fromhex() method. Since bytes are closely related to strings, you often convert between the two data types, applying the correct character encoding.
By the end of this tutorial, you’ll understand that:
- Python bytes objects are immutable sequences of unsigned bytes used for handling binary data.
- The difference between bytes and bytearray is that bytes objects are read-only, while bytearray objects are mutable.
- You convert a Python string to bytes using the str.encode() method, the bytes() function, or the codecs module.
- Endianness refers to the byte order used to represent binary data in memory, which can be either little-endian or big-endian.
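As a quick preview of these points, here's a minimal sketch rather than a complete treatment. The sample values (the string "café" and the integer 258) and all variable names are arbitrary choices made for illustration:

```python
import codecs

# bytes objects are read-only, so item assignment raises a TypeError.
data = bytes([1, 2, 3])
try:
    data[0] = 255
    assignment_failed = False
except TypeError:
    assignment_failed = True

# A bytearray holds the same kind of values but can be changed in place.
buffer = bytearray(data)
buffer[0] = 255

# Three ways to turn a string into bytes, each with an explicit encoding.
text = "café"
via_method = text.encode("utf-8")
via_function = bytes(text, "utf-8")
via_codecs = codecs.encode(text, "utf-8")

# The same integer laid out in big-endian and little-endian byte order.
number = 258  # 0x0102 in hexadecimal
big_endian = number.to_bytes(2, byteorder="big")
little_endian = number.to_bytes(2, byteorder="little")

print(assignment_failed)                          # True
print(buffer)                                     # bytearray(b'\xff\x02\x03')
print(via_method == via_function == via_codecs)   # True
print(big_endian, little_endian)                  # b'\x01\x02' b'\x02\x01'
```

Notice that the two byte orders contain the same byte values, only in reverse sequence.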
This tutorial starts with a brief overview of binary data fundamentals, setting the scene for the remaining part, which delves into creating and manipulating bytes objects in Python. Along the way, it touches on related topics, such as bytearray, bytes-like objects, and the buffer protocol. To top it off, you’ll find several real-life examples and exercises at the end, which demonstrate the concepts discussed.
To get the most out of this tutorial, you should be familiar with Python basics, particularly built-in data types.
Brushing Up on Binary Fundamentals
If you’re new to binary data or need a quick refresher, then consider sticking around. This section will provide a brief overview of binary representations, emphasizing a Python programmer’s perspective. On the other hand, if you’re already comfortable with the basics, then feel free to dive right into creating bytes objects in Python.
Bits, Bytes, and Binary Data
Virtually every piece of information, from books and music to movies, can be stored as binary data in a computer’s memory. The word binary implies that the information is stored as a sequence of binary digits, or bits for short. Each bit can hold a value of either one or zero, which is particularly well-suited for storage in electronic devices since they often use distinct voltage levels to represent these binary states.
For example, the binary sequence below may represent the color of a pixel in an image:
1 1 0 0 0 0 1 0 0 0 1 1 0 0 1 1 0 0 1 0 0 1
To make the interpretation of such binary sequences more systematic, you often arrange the individual bits into uniform groups. The standard unit of information in modern computing consists of exactly eight bits, which is why it’s sometimes known as an octet, although most people call it a byte. A single 8-bit byte allows for 256 possible bit combinations (2^8).
With this in mind, you can break up the bit sequence above into these three bytes:
0 0 1 1 0 0 0 0
1 0 0 0 1 1 0 0
1 1 0 0 1 0 0 1
Notice that the leftmost byte has been padded with two leading zeros to ensure a consistent number of bits across all bytes. Together, they form a 24-bit color depth, letting you choose from more than 16 million (2^24) unique colors per pixel.
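The padding and grouping described above can be sketched in a few lines of Python. This is only an illustration that treats the bits as a string of characters, and the variable names are made up for the example:

```python
# The 22-bit sequence from the text, written without spaces.
bits = "1100001000110011001001"

# Round the length up to the next multiple of eight, then left-pad with zeros.
padded_length = -(-len(bits) // 8) * 8  # Ceiling division, scaled back up
padded = bits.zfill(padded_length)

# Slice the padded string into 8-bit groups, one per byte.
octets = [padded[i:i + 8] for i in range(0, len(padded), 8)]
print(octets)  # ['00110000', '10001100', '11001001']

# The number of distinct values that 8 and 24 bits can represent.
byte_values = 2 ** 8
color_values = 2 ** 24
print(byte_values)   # 256
print(color_values)  # 16777216
```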
In this case, each byte corresponds to one of three primary colors (red, green, and blue) within the RGB color model, effectively serving as coordinates in the RGB color space. Changing their proportions can be loosely compared to mixing paints to achieve a desired hue.
Note: Strictly speaking, the RGB color model is an additive one, meaning it combines specific wavelengths of light to synthesize complex colors. In contrast, paint mixing follows a subtractive model, where pigments absorb certain wavelengths of light from the visible spectrum.
To reveal the pixel’s primary colors as decimal numbers, you can open the Python REPL and define binary literals by prefixing the corresponding bit sequences with 0b:
>>> 0b00110000, 0b10001100, 0b11001001
(48, 140, 201)
Binary literals are an alternative way of defining integers in Python. Other types of numeric literals include hexadecimal and octal. For example, you can represent the integer 48 as 0x30 in hexadecimal or 0o60 in octal, allowing you to write the same number differently.
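To check that these notations really describe the same value, you could compare them directly. The snippet below is a small sketch with illustrative variable names. It also uses underscores as digit separators, which Python 3.6 and later accept in numeric literals:

```python
# The same integer written with four different literal notations.
decimal = 48
binary = 0b0011_0000  # Underscores only group digits for readability
hexadecimal = 0x30
octal = 0o60
all_equal = decimal == binary == hexadecimal == octal
print(all_equal)  # True

# With base=0, int() infers the base from the prefix of the string.
parsed = [int(literal, base=0) for literal in ("0b110000", "0x30", "0o60")]
print(parsed)  # [48, 48, 48]
```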
Having such flexibility comes in handy since it’s customary to express byte values using the hexadecimal numeral system. By rewriting each byte as a two-digit hex number, you can represent your pixel color much more compactly compared to the equivalent binary sequence:
>>> hex(48), hex(140), hex(201)
('0x30', '0x8c', '0xc9')
>>> int("308cc9", base=16)
3181769
>>> int("001100001000110011001001", base=2)
3181769
Calling the built-in hex() function on an integer returns the corresponding hexadecimal literal as a string. When you combine the resulting hex numbers, you’re able to describe a 24-bit color with just six digits (308cc9). Go ahead and open an online color picker to see what that encoded value looks like: