Defining bytes Objects with bytes()

In the last lesson, you saw how you could create a bytes object using a string literal with the addition of a 'b' prefix. In this lesson, you’ll learn how to use bytes() to create a bytes object. You’ll explore three different forms of using bytes():

  1. bytes(<s>, <encoding>) creates a bytes object from a string.
  2. bytes(<size>) creates a bytes object consisting of null (0x00) bytes.
  3. bytes(<iterable>) creates a bytes object from an iterable.

Here’s how to use bytes(<s>, <encoding>):

>>> a = bytes('bacon and egg', 'utf8')
>>> a
b'bacon and egg'
>>> type(a)
<class 'bytes'>

>>> b = bytes('Hello ∑ €', 'utf8')
>>> b
b'Hello \xe2\x88\x91 \xe2\x82\xac'

>>> len(a)
13
>>> a
b'bacon and egg'
>>> b
b'Hello \xe2\x88\x91 \xe2\x82\xac'
>>> len(b)
13
>>> a[0]
98
>>> a[1]
97
>>> a[2]
99
>>> b[0]
72
>>> b[1]
101
>>> b[5]
32
>>> b[6]
226
>>> b[7]
136
>>> b[8]
145
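
The session above mixes two ideas: a bytes object counts encoded bytes rather than characters, and indexing it returns an integer. The following is a minimal sketch of both, reusing the second string from the lesson; the variable names are only illustrative.

```python
text = 'Hello ∑ €'
data = bytes(text, 'utf8')

# The str counts characters; the bytes object counts encoded bytes.
char_count = len(text)    # 9 characters
byte_count = len(data)    # 13 bytes: ∑ and € each take 3 bytes in UTF-8

# Indexing a bytes object returns an int in the range 0..255.
first = data[0]           # 72, the code for 'H'

# Slicing, by contrast, returns a new bytes object.
first_slice = data[0:1]   # b'H'

# Decoding with the same encoding recovers the original string.
round_trip = data.decode('utf8')

print(char_count, byte_count, first, first_slice, round_trip == text)
```

Note that calling bytes() on a string without an encoding argument raises a TypeError, so the encoding always has to be stated explicitly.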

Here’s how to use bytes(<size>):

>>> c = bytes(8)
>>> c
b'\x00\x00\x00\x00\x00\x00\x00\x00'
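
A few edge cases of bytes(<size>) are worth checking for yourself. This is a small sketch rather than an exhaustive tour; the flag names such as negative_ok are made up for the example.

```python
zeros = bytes(8)

# Every element is the integer 0, shown as \x00 in the repr.
all_zero = all(value == 0 for value in zeros)

# A size of 0 gives an empty bytes object.
empty = bytes(0)

# A negative size is rejected with a ValueError.
try:
    bytes(-1)
    negative_ok = True
except ValueError:
    negative_ok = False

# bytes objects are immutable, so item assignment raises a TypeError.
try:
    zeros[0] = 1
    mutable = True
except TypeError:
    mutable = False

print(len(zeros), all_zero, empty, negative_ok, mutable)
```

If you need a zero-filled buffer that you can modify afterwards, bytearray(8) is the mutable counterpart.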

Here’s how to use bytes(<iterable>):

>>> d = bytes([115, 112, 97, 109, 33])
>>> d
b'spam!'
>>> type(d)
<class 'bytes'>
>>> d[0]
115
>>> d[3]
109
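
The iterable does not have to be a list. As a rough sketch, assuming only that each item is an integer in range(256), the same object can be built from other iterables; the variable names here are illustrative.

```python
# Any iterable of integers in range(256) works, not only lists.
from_list = bytes([115, 112, 97, 109, 33])
from_range = bytes(range(65, 70))             # b'ABCDE'
from_gen = bytes(ord(ch) for ch in 'spam!')   # same bytes as from_list

# Values outside 0..255 do not fit in one byte and raise a ValueError.
try:
    bytes([256])
    out_of_range_ok = True
except ValueError:
    out_of_range_ok = False

# Iterating over a bytes object yields the integers back.
values = list(from_list)

print(from_list, from_range, from_gen, out_of_range_ok, values)
```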

theramstoss on June 4, 2020

Question for you: why does bytes('\x80', 'utf8') evaluate to b'\xc2\x80'?

Thank you!

Chris Bailey RP Team on June 4, 2020

Hi @theramstoss,

You are heading in a deeper direction when you start to look at encodings. UTF-8 is a variable-length encoding: a character takes between one and four bytes, depending on its code point. There is an article that covers this, and there will be a video course for it soon. They really do a good deep dive. Here is a code snippet from the article, showing characters just outside the ASCII range, in this case accented letters, being encoded in UTF-8 as 2 bytes each. The ASCII characters are still encoded as single bytes.

>>> "résumé".encode("utf-8")
b'r\xc3\xa9sum\xc3\xa9'
>>> "El Niño".encode("utf-8")
b'El Ni\xc3\xb1o'

>>> b"r\xc3\xa9sum\xc3\xa9".decode("utf-8")
'résumé'
>>> b"El Ni\xc3\xb1o".decode("utf-8")
'El Niño'

The value you picked, '\x80', is code point 128, which takes you just outside the ASCII range of 0-127. UTF-8 therefore encodes it as two bytes, b'\xc2\x80'.
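
To see where those two bytes come from, here is a minimal sketch of the two-byte UTF-8 layout, 110xxxxx 10xxxxxx, which carries 11 bits of the code point. The helper encode_two_byte is a hypothetical name written for this illustration, not part of Python, and it only handles code points 0x80 through 0x7FF.

```python
def encode_two_byte(code_point):
    """Hand-rolled UTF-8 encoding for code points 0x80 through 0x7FF."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError('only the two-byte range is handled in this sketch')
    # Lead byte: the marker bits 110 followed by the top 5 bits.
    lead = 0b11000000 | (code_point >> 6)
    # Continuation byte: the marker bits 10 followed by the low 6 bits.
    continuation = 0b10000000 | (code_point & 0b00111111)
    return bytes([lead, continuation])

manual = encode_two_byte(0x80)
builtin = bytes('\x80', 'utf8')
print(manual, builtin, manual == builtin)   # b'\xc2\x80' b'\xc2\x80' True

# é is U+00E9, which matches the \xc3\xa9 seen in the résumé example.
e_acute = encode_two_byte(ord('\u00e9'))
print(e_acute)                              # b'\xc3\xa9'
```

For 0x80, the top 5 bits are 00010 and the low 6 bits are 000000, which gives 0b11000010 (0xc2) and 0b10000000 (0x80).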

