Kodeclik Blog

How to convert strings to bytes in Python

A string, such as “Kodeclik Online Academy”, is a sequence of characters in Python. To store or transmit it, the characters are encoded as a sequence of bytes, and for plain English text the mapping is easy to follow: in an encoding such as ASCII, the first byte represents “K”, the second byte represents “o”, and so on.
There are two ways to convert a Python string to a sequence of bytes. The first is to call the encode() method on the string. Alternatively, we can pass the string to the built-in bytes() function. Both let you specify the encoding: encode() defaults to UTF-8 when none is given, whereas bytes() requires an encoding when its argument is a string. We will see how each of these approaches works.

Converting Python strings to bytes using the encode() method

A simple way to convert Python strings to bytes would work as follows:
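The listing below is a minimal sketch of that program, reconstructed from the description that follows it. The sample string and the variable name name_bytes are assumptions.

```python
# Sample string; assumed to match the example used at the top of this post.
name = "Kodeclik Online Academy"

# encode() returns a bytes object; "ascii" names the encoding to use.
name_bytes = name.encode("ascii")

print(name_bytes)
```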
Here we have created a string (called “name”), then called its encode() method, passing “ascii” to indicate the encoding to be used, to obtain a bytes object. Finally we print the result. For the string “Kodeclik Online Academy”, the output is b'Kodeclik Online Academy'.
Hmm. That is not so insightful. The “b” in front of the quotes indicates that what follows is a bytes object rather than a string. print() displays each byte that corresponds to a printable ASCII character as that character, which is why the output looks so much like the original string. To peek at the individual byte values, we update the program to:
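One way to do this, sketched under the same assumptions as before (sample string and variable names are illustrative), is to loop over the bytes object:

```python
name = "Kodeclik Online Academy"
name_bytes = name.encode("ascii")

# Iterating over a bytes object yields integers in the range 0-255,
# one per byte.
for byte in name_bytes:
    print(byte)

# list() collects the same integers into a single list.
byte_values = list(name_bytes)
```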
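Now the output is the integer value of each byte. For the string “Kodeclik Online Academy”, the values begin 75, 111, 100, 101 (for “K”, “o”, “d”, “e”).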
You can confirm that upper case letters have lower values than lower case letters in ASCII (the American Standard Code for Information Interchange): “A” to “Z” occupy 65 to 90, while “a” to “z” occupy 97 to 122, so the upper case letters come earlier in the encoding. You can also confirm that repeated letters are encoded with the same byte value (e.g., every “e” is represented by 101).
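Both observations can be checked with a short sketch. The sample string is again an assumption, and ord() is used here because, for ASCII characters, a character's code point equals its byte value.

```python
name = "Kodeclik Online Academy"
name_bytes = name.encode("ascii")

# Upper case "K" has a smaller value than lower case "k".
upper_k = ord("K")
lower_k = ord("k")
print(upper_k, lower_k, upper_k < lower_k)

# Collect the byte value of every "e" in the string; ASCII uses one byte
# per character, so characters and bytes line up one to one.
e_values = {byte for char, byte in zip(name, name_bytes) if char == "e"}
print(e_values)
```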
Another common encoding is “utf-8”, which is a superset of ascii: it encodes every ascii character using the same single byte and goes beyond it to represent the full range of Unicode characters, including accented letters, non-Latin scripts, and emoji. You can update the program:
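A minimal sketch of the updated program, assuming the same sample string as before. Only the encoding argument changes.

```python
name = "Kodeclik Online Academy"

# Same call as before, but with "utf-8" as the encoding.
utf8_bytes = name.encode("utf-8")

print(utf8_bytes)
for byte in utf8_bytes:
    print(byte)
```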
Running it, you can see that the output is exactly the same for this string, because every character in it is an ASCII character.
One of the key differences between ascii and utf-8 is that in ascii every character is represented using exactly one byte, whereas in utf-8 a character takes between one and four bytes. As a result, for ascii, fetching the third character is as simple as fetching the third byte. For utf-8 encoded data, the position of the third character depends on how many bytes the characters before it occupy. Python handles this for you when you work with strings: indexing a string always returns a character, never a partial byte sequence. For your purposes you can simply use the encode() method to inspect the byte representation.
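To see the difference in practice, here is a small sketch using a string with one non-ASCII character. The sample word “café” is a hypothetical example, not one from the program above.

```python
# "caf\u00e9" is the word "café"; the escape keeps this file pure ASCII.
text = "caf\u00e9"

utf8_bytes = text.encode("utf-8")
print(len(text))         # number of characters
print(len(utf8_bytes))   # number of bytes: the accented letter takes two
print(list(utf8_bytes))

# The ascii encoding has no byte for the accented letter, so encode() raises.
ascii_failed = False
try:
    text.encode("ascii")
except UnicodeEncodeError as error:
    ascii_failed = True
    print("ascii cannot encode:", error.object[error.start:error.end])
```

Here the string has four characters but its utf-8 form has five bytes, which is why the third byte and the third character no longer line up in general.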

Converting Python strings to bytes using the bytes() function

A second way to convert Python strings to bytes is to use the built-in bytes() function. Unlike encode(), which is a method called on the string, bytes() is a function that takes the string and the encoding as arguments:
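A minimal sketch, again assuming the sample string and the variable name name_bytes:

```python
name = "Kodeclik Online Academy"

# bytes() takes the string first and the encoding second.
name_bytes = bytes(name, "ascii")

print(name_bytes)
```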
The output is the same as before. For the string “Kodeclik Online Academy”, it is b'Kodeclik Online Academy'.
Once again, you can change ‘ascii’ to ‘utf-8’ and explore that form of encoding.
In summary, converting strings to bytes is very convenient in Python using either the encode method or the bytes function.


Copyright @ Kodeclik 2024. All rights reserved.