Kodeclik Blog

What is a string?

A string is a data type used in programs to denote a sequence of characters. Strings can be used to represent names, addresses, documents, emails, and messages.
Strings are available in practically every programming language. We will use Python to illustrate strings but similar notation is available in other languages as well.
Strings can be used to denote names. For instance, “Mickey Mouse” is a string which we can store in a variable called, say, “name”. Here is some Python code for this purpose:
The first line declares the variable called “name” and stores “Mickey Mouse” in it. When we print this string variable, we obtain the output:
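A minimal sketch of the two lines described above (the variable name and its value come from the text; the exact layout of the listing is an assumption):

```python
name = "Mickey Mouse"  # store the string in a variable called name
print(name)            # prints: Mickey Mouse
```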

Strings are case sensitive

Because a string is a sequence of characters, the specific sequence of characters uniquely defines the string, including whether each letter is uppercase or lowercase. Thus, for instance, the following two variables do not denote the same string.
We can confirm that these are not the same as far as Python is concerned with the following piece of code:
The output is:
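As a sketch, assume the two variables hold the same name with different capitalization (the variable names name1 and name2 and their values are illustrative assumptions):

```python
name1 = "Mickey Mouse"
name2 = "mickey mouse"  # same letters, different capitalization

# == compares the two strings character by character
print(name1 == name2)  # prints: False
```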
We can expand the above program to create a replica of the first string, like so:
The output is, as expected:
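One way such an expanded program might look, again with illustrative variable names (name3 is a hypothetical name for the replica):

```python
name1 = "Mickey Mouse"
name2 = "mickey mouse"  # differs from name1 only in capitalization
name3 = "Mickey Mouse"  # exact replica of name1

print(name1 == name2)  # prints: False
print(name1 == name3)  # prints: True
```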

Strings are taken literally

A string is just a collection of characters - nothing more, nothing less. For instance, the following is a string:
The output is:
Note that “3+4” is a string because it is enclosed in quotes. Strings are not evaluated in any manner, so “3+4” when printed gives, literally, “3+4”.
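A short sketch contrasting the quoted and unquoted forms (the variable name expression is an illustrative assumption):

```python
expression = "3+4"      # the quotes make this a string of three characters
print(expression)       # prints: 3+4
print(len(expression))  # prints: 3

print(3+4)              # without quotes, Python evaluates the sum and prints: 7
```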

Strings should not be confused with integers

Consider the following program:
The program creates a variable called “x” and assigns it an integer value, namely the number 5. Then the program creates a variable called “y” and assigns it a string value, namely the string “5”. Of course, these two variables are not the same. As a result the rest of the program prints:
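The following is only a sketch of one way to see the difference between the two variables; the checks with type() and == are assumptions about what such a program might do:

```python
x = 5    # an integer
y = "5"  # a string containing the single character 5

print(type(x))  # prints: <class 'int'>
print(type(y))  # prints: <class 'str'>
print(x == y)   # prints: False

print(x + x)    # integer addition, prints: 10
print(y + y)    # string concatenation, prints: 55
```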

When should I use strings?

You should use strings to store and represent anything that should be interpreted literally without any calculation or computation. For instance, names, addresses, paragraphs of text, books, the content of a webpage, and messages (like emails and tweets) can all be stored as strings. Note that a complete document comprising multiple paragraphs can be stored as a single string using special characters such as newlines (denoted as “\n” in many languages). For instance, below is a paragraph with embedded newlines:
The output of this program will be:
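A small sketch of a single string with embedded newlines (the text of the message is made up for illustration):

```python
# One string holding three lines of text, separated by \n
message = "Dear Mickey,\nThank you for the cheese.\nYours, Minnie"

print(message)
# Dear Mickey,
# Thank you for the cheese.
# Yours, Minnie

# The whole message is still a single string containing two newline characters
print(message.count("\n"))  # prints: 2
```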

How are strings stored in memory?

In most programming languages, strings are stored in memory as a sequence of bytes. An encoding scheme such as ASCII or a Unicode encoding (for example, UTF-8) determines how characters map to those bytes: in ASCII each character occupies one byte, while in UTF-8 a character can occupy between one and four bytes. For instance, a string such as “Kodeclik” is stored as a sequence of numbers, each number representing the ASCII code of one of the characters in the string. The ASCII code for “K” is 75, the ASCII code for “o” is 111, and so on. We can confirm this by using the ord() function in Python:
The output is:
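A minimal sketch of such a program, with the codes it prints shown as comments (the variable name word and the loop are assumptions; ord() is the built-in mentioned above):

```python
word = "Kodeclik"

# ord() returns the numeric code of a single character
for character in word:
    print(character, ord(character))
# K 75
# o 111
# d 100
# e 101
# c 99
# l 108
# i 105
# k 107
```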
Notice that nearby characters in the alphabet have nearby encodings. For instance, “i” is represented as 105 and “k” as 107, because “i” and “k” are separated by only one character (“j”, represented as 106).
If you peer even deeper into the computer's memory, strings are stored in consecutive bytes. To mark where a string ends, some languages (such as C) place a special “end of string” character after the last character, while others keep track of the string's length instead. This is an implementation detail that most programmers will not need to worry about.
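To make the byte-level view concrete, here is a small sketch using Python's str.encode(). The choice of UTF-8 and the sample accented character are assumptions made for illustration:

```python
word = "Kodeclik"
data = word.encode("utf-8")  # the bytes that represent the string in UTF-8

print(list(data))            # prints: [75, 111, 100, 101, 99, 108, 105, 107]
print(len(word), len(data))  # prints: 8 8 (one byte per character here)

# A character outside ASCII needs more than one byte in UTF-8
accented = "\u00e9"          # the letter e with an acute accent
print(len(accented), len(accented.encode("utf-8")))  # prints: 1 2
```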

What operations can be performed on strings?

Given a string, you can extract prefixes, suffixes, and specific substrings of the string. For instance, in Python the individual characters of a string can be accessed using integer indices (beginning at 0).
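A minimal sketch of indexing, assuming a variable called name that holds the string “Kodeclik” (the value is an assumption chosen to match the discussion):

```python
name = "Kodeclik"  # assumed value for illustration

print(name[0])  # first character, prints: K
print(name[2])  # third character, prints: d
```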
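For example, if the variable name holds a string beginning with “Kode”, printing name[0] and name[2] outputs “K” and “d”, because “K” is the first character (denoted by name[0]) and “d” is the third character (denoted by name[2]). You can extract substrings by using range operators, e.g.: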
This outputs:
The first line, namely print(name[0:4]), prints the characters at indices 0 through 3 (4 denotes the end point, one higher than the last index included), i.e., “Kode”. The second line, namely print(name[4:9]), prints the characters at indices 4 through 8 (again, 9 is one more than the last index included). If the string has fewer than 9 characters, the slice simply stops at the end of the string.
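The slices above can be tried in a short sketch, assuming name holds the string “Kodeclik” (an assumption; with this 8-character value, the second slice stops at the end of the string):

```python
name = "Kodeclik"  # assumed value for illustration

print(name[0:4])  # indices 0 through 3, prints: Kode
print(name[4:9])  # index 4 up to the end of the string, prints: clik

# Leaving out a bound gives a prefix or a suffix
print(name[:4])   # prefix, prints: Kode
print(name[-4:])  # suffix (last four characters), prints: clik
```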
There is more to learn in this space, but we hope we have whetted your appetite for strings and string variables. If you liked this blog post, check out our blog post on integers.

Copyright @ Kodeclik 2024. All rights reserved.