Python enumerate(): Learn enumerate() and iterators with Python
What is an “iterator” and what does it mean to enumerate over one? In this blog post we will learn about iterators in Python and why they are such a powerful concept.
For loops and the range() function
First, consider a simple for loop.
This program prints:
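The listing and its output do not appear above. A minimal sketch, assuming the loop simply walks over range(6) (the variable name i is our assumption), could look like this, with the output shown in the comments:

```python
# range(6) produces the integers 0, 1, 2, 3, 4, 5.
for i in range(6):
    print(i)

# Output:
# 0
# 1
# 2
# 3
# 4
# 5
```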
What if we want the loop to actually print from 1 to 6, instead of 0 to 5? We can do something like:
This program as expected gives:
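One way to get this result, sketched under the assumption that the loop keeps range(6) and simply adds one before printing:

```python
# Keep range(6), but shift each value by one before printing it.
for i in range(6):
    print(i + 1)

# Output: the numbers 1 through 6, one per line.
```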
Alternatively, the range function allows us to give a starting value (1 instead of the default zero):
Oops! Note that it begins at 1 but stops just before 6. So if you wanted 6 values to be printed you should have written:
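Here are both versions side by side, as a small sketch (the loop variable i is an assumption):

```python
# The stop value 6 is excluded, so only five values are printed: 1 to 5.
for i in range(1, 6):
    print(i)

# Raising the stop value to 7 prints all six values: 1 to 6.
for i in range(1, 7):
    print(i)
```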
So we have seen two irritating problems with plain vanilla for loops written in this fashion. First, if you pass a single parameter to range(), you need to remember that counting begins at 0, not at 1. Second, if you pass two parameters to range(), counting can begin at 1, but you need to remember that the second (stop) parameter must be one more than the last value you want.
These are painful things to remember and are often the source of programming errors. The right way to do iteration in Python is to NOT use indices directly like this and instead think of what you are iterating over. For instance, assume you have a string and want to print each character of the string on a separate line. We will first write it in the painful range() function style to underscore better programming practice:
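A sketch of that painful style, assuming the string is stored in a variable called mystring and the length is typed in by hand:

```python
mystring = "kodeclik"

# Hard-coded length: this only works because "kodeclik" has 8 characters.
for i in range(8):
    print(mystring[i])
```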
Why is this painful? You needed to know that the length of the string “kodeclik” is 8 for this to work. You could have instead done (attempt 2):
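Attempt 2 might look like the following sketch, which assumes the only change is asking Python for the length with len():

```python
mystring = "kodeclik"

# len(mystring) replaces the hard-coded 8, so the loop adapts to any string.
for i in range(len(mystring)):
    print(mystring[i])
```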
This is slightly better, but using the index variable “i” in the print statement still makes you wonder whether range() produces exactly the values the subscript operator expects, namely 0 up to one less than the length of the string (it does).
Iterables in Python
The recommended way in Python to do this type of iteration is really:
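A minimal sketch of this style (the loop variable name letter is our choice):

```python
mystring = "kodeclik"

# Loop over the characters directly; no index or range() is needed.
for letter in mystring:
    print(letter)
```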
Wow - see how simple this really is? There are no subscripts, no range operators, no worries about “off by one” errors. How does this work? The variable “mystring” is a string and a string is an “iterable object” or just “iterable” for short. An iterable object is an object that is capable of returning its elements one-by-one in a sequential fashion. With an iterable object you can focus on the program logic instead of mundane details as we had to do with the range() function.
So if a string is an iterable object what other objects are iterable? It is really simple to check. Let's try to replace mystring with some other object and see what happens. Let us try an integer and see if we can iterate over it.
This unfortunately returns:
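The experiment can be sketched as follows. The value 2023 is just a hypothetical integer, and the try/except wrapper is our addition so that the program prints the error instead of stopping:

```python
mynumber = 2023  # hypothetical value; any integer behaves the same way

try:
    for digit in mynumber:
        print(digit)
except TypeError as error:
    # Python refuses to loop over an int, so we end up here.
    message = str(error)
    print("TypeError:", message)
```

Without the try/except, Python would stop with a TypeError saying that an 'int' object is not iterable.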
So numbers like integers and floats are not iterable. But strings, lists, and dictionaries are iterable.
Let's make our kodeclik example more complicated. Suppose that, in addition to printing each character on a separate line, we also want to print a running count that starts at 0. We can do:
This yields, as expected, the following output:
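The original listing is not shown. The sketch below assumes a counter that is incremented by hand inside the loop; the original may well have used range() for the counting instead.

```python
mystring = "kodeclik"

count = 0  # our own counter, kept separately from the string iteration
for letter in mystring:
    print(count, letter)
    count += 1

# Output:
# 0 k
# 1 o
# 2 d
# 3 e
# 4 c
# 5 l
# 6 i
# 7 k
```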
So far so good. But the range function is back! Plus the logic for iterating over the string is handled by the string iterable but the logic for counting over the numbers is handled by a separate line of code you have written. This is not good programming practice. Again, Python provides an elegant way to achieve this purpose.
The enumerate() function in Python
Try out this code:
This yields as above:
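A minimal sketch of the enumerate() version (the names index and letter are our choice):

```python
mystring = "kodeclik"

# enumerate() hands back the index and the character together.
for index, letter in enumerate(mystring):
    print(index, letter)

# Output runs from "0 k" to "7 k", one pair per line.
```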
See how elegant this is? How does this work? enumerate() is a function that takes an iterable as an argument. In our case, the iterable is a string. On each pass through the loop it yields a pair of values: the first is the index (beginning at 0) and the second is the corresponding element, here the character at that position. As the for loop iterates, enumerate() progressively yields the next index and the next element. The advantage of this style of programming is that you do not have to maintain the index yourself as we did in the previous version.
What if we wanted to start the counting at 1? Do we need to do:
You can, but this is not elegant. The enumerate() function accepts an optional second argument:
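Both approaches, sketched next to each other (the variable names are our assumptions):

```python
mystring = "kodeclik"

# Workable but clumsy: shift the index by hand.
for index, letter in enumerate(mystring):
    print(index + 1, letter)

# Cleaner: ask enumerate() to begin counting at 1.
for index, letter in enumerate(mystring, start=1):
    print(index, letter)
```

Both loops print the same lines, from "1 k" to "8 k".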
“start” is an optional keyword argument of enumerate(); setting it to 1 makes the count begin at 1 instead of 0. You can even do:
The above program, as expected, yields:
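The listing that followed “You can even do” is not shown here. One possibility, offered purely as an assumption, is starting the count from some other number such as 100:

```python
mystring = "kodeclik"

# start can be any integer; 100 is an arbitrary value chosen for illustration.
for index, letter in enumerate(mystring, start=100):
    print(index, letter)

# First line printed: 100 k
# Last line printed:  107 k
```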
Using enumerate() with lists
Suppose we desire to print each word of the string “Kodeclik is a wonderful coding academy” on a separate line. If we did what we tried earlier:
You can instantly see that it won’t work. It will print each letter on a separate line, not each word. What we need to do is tell Python to first split the string into words (i.e., produce a list of words) and then iterate over the words instead of the letters. All you need to do is:
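A minimal sketch, assuming the sentence is stored in mystring and that we also want the word positions from enumerate():

```python
mystring = "Kodeclik is a wonderful coding academy"

# split() with no arguments breaks the string at whitespace into a list of words.
for index, word in enumerate(mystring.split()):
    print(index, word)

# Output:
# 0 Kodeclik
# 1 is
# 2 a
# 3 wonderful
# 4 coding
# 5 academy
```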
split() is a string method. Called with no arguments, it breaks the string at whitespace and returns the pieces as a list of (sub)strings. A list is an iterable in Python, so for loops and enumerate() work on lists.
Creating a fill-in-the-blank puzzle
Suppose we want to do the same as before but print only every other word (the first, third, fifth, and so on) and put a blank in place of each word that is skipped. Here’s a simple way to do it:
Note that enumerate() gives us two values, as before. This time we check whether the index is even or odd and, based on that result, either print the word or hide it. We also use end= in the print() function so that the whole sentence is printed on the same line. The output looks like:
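The puzzle can be sketched as follows. The five-underscore blank and the space passed to end= are our assumptions about the formatting:

```python
mystring = "Kodeclik is a wonderful coding academy"

for index, word in enumerate(mystring.split()):
    if index % 2 == 0:
        # Even index (1st, 3rd, 5th word): show the word.
        print(word, end=" ")
    else:
        # Odd index (2nd, 4th, 6th word): show a blank instead.
        print("_____", end=" ")
print()  # finish the line

# Output:
# Kodeclik _____ a _____ coding _____
```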
Creating dressing combinations
Suppose you have a rich wardrobe and are deciding on your outfit for the weekend. You can do something like:
As you can see you are iterating through each combination and printing them in detail. In total you have 3x3x3 = 27 combinations.
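A sketch of such a program, using a hypothetical wardrobe of three shirts, three pants, and three pairs of shoes (all item names are made up for illustration):

```python
# Hypothetical wardrobe; the items are placeholders.
shirts = ["red shirt", "blue shirt", "white shirt"]
pants = ["jeans", "khakis", "shorts"]
shoes = ["sneakers", "sandals", "boots"]

# Three nested loops visit every shirt/pants/shoes combination.
for shirt in shirts:
    for pant in pants:
        for shoe in shoes:
            print(shirt, "with", pant, "and", shoe)

total = len(shirts) * len(pants) * len(shoes)
print(total)  # 27
```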
Suppose we want to list each combination with a number. Here’s a first attempt:
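The first attempt is not shown. The sketch below assumes it labels each line with the index from the innermost enumerate() call, using the same hypothetical wardrobe:

```python
# Hypothetical wardrobe; the items are placeholders.
shirts = ["red shirt", "blue shirt", "white shirt"]
pants = ["jeans", "khakis", "shorts"]
shoes = ["sneakers", "sandals", "boots"]

labels = []
for i, shirt in enumerate(shirts, start=1):
    for j, pant in enumerate(pants, start=1):
        for k, shoe in enumerate(shoes, start=1):
            # k restarts at 1 for every new shirt/pants pair.
            print(k, shirt, "with", pant, "and", shoe)
            labels.append(k)
```

The labels run 1, 2, 3, 1, 2, 3 and so on rather than 1 to 27.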
Oops! What went wrong? Every index starts at 1 again each time its loop restarts, so no single index counts the combinations. We need to multiply each index by the right offset and add the results to find the right number. Here’s the solution (we will leave it to you to understand how this works):
The result is:
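One way to compute the running number is sketched below. It assumes zero-based enumerate() indices and the same hypothetical wardrobe; the original solution may have arranged the arithmetic differently.

```python
# Hypothetical wardrobe; the items are placeholders.
shirts = ["red shirt", "blue shirt", "white shirt"]
pants = ["jeans", "khakis", "shorts"]
shoes = ["sneakers", "sandals", "boots"]

for i, shirt in enumerate(shirts):
    for j, pant in enumerate(pants):
        for k, shoe in enumerate(shoes):
            # Each shirt spans 9 combinations and each pair of pants spans 3.
            number = i * len(pants) * len(shoes) + j * len(shoes) + k + 1
            print(number, shirt, "with", pant, "and", shoe)

# First line printed: 1 red shirt with jeans and sneakers
# Last line printed:  27 white shirt with shorts and boots
```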
Printing statistics about months
Here’s a simple program that prints statistics about months using the enumerate() function:
The output is:
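The program itself is not shown. A minimal sketch, assuming the “statistics” are simply each month's number alongside its name:

```python
months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# start=1 makes January month 1 rather than month 0.
for number, month in enumerate(months, start=1):
    print("Month", number, "is", month)

# First line printed: Month 1 is January
# Last line printed:  Month 12 is December
```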
There’s a different way to do it using the Python zip function.
The output is:
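The zip() version might look like the following sketch, which assumes the month numbers come from range(1, 13):

```python
months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# zip() pairs the first month with 1, the second with 2, and so on.
for month, number in zip(months, range(1, 13)):
    print("Month", number, "is", month)
```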
What does zip do? zip() essentially lets you keep a common index and iterate through multiple iterables simultaneously. In other words, at each step of the loop, we move forward by one position in both iterables. Here both iterables are lists. The first list is a list of months and the second is a list of numbers (the number of the month). Without zip, if we were to use nested for loops, we would have iterated over every combination of months and month-indices.
Now you might not like the 13 in the above code as it looks awfully specific. Instead you could have written:
Oops - what happened? Why did December not get printed? Remember that range() stops just before its second argument, so that argument needs to be one more than the last value you need. As currently written, the first list has 12 elements and the second has only 11, and zip() stops as soon as the shortest iterable runs out, so December is never reached. Here is a solution that fixes this problem:
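The problem and the fix can be compared in one sketch, assuming the stop value is derived from len(months):

```python
months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Buggy: range(1, 12) has only 11 values, so zip() stops after November.
short_pairs = list(zip(months, range(1, len(months))))

# Fixed: range(1, 13) has 12 values, one for every month.
full_pairs = list(zip(months, range(1, len(months) + 1)))

for month, number in full_pairs:
    print("Month", number, "is", month)
```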
Here’s a more complicated example with zip involving 3 iterables:
The output is:
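The three-iterable example is not shown. As a sketch, assume the third iterable holds the number of days in each month (only the first three months are used here, to keep it short):

```python
months = ["January", "February", "March"]
numbers = [1, 2, 3]
days = [31, 28, 31]  # February in a non-leap year

# zip() advances through all three lists in step.
for month, number, day_count in zip(months, numbers, days):
    print(month, "is month", number, "and has", day_count, "days")

# Output:
# January is month 1 and has 31 days
# February is month 2 and has 28 days
# March is month 3 and has 31 days
```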
Mini Project: Build your own book search engine
Let's build our own search engine using what we have learnt so far. We will take a book, parse it into words, and try to determine the locations of specific words. (So our search engine will work only for this one book.) Project Gutenberg is a library of free books. You can choose any book there, but for demonstration purposes we are going to use Peter Pan by J. M. Barrie.
Here’s a simple Python program to enumerate all the words and their locations:
This yields (only an excerpt in the middle shown):
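The program is not shown. The sketch below uses a short stand-in sentence so that it runs anywhere; the sentence is not a quotation from the book, and the filename in the comment is hypothetical.

```python
# Stand-in text, NOT taken from the book. With a downloaded copy you might use:
#     with open("peterpan.txt", encoding="utf-8") as book:  # hypothetical filename
#         text = book.read()
text = "Wendy saw Peter at the window and Peter smiled at Wendy"

# Print every word together with its position in the text.
for position, word in enumerate(text.split()):
    print(position, word)
```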
Note that we are not worrying about punctuation, white space, and in general cleaning up the text before indexing it. (You will need to do this to create a robust search engine).
Let us now search for “Peter” in “Peter Pan” (duh!).
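A sketch of the search, again on a stand-in sentence rather than the real book, so the positions it finds differ from the ones discussed in the text:

```python
# Stand-in text, NOT taken from the book.
text = "Wendy saw Peter at the window and Peter smiled at Wendy"
query = "Peter"

positions = []
for position, word in enumerate(text.split()):
    if word == query:
        # Remember and print each position where the query occurs.
        positions.append(position)
        print(position)

# Output:
# 2
# 7
```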
The above output means Peter occurs at index position 5, then at position 94, then at position 96, and so on. To make things more interesting, we can build a more contextual search engine. We first create a dictionary that serves as an index of positions and words. Then, when we find the query “Peter” in it, we print not just the location where we found it but also the words before and after it.
First, let us create a dictionary.
Note that we are initializing the dictionary myindex to be empty and then populating it as we read through the file. Then we can do:
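Building the dictionary might look like the following sketch. It assumes that myindex maps each position to the word found there and uses a stand-in sentence in place of the book:

```python
# Stand-in text, NOT taken from the book.
text = "Wendy saw Peter at the window and Peter smiled at Wendy"

myindex = {}  # start empty, then fill in position -> word
for position, word in enumerate(text.split()):
    myindex[position] = word

print(myindex[2])  # Peter
```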
Note that we are printing the words before and after the query and the location where we find the first word (the word before the query). This yields:
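A sketch of the contextual search on the same stand-in sentence. Skipping matches at the very start or end of the text, which lack a neighbour on one side, is our own design choice:

```python
# Stand-in text, NOT taken from the book.
text = "Wendy saw Peter at the window and Peter smiled at Wendy"
query = "Peter"

# Index of position -> word, as described above.
myindex = {}
for position, word in enumerate(text.split()):
    myindex[position] = word

hits = []
for position, word in myindex.items():
    # Only report matches that have a word on both sides.
    if word == query and 0 < position < len(myindex) - 1:
        before = myindex[position - 1]
        after = myindex[position + 1]
        hits.append((position - 1, before, word, after))
        print(position - 1, before, word, after)

# Output:
# 1 saw Peter at
# 6 and Peter smiled
```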
Isn’t this fun? We will leave it to you to make your search engine more interesting. For instance, print more words before and after, and also allow the user to search for multiple keywords rather than a single keyword.
Kodeclik is an online coding academy for kids and teens to learn real world programming. Kids are introduced to coding in a fun and exciting way and are challenged to higher levels with engaging, high quality content.