Kodeclik Blog
Python enumerate() : Learn enumerate() and iterators with Python
Python's enumerate() is a built-in function that attaches an index, or counter, to an iterable. An iterable is any object that supports iteration, that is, one that can hand back its items one at a time. Lists and tuples are examples of iterables. By default, the index that enumerate() adds starts at zero.
What is an “iterable” and what does it mean to enumerate over one? In this blogpost we will learn about iterables in Python and what a powerful concept they are.
For loops and the range() function
First, consider a simple for loop.
for i in range(6):
    print(i)
This program prints:
0
1
2
3
4
5
What if we want the loop to actually print from 1 to 6, instead of 0 to 5? We can do something like:
for i in range(6):
    print(i+1)
This program as expected gives:
1
2
3
4
5
6
Alternatively, the range function allows us to give a starting value (1 instead of the default zero):
for i in range(1,6):
    print(i)
This gives:
1
2
3
4
5
Oops! Note that it begins at 1 but stops just before 6. So if you wanted 6 values to be printed you should have written:
for i in range(1,7):
    print(i)
This gives:
1
2
3
4
5
6
So we have seen two irritating problems with plain vanilla for loops written in this fashion. First, if you use a single parameter with range(), you need to remember that the index begins at 0, not at 1. Second, if you use two parameters with range(), the starting index can begin at 1, but you need to remember that the second parameter must be one more than the last value you want.
These are painful things to remember and are often the source of programming errors. The right way to do iteration in Python is to NOT use indices directly like this and instead think of what you are iterating over. For instance, assume you have a string and want to print each character of the string on a separate line. We will first write it in the painful range() function style to underscore better programming practice:
mystring = "kodeclik"
for i in range(8):
    print(mystring[i])
Why is this painful? You needed to know that the length of the string “kodeclik” is 8 for this to work. You could have instead done (attempt 2):
mystring = "kodeclik"
for i in range(len(mystring)):
    print(mystring[i])
This is slightly better, but the index variable “i” in the print statement still makes you stop and check that range() produces values from 0 to one less than the length, which is what the subscripting operator expects.
Iterables in Python
The recommended way in Python to do this type of iteration is really:
mystring = "kodeclik"
for i in mystring:
    print(i)
Wow - see how simple this really is? There are no subscripts, no range operators, no worries about “off by one” errors. How does this work? The variable “mystring” is a string and a string is an “iterable object” or just “iterable” for short. An iterable object is an object that is capable of returning its elements one-by-one in a sequential fashion. With an iterable object you can focus on the program logic instead of mundane details as we had to do with the range() function.
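To make "returning its elements one-by-one" concrete, here is a minimal sketch that steps through the string by hand using the built-in iter() and next() functions, which is roughly what a for loop does behind the scenes. The variable names (letters, first, second, rest) are only for illustration.

```python
mystring = "kodeclik"

# iter() asks the iterable for an iterator: an object that remembers
# how far through the sequence we have got.
letters = iter(mystring)

# next() fetches one element at a time.
first = next(letters)
second = next(letters)
print(first, second)

# Whatever is left can still be consumed by a loop.
rest = [ch for ch in letters]
print(rest)
```

You rarely need to call iter() and next() yourself; the point is only that the for loop is doing this stepping for you.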
So if a string is an iterable object what other objects are iterable? It is really simple to check. Let's try to replace mystring with some other object and see what happens. Let us try an integer and see if we can iterate over it.
mynumber = 1000
for i in mynumber:
    print(i)
Unfortunately, this raises an error:
TypeError: 'int' object is not iterable
So numbers like integers and floats are not iterable. But strings, lists, and dictionaries are iterable.
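If you would rather check without triggering an error, one possible approach is to wrap the built-in iter() call in a try/except. The helper name is_iterable below is made up for this sketch; it is not a Python built-in.

```python
def is_iterable(obj):
    """Return True if obj can be looped over (hypothetical helper)."""
    try:
        iter(obj)          # raises TypeError for non-iterables
    except TypeError:
        return False
    return True

# Try a few different kinds of objects.
for candidate in ["kodeclik", [1, 2, 3], {"a": 1}, (4, 5), 1000, 3.14]:
    print(repr(candidate), is_iterable(candidate))
```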
Let's make our kodeclik example more complicated. Let us suppose, in addition to printing each character on a separate line, we also want to count from 0. We can do:
mystring = "kodeclik"
position = 0
for i in mystring:
    print(position, i)
    position += 1
This yields, as expected, the following output:
0 k
1 o
2 d
3 e
4 c
5 l
6 i
7 k
So far so good. But the manual bookkeeping is back! The logic for iterating over the string is handled by the string iterable, but the logic for counting is handled by separate lines of code that you have to write and keep in step yourself. This is not good programming practice. Again, Python provides an elegant way to achieve this purpose.
The enumerate() function in Python
Try out this code:
mystring = "kodeclik"
for (position, i) in enumerate(mystring):
    print(position, i)
This yields as above:
0 k
1 o
2 d
3 e
4 c
5 l
6 i
7 k
See how elegant this is? How does this work? enumerate() is a function that takes an iterable as an argument. In our case, the iterable is a string. On each pass through the loop it yields a pair of values: the first is the index (beginning at 0) and the second is the corresponding character of the string. As the for loop iterates through the iterable, enumerate() progressively yields the next index and the next character. The advantage of this style of programming is that you do not have to worry about incrementing the index as we did in the previous version.
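To peek at what enumerate() actually produces, you can collect its output into a list. The short sketch below shows that each item is an (index, value) tuple, which the for loop then unpacks into two names. The names pairs, position, and letter are chosen just for this illustration.

```python
mystring = "kodeclik"

# Collect everything enumerate() yields so we can look at it.
pairs = list(enumerate(mystring))
print(pairs[:3])   # the first three (index, character) tuples

# A for loop unpacks each tuple into two variables, like this:
position, letter = pairs[0]
print(position, letter)
```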
What if we wanted to start the counting at 1? Do we need to do:
mystring = "kodeclik"
for (position, i) in enumerate(mystring):
    print(position+1, i)
You can but this is not elegant. The enumerate() function allows a second argument:
mystring = "kodeclik"
for (position, i) in enumerate(mystring, start=1):
    print(position, i)
“start” is an optional keyword argument of enumerate(); here you are setting it to 1 so that counting begins at 1. You can even do:
mystring = "kodeclik"
for (position, i) in enumerate(mystring, start=300):
    print(position, i)
The above program, as expected, yields:
300 k
301 o
302 d
303 e
304 c
305 l
306 i
307 k
Using enumerate() with lists
Suppose we desire to print each word of the string “Kodeclik is a wonderful coding academy” on a separate line. If we did what we tried earlier:
mystring = "Kodeclik is a wonderful coding academy"
for i in mystring:
    print(i)
You can instantly see that it won’t work. It will print each letter on a separate line, not each word. What we need to do is to somehow tell Python to first split the string into words (i.e., yield a list of words) and iterate over the words instead of iterating over the letters. All you need to do is:
mystring = "Kodeclik is a wonderful coding academy"
for i in mystring.split():
    print(i)
which yields:
Kodeclik
is
a
wonderful
coding
academy
The split() method is called on a string, breaks it into individual (sub)strings wherever there is whitespace, and returns them as a list. And a list is an iterable in Python, so for loops and enumerate() work on lists.
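As a small sketch of the two ideas working together, the snippet below splits the sentence into a list and then numbers the words with enumerate(). The names words and numbered are illustrative only.

```python
mystring = "Kodeclik is a wonderful coding academy"

# split() with no arguments breaks the string on whitespace.
words = mystring.split()
print(words)

# A list is an iterable, so enumerate() works on it directly.
numbered = [f"{n}. {w}" for n, w in enumerate(words, start=1)]
for line in numbered:
    print(line)
```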
Creating a fill-in-the-blank puzzle
Suppose we want to do the same as before but print only every other word (the first, third, fifth, and so on) and put a blank in place of each skipped word. Here’s a simple way to do it:
mystring = "Kodeclik is a wonderful coding academy"
for (i,word) in enumerate(mystring.split()):
    if i % 2 == 0:
        print(word, end=' ')
    else:
        print("_____", end=' ')
print()
Note that enumerate returns us two values, as before. This time we are checking the index to see if it is even or odd and based on that result we either print or hide the word. We are also using end= in the print() function to print the sentence on the same line. The output looks like:
Kodeclik _____ a _____ coding _____
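The same puzzle can also be packaged as a reusable function. The version below is one possible sketch: the function name make_puzzle and its blank parameter are invented here, and it builds the sentence with join() instead of printing word by word, which also avoids the trailing space that end=' ' leaves behind.

```python
def make_puzzle(sentence, blank="_____"):
    """Keep words at even positions and replace the others with a blank."""
    words = sentence.split()
    shown = [w if i % 2 == 0 else blank for i, w in enumerate(words)]
    return " ".join(shown)

puzzle = make_puzzle("Kodeclik is a wonderful coding academy")
print(puzzle)
```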
Creating dressing combinations
Suppose you have a rich wardrobe and are deciding on your outfit for the weekend. You can do something like:
tops = ["green","blue","white"]
bottoms = ["red","black","brown"]
shoes = ["blue","black","gray"]
for t in tops:
    for b in bottoms:
        for s in shoes:
            print(t,b,s)
As you can see you are iterating through each combination and printing them in detail. In total you have 3x3x3 = 27 combinations.
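As an aside, Python's standard library offers itertools.product, which generates the same combinations as the three nested loops. The sketch below is an alternative, not a replacement for the loops above; the name outfits is made up for the example.

```python
from itertools import product

tops = ["green", "blue", "white"]
bottoms = ["red", "black", "brown"]
shoes = ["blue", "black", "gray"]

# product() yields one tuple per combination, in the same order
# as the nested for loops would visit them.
outfits = list(product(tops, bottoms, shoes))

print(len(outfits))   # number of combinations
print(outfits[0])     # first outfit
print(outfits[-1])    # last outfit
```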
Suppose we want to list each combination with a number. Here’s a first attempt:
tops = ["green","blue","white"]
bottoms = ["red","black","brown"]
shoes = ["blue","black","gray"]
for (i,t) in enumerate(tops,start=1):
    for (j,b) in enumerate(bottoms,start=1):
        for (k,s) in enumerate(shoes,start=1):
            print(i+j+k,t,b,s)
This yields:
3 green red blue
4 green red black
5 green red gray
4 green black blue
5 green black black
6 green black gray
5 green brown blue
…
Oops! What went wrong? Every index starts at 1 and we simply added them up, so different combinations end up with the same number. Instead, we need to multiply each index by the right offset: the number of combinations produced by the loops nested inside it. Here’s the solution (we will leave it to you to work out the details):
tops = ["green","blue","white"]
bottoms = ["red","black","brown"]
shoes = ["blue","black","gray"]
for (i,t) in enumerate(tops):
    for (j,b) in enumerate(bottoms):
        for (k,s) in enumerate(shoes):
            # each step of i skips a full block of bottoms x shoes; each step of j skips one row of shoes
            print((i*len(bottoms)*len(shoes)+j*len(shoes)+k+1),t,b,s)
The result is:
1 green red blue
2 green red black
3 green red gray
4 green black blue
5 green black black
6 green black gray
7 green brown blue
8 green brown black
9 green brown gray
….
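If working out the offsets by hand feels error-prone, one possible alternative is to let enumerate() do the counting over a single stream of combinations. The sketch below assumes itertools.product from the standard library to generate the combinations; the name numbered is illustrative.

```python
from itertools import product

tops = ["green", "blue", "white"]
bottoms = ["red", "black", "brown"]
shoes = ["blue", "black", "gray"]

# One flat sequence of outfits, numbered from 1 by enumerate().
numbered = list(enumerate(product(tops, bottoms, shoes), start=1))

# Show the first few numbered outfits.
for number, (t, b, s) in numbered[:4]:
    print(number, t, b, s)
```

Here there is only one counter, so there is no arithmetic to get wrong, and the code keeps working if the three lists have different lengths.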
Printing statistics about months
Here’s a simple program that prints statistics about months using the enumerate() function:
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
for (i,m) in enumerate(months, start=1):
    print(m,"is month",i)
The output is:
January is month 1
February is month 2
March is month 3
April is month 4
…
There’s a different way to do it using the Python zip function.
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
indices = range(1,13)
for (i,m) in zip(indices, months):
    print(m,"is month",i)
The output is:
January is month 1
February is month 2
March is month 3
April is month 4
May is month 5
June is month 6
July is month 7
August is month 8
September is month 9
October is month 10
November is month 11
December is month 12
What does zip do? zip() essentially lets you step through multiple iterables in lockstep. In other words, at each step of the loop, zip() takes the next item from each of the two iterables and hands them to you as a pair. Here the first iterable is a range of numbers (the number of each month) and the second is the list of months. Without zip, if we were to use nested for loops, we would have iterated over every combination of months and month indices.
Now you might not like the 13 in the above code as it looks awfully specific. Instead you could have written:
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
indices = range(1,len(months))
for (i,m) in zip(indices, months):
    print(m,"is month",i)
This yields:
January is month 1
February is month 2
March is month 3
April is month 4
May is month 5
June is month 6
July is month 7
August is month 8
September is month 9
October is month 10
November is month 11
Oops - what happened? Why did December not get printed? Remember that in the range() function, the second argument needs to be one more than the last value you need. As written, the range has only 11 elements while the list of months has 12, and zip() stops as soon as the shorter iterable runs out, so the 12th month is never reached.
Here is a solution that fixes this problem:
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
indices = range(1,len(months)+1)
for (i,m) in zip(indices, months):
    print(m,"is month",i)
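To see the truncation behaviour in isolation, here is a small sketch using a shortened list of three months (shortened purely to keep the output small). It compares a range that is one too short, the corrected range, and enumerate() with start=1, which sidesteps the range arithmetic altogether.

```python
months = ["January", "February", "March"]  # shortened list, for illustration only

# range(1, 3) has only two numbers, so zip() stops after two pairs.
too_short = list(zip(range(1, len(months)), months))
print(too_short)

# Adding 1 to the upper bound gives one number per month.
just_right = list(zip(range(1, len(months) + 1), months))
print(just_right)

# enumerate() produces the same pairs without any range() bookkeeping.
with_enumerate = list(enumerate(months, start=1))
print(just_right == with_enumerate)
```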
Here’s a more complicated example with zip involving 3 iterables:
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
days = [31,28,31,30,31,30,31,31,30,31,30,31]
for (d,n,m) in zip(days,range(1,13),months):
    print(m,"is month",n,"and has",d,"days.")
The output is:
January is month 1 and has 31 days.
February is month 2 and has 28 days.
March is month 3 and has 31 days.
April is month 4 and has 30 days.
May is month 5 and has 31 days.
June is month 6 and has 30 days.
July is month 7 and has 31 days.
August is month 8 and has 31 days.
September is month 9 and has 30 days.
October is month 10 and has 31 days.
November is month 11 and has 30 days.
December is month 12 and has 31 days.
Mini Project: Build your own book search engine
Let's build our own search engine using what we have learnt so far. We will take a book, parse it into words, and try to determine the locations of specific words. (So our search engine will work only for this one book.) Project Gutenberg is a library of free books. You can choose any book from it, but for demonstration purposes we are going to use Peter Pan by J. M. Barrie.
Here’s a simple Python program to enumerate all the words and their locations:
# "with" closes the file automatically once we have read it
with open("peterpan.txt", "r") as book:
    content = book.read()
for (i,word) in enumerate(content.split()):
    print(i, word)
This yields (only an excerpt in the middle shown):
…..
17589 Peter
17590 did
17591 not
17592 hear
17593 him.
….
Note that we are not worrying about punctuation, white space, and in general cleaning up the text before indexing it. (You will need to do this to create a robust search engine).
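As a rough idea of what such cleaning could look like, the sketch below lowercases each word and strips punctuation from its ends using string.punctuation from the standard library. The helper name clean and the sample sentence are made up for illustration; note also that string.punctuation covers only ASCII punctuation, so curly quotes in a real book file would need extra handling.

```python
import string

def clean(word):
    """Lowercase a word and strip ASCII punctuation from both ends."""
    return word.strip(string.punctuation).lower()

# A made-up sample sentence standing in for the book's text.
sample = 'Peter did not hear him. "Peter!" she cried.'
cleaned = [clean(w) for w in sample.split()]
print(cleaned)
```

If you clean the words this way, remember to clean the query too (for example, search for "peter" rather than "Peter").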
Let us now search for “Peter” in “Peter Pan” (duh!).
with open("peterpan.txt", "r") as book:
    content = book.read()
query = "Peter"
for (i,word) in enumerate(content.split()):
    if word == query:
        print(i, word)
This gives:
5 Peter
94 Peter
96 Peter
136 Peter
1607 Peter
1614 Peter
2337 Peter
2360 Peter
2632 Peter
….
The above output means Peter occurs in index position 5, then in position 94, then in position 96, and so on. If we desire to make things more interesting and create a more contextual search engine, we can first create a dictionary which serves as an index of the positions and words and then when we find the query “Peter” in it, we can print not just the location where we found it but also the words before and after it.
First, let us create a dictionary.
with open("peterpan.txt", "r") as book:
    content = book.read()
myindex = {}
for (i,word) in enumerate(content.split()):
    myindex[i] = word
Note that we are initializing the dictionary myindex to be empty and then populating it as we read through the file. Then we can do:
query = "Peter"
for i in myindex:
    if myindex[i] == query:
        # .get() avoids a KeyError if the match is the very first or very last word
        print(i-1, myindex.get(i-1, ""), query, myindex.get(i+1, ""))
Note that we are printing the words before and after the query and the location where we find the first word (the word before the query). This yields:
….
7519 but Peter did
7681 asked Peter how
8162 and Peter made
….
25658 sound Peter heard
25845 reach Peter on
25881 time Peter recognised
…..
Isn’t this fun? We will leave it to you to make your search engine more interesting. For instance, print more words before and after, and also allow the user to search for multiple keywords rather than a single keyword.
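As a starting point for those extensions, here is one possible sketch. It works on a plain list of words (such as content.split()), accepts several keywords, and shows a configurable number of words on either side of each match. The function name search, its window parameter, and the sample sentence are all invented for this example.

```python
def search(words, queries, window=2):
    """Return (position, context) for every word that matches a query.

    words   -- list of words, e.g. content.split()
    queries -- collection of keywords to look for
    window  -- how many words of context to show on each side
    """
    wanted = set(queries)
    hits = []
    for i, word in enumerate(words):
        if word in wanted:
            start = max(0, i - window)   # do not run off the front of the list
            context = " ".join(words[start:i + window + 1])
            hits.append((i, context))
    return hits

# A made-up sentence standing in for the book's text.
sample = "All children except one grow up and Peter did not hear Wendy call".split()
for position, context in search(sample, ["Peter", "Wendy"], window=2):
    print(position, context)
```

Slicing a list never raises an error when the end index runs past the last word, so matches near the end of the book are handled without special cases.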
In this blogpost we have learnt about the enumerate() function (which lets you loop over entries while also giving access to their indices), iterables (objects that can be looped over and can therefore be passed to enumerate()), and useful programs you can build with them. For another perspective on the enumerate function, see our blogpost on how to count in a Python loop and the Python next() function, and some applications of loop counting in Python. Check out also our blogpost on control structures in Python. Also learn about reversing a range in Python. More details about the zip() function and similar capabilities can be seen in our blogpost on how to iterate through multiple lists simultaneously. Learn more about tuples and index errors you might encounter in this blogpost about Python Tuple IndexError. Learn more about useful features of Python in this blog post about random number generation. Also see our blog post on bubble sort using Python and learn the difference between arrays vs lists in Python.