Parallel iteration means accessing elements of more than one iterable at the same time. It is useful when you want to combine two data structures or read from several iterables in a single for loop. In this topic, we will learn how to do both with the built-in zip() function.
The zip() function
zip() takes multiple iterables and zips them together into one iterator, just like a zipper that joins two rows of interlocking teeth.
The function can take any number of iterables and then returns an iterator of tuples. Each generated tuple contains one element from every iterable that was provided to zip(). Let's look at the following example of passing two lists:
numbers = [1, 2, 3]
words = ['one', 'two', 'three']
zip_iterator = zip(numbers, words)
print(list(zip_iterator))
# [(1, 'one'), (2, 'two'), (3, 'three')]
Take a look at how zip() takes the first element of numbers, the first element of words and zips them together into a tuple. It does the same thing for every other pair of elements. Remember, zip() returns an iterator, so we need to convert it to a list first to print the whole output.
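To see the iterator behaviour more directly, here is a small sketch that pulls tuples out one at a time with the built-in next(). The variable names pairs, first, second, and rest are only illustrative.

```python
numbers = [1, 2, 3]
words = ['one', 'two', 'three']

pairs = zip(numbers, words)  # no tuples have been produced yet

first = next(pairs)   # produces the first tuple
second = next(pairs)  # produces the second tuple
print(first, second)
# (1, 'one') (2, 'two')

rest = list(pairs)  # only the tuples not yet consumed remain
print(rest)
# [(3, 'three')]
```

Tuples are produced only when something asks for them, which is why printing the zip object itself does not show its contents.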
You can pass any iterable to zip(), not only lists. Consider the following snippet, which zips a string with a tuple:
my_string = 'AFK'
my_tuple = ('Away', 'From', 'Keyboard')
zip_iterator = zip(my_string, my_tuple) # Returns an iterator
print(list(zip_iterator))
# [('A', 'Away'), ('F', 'From'), ('K', 'Keyboard')]
Sets are unordered, so when you pass one to zip(), its elements are paired up in an arbitrary order that you should not rely on.
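If a predictable pairing matters, one option is to sort each set before zipping. A minimal sketch, with made-up set names letters and digits:

```python
letters = {'b', 'a', 'c'}
digits = {3, 1, 2}

# sorted() returns a list, so the pairing order is now fixed
pairs = list(zip(sorted(letters), sorted(digits)))
print(pairs)
# [('a', 1), ('b', 2), ('c', 3)]
```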
Iterables with different lengths
If you pass iterables of different lengths to zip(), the number of resulting tuples will always be equal to the length of the shortest iterable. Any remaining data in the longer iterables will be discarded:
shortest = {'this', 'is', 'a', 'set'}
longest = [True, True, True, True, False, False]
zip_iterator = zip(shortest, longest)
print(list(zip_iterator))
# [('a', True), ('set', True), ('this', True), ('is', True)]  (the order of the set elements may vary between runs)
In the above example, we call zip() with two iterables of different types: a set of four elements and a list of six. The first four True values are paired with the elements of the set, while the two False values are discarded because they have no partner. Also note that the set elements do not appear in the order they were written, since sets are unordered.
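If discarding the extra values is not what you want, the standard library offers itertools.zip_longest(), which pads the shorter iterables instead of truncating. The sketch below uses a list rather than a set so that the output is predictable; the fill value 'n/a' and the variable names are arbitrary choices.

```python
from itertools import zip_longest

short = ['this', 'is', 'a', 'list']
longest = [True, True, True, True, False, False]

# Missing elements of the shorter iterable are replaced by fillvalue
padded = list(zip_longest(short, longest, fillvalue='n/a'))
print(padded)
# [('this', True), ('is', True), ('a', True), ('list', True), ('n/a', False), ('n/a', False)]
```

In Python 3.10 and later, zip() also accepts strict=True, which raises a ValueError when the lengths differ instead of silently dropping data.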
You can also pass just one iterable to zip(). In this case, the resulting iterator will contain single-element tuples:
solo_string = 'Han Solo'
zip_iterator = zip(solo_string)
print(list(zip_iterator))
# [('H',), ('a',), ('n',), (' ',), ('S',), ('o',), ('l',), ('o',)]
Zip() in loops
One of the most common uses of zip() is to loop over multiple iterables simultaneously. This is called parallel iteration. We can achieve this by using a for loop over an iterator generated by zip():
planets = ['Earth', 'The Moon', 'Mars']
colors = ['blue', 'gray', 'red']
visited = [True, True, False]
for planet, color, visit in zip(planets, colors, visited):
    print(f'{planet} is {color}')
    print(f'Visited = {visit}')
# Earth is blue
# Visited = True
# The Moon is gray
# Visited = True
# Mars is red
# Visited = False
In the example above, we feed zip() three lists, and it returns an iterator of tuples. Each tuple has three elements, one from each list. In the for loop header, we define a variable for each element in the generated tuple and then use them inside our print() calls. The loop then goes over each tuple in the iterator one at a time and executes both print() calls.
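When exactly two iterables are zipped, each tuple is a pair, so the result can also be passed to dict() to combine two lists into one dictionary. A short sketch reusing two of the lists from above; planet_colors is an illustrative name:

```python
planets = ['Earth', 'The Moon', 'Mars']
colors = ['blue', 'gray', 'red']

# dict() accepts any iterable of (key, value) pairs
planet_colors = dict(zip(planets, colors))
print(planet_colors)
# {'Earth': 'blue', 'The Moon': 'gray', 'Mars': 'red'}
```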
Parallel iteration with dictionaries
Parallel iteration is a little trickier with dictionaries. Iterating over a dictionary yields only its keys, so by default zip() generates tuples of keys. To include both keys and values, you must call the .items() method on each dictionary:
hero = {'name': 'Peter', 'age': 13}
villain = {'name': 'Hook', 'age': 41}
zipped = zip(hero.items(), villain.items())
print(list(zipped))
# [(('name', 'Peter'), ('name', 'Hook')), (('age', 13), ('age', 41))]
zipped = zip(hero.items(), villain.items())
for (hero_key, hero_value), (villain_key, villain_value) in zipped:
    print(f"The hero's {hero_key} is {hero_value}")
    print(f"The villain's {villain_key} is {villain_value}")
# The hero's name is Peter
# The villain's name is Hook
# The hero's age is 13
# The villain's age is 41
In the above example, we use .items() to get a view of each dictionary's key-value pairs. Iterating over such a view yields (key, value) tuples. The two views are then zipped together to form an iterator where each generated tuple contains two nested tuples, one key-value pair from hero and one from villain. In the for loop header, we unpack the variables in the same nested pattern and loop through the iterator.
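For comparison, this is what happens without .items(). A small sketch using the same two dictionaries; keys_only and values_only are illustrative names:

```python
hero = {'name': 'Peter', 'age': 13}
villain = {'name': 'Hook', 'age': 41}

# Iterating over a dictionary yields its keys
keys_only = list(zip(hero, villain))
print(keys_only)
# [('name', 'name'), ('age', 'age')]

# .values() pairs up the values instead
values_only = list(zip(hero.values(), villain.values()))
print(values_only)
# [('Peter', 'Hook'), (13, 41)]
```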
In the example with .items(), we printed the zipped iterator as a list for clarity. However, this exhausts the iterator of all its tuples, so zipped has to be created again before it is used in the for loop.
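The exhaustion is easy to observe. In this sketch, the second list() call gets nothing back; first_pass and second_pass are illustrative names.

```python
hero = {'name': 'Peter', 'age': 13}
villain = {'name': 'Hook', 'age': 41}

zipped = zip(hero.items(), villain.items())
first_pass = list(zipped)   # consumes every tuple
second_pass = list(zipped)  # the iterator is already used up
print(len(first_pass), len(second_pass))
# 2 0
```

If the pairs are needed more than once, one option is to keep the list from the first pass and loop over that instead of the iterator.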
Unzipping
Unzipping a sequence is as simple as adding the unpacking operator (*) in front of the iterable passed to zip():
phrase = [('A', 'Away'), ('F', 'From'), ('K', 'Keyboard')]
unzipped = zip(*phrase)
print(list(unzipped))
# [('A', 'F', 'K'), ('Away', 'From', 'Keyboard')]
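The * operator unpacks phrase into three separate tuples, which are passed to zip() as three arguments. After that, zip() combines them in parallel, just like any other group of iterables: the first elements form one tuple and the second elements form another.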
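The unzipped tuples can also be assigned straight to separate variables. A brief sketch, where letters and words are illustrative names; zipping them again recovers the original pairs:

```python
phrase = [('A', 'Away'), ('F', 'From'), ('K', 'Keyboard')]

# zip(*phrase) yields exactly two tuples, so two names can receive them
letters, words = zip(*phrase)
print(letters)
# ('A', 'F', 'K')
print(words)
# ('Away', 'From', 'Keyboard')

# Zipping the two tuples again gives back the original pairs
print(list(zip(letters, words)) == phrase)
# True
```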
Conclusion
Parallel iteration is very useful when you work with different data structures. We have learned how to use zip() and combine iterables to iterate over them in parallel, as well as how to unzip these iterables further. Time to practice with a few examples!