Zipping Lists in Python


If you need to work with multiple lists at once, the zip builtin is the tool you are looking for.

The zip function is a Python builtin that aggregates its arguments into tuples: a list of tuples in Python 2, an iterator of tuples in Python 3.

What is the zip function and how does it work?

zip takes zero or more iterables and aggregates them into a single sequence of tuples. The first tuple contains the first element of each iterable, the second tuple contains the second element of each, and so on.

Typically it is used with two lists, like so:

for i in zip([1,2,3], ['a','b','c']):
    print(i)

output:

(1, 'a')
(2, 'b')
(3, 'c')
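
To make the mechanics concrete, here is a rough pure-Python sketch of what zip does in Python 3. The name simple_zip is made up for this illustration, and the real builtin is implemented differently, but the observable behavior should be the same for these inputs:

```python
def simple_zip(*iterables):
    """Hypothetical pure-Python stand-in for zip, for illustration only."""
    # Turn every argument into an iterator so lists, strings, ranges all work.
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return  # no arguments: nothing to yield
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return  # stop as soon as any input is exhausted
        yield tuple(row)

pairs = list(simple_zip([1, 2, 3], "abc"))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

Note that the second argument is a string, not a list: zip accepts any iterable.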

Type of the return value

In Python version 2, zip returns a list, while in Python 3 the return value is an iterator:

Python 2

print(zip([1,2,3], ['a','b','c']))

output:

[(1, 'a'), (2, 'b'), (3, 'c')]

Python 3

print(zip([1,2,3], ['a','b','c']))

output:

<zip object at 0x7f0b1bce7500>
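
In practice this means that in Python 3 you wrap the call in list() when you want to see or reuse the pairs. A small sketch (the variable names are only for illustration) showing that a zip object can be consumed just once:

```python
numbers = [1, 2, 3]
letters = ['a', 'b', 'c']

zipped = zip(numbers, letters)
first_pass = list(zipped)   # consumes the iterator
second_pass = list(zipped)  # nothing left to yield

print(first_pass)   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(second_pass)  # []
```

If you need the pairs more than once, keep the list rather than the zip object.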

How to import zip

zip is a Python builtin, so there is nothing to import. It is defined in the builtin namespace and is available by default.
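
If you want to convince yourself, the following Python 3 snippet compares the bare name with the one in the builtins module (in Python 2 that module is called __builtin__):

```python
import builtins

# zip lives in the builtins module, which Python searches automatically.
same_object = zip is builtins.zip
print(same_object)  # True
```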

What happens if the lists are not of equal length?

If you pass lists of different lengths, zip stops at the end of the shortest list and the remaining elements of the longer lists are ignored:

print(list(zip([1,2,3], ['a','b'])))

output:

[(1, 'a'), (2, 'b')]
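
Because the extra elements are dropped silently, a length mismatch can hide a bug. Python 3.10 and later accept zip(..., strict=True), which raises a ValueError instead. On older versions you can check the lengths yourself. The helper below is a hypothetical sketch that assumes both arguments support len():

```python
def zip_same_length(first, second):
    """Hypothetical helper: zip two sequences, refusing mismatched lengths."""
    if len(first) != len(second):
        raise ValueError(
            "length mismatch: {} != {}".format(len(first), len(second))
        )
    return list(zip(first, second))

print(zip_same_length([1, 2], ['a', 'b']))  # [(1, 'a'), (2, 'b')]

try:
    zip_same_length([1, 2, 3], ['a', 'b'])
except ValueError as error:
    message = str(error)
    print(message)  # length mismatch: 3 != 2
```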

Advanced use cases - tricks using zip

zip can be a surprisingly powerful tool. Here are some nice patterns to showcase what you can do with it:

Zipping multiple lists

zip is not limited to two lists; you can pass it any number of arguments. Each resulting tuple has as many elements as the number of arguments passed to zip. This pattern is used quite often in Python:

for i in zip([1,2,3], ['a','b','c'], [0.1, 0.2, 0.3]):
    print(i)

output:

(1, 'a', 0.1)
(2, 'b', 0.2)
(3, 'c', 0.3)

Iterating over two lists in parallel

Python has a feature called tuple unpacking, which means that you can assign values to multiple variables in a single statement:

a, b, c = (1, 2, 3)
print(a)
print(b)
print(c)

output:

1
2
3

This can be used in combination with zip to iterate over two (or more) lists in parallel:

for i, j, k in zip([1,2,3], ['a','b','c'], [0.1, 0.2, 0.3]):
    print("{}, {}, {}".format(i, j, k))

output:

1, a, 0.1
2, b, 0.2
3, c, 0.3
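
A typical use is an element-wise calculation. The sketch below uses made-up price and quantity lists and compares an index-based loop with the zip version:

```python
prices = [2.5, 1.0, 4.0]
quantities = [4, 10, 1]

# Index-based version: works, but the indices are only bookkeeping.
totals_by_index = []
for index in range(len(prices)):
    totals_by_index.append(prices[index] * quantities[index])

# zip version: each iteration hands over one matching pair.
totals_by_zip = [price * quantity for price, quantity in zip(prices, quantities)]

print(totals_by_zip)  # [10.0, 10.0, 4.0]
```

Both versions give the same result; the zip version simply avoids the index arithmetic.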

Combining two lists into a dictionary (keys+values)

A dictionary can be built from a list of 2-element tuples (pairs), so you can zip a list of keys together with a list of values and pass the result to dict:

keys = ['a','b','c']
values = [1, 2, 3]
print(dict(zip(keys, values)))

output:

{'a': 1, 'b': 2, 'c': 3}
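
The same pairing also works inside a dict comprehension when the values need to be transformed on the way in, and swapping the arguments gives the reverse lookup. A short sketch using the same example data:

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]

# Same pairing as dict(zip(keys, values)), but each value is transformed first.
squares = {key: value ** 2 for key, value in zip(keys, values)}
print(squares)  # {'a': 1, 'b': 4, 'c': 9}

# Swapping the arguments builds the reverse lookup.
reverse = dict(zip(values, keys))
print(reverse)  # {1: 'a', 2: 'b', 3: 'c'}
```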

Zipping a list of lists

If you have a list of lists (in other words: a variable number of lists), you can zip them together as well. You only need to unpack them using the * operator:

l=[[1,2,3], ['a','b','c'], [0.1, 0.2, 0.3]]
for i in zip(*l):
    print(i)

output:

(1, 'a', 0.1)
(2, 'b', 0.2)
(3, 'c', 0.3)
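
To see what the * operator contributes, this sketch compares the unpacked call with passing the inner lists by hand, and with forgetting the * altogether (the name rows is only for illustration):

```python
rows = [[1, 2, 3], ['a', 'b', 'c'], [0.1, 0.2, 0.3]]

explicit = list(zip(rows[0], rows[1], rows[2]))
unpacked = list(zip(*rows))  # * spreads the inner lists into separate arguments
print(explicit == unpacked)  # True

# Without *, zip receives a single iterable and wraps each inner list in a 1-tuple.
not_unpacked = list(zip(rows))
print(not_unpacked[0])  # ([1, 2, 3],)
```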

The inverse of zip - unzipping a list of tuples into multiple lists

The above pattern can also be used to reverse a zip, splitting a list of tuples back into separate tuples, one per position:

zipped_list = [
    (1, 'a', 0.1),
    (2, 'b', 0.2),
    (3, 'c', 0.3),
]
first_elements, second_elements, third_elements = zip(*zipped_list)
print(first_elements)

output:

(1, 2, 3)
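
zip always produces tuples, so if you want real lists back you have to convert them. One way to do that, sketched here with illustrative variable names, is to map list over the result. The last lines show the edge case of an empty input, where there is nothing to unpack:

```python
zipped_list = [(1, 'a', 0.1), (2, 'b', 0.2), (3, 'c', 0.3)]

# zip(*...) yields tuples; map(list, ...) converts each one to a list.
numbers, letters, floats = map(list, zip(*zipped_list))
print(numbers)  # [1, 2, 3]
print(letters)  # ['a', 'b', 'c']

# With no tuples at all, zip yields nothing, so unpacking into names would fail.
columns = list(zip(*[]))
print(columns)  # []
```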

Transposing a matrix

Generalizing this, the pattern shown above (unpack + zip) can be used to transpose matrices:

matrix=[[1,2,3],[4,5,6],[7,8,9]]
print(list(zip(*matrix)))

output:

[(1, 4, 7), (2, 5, 8), (3, 6, 9)]

Returning a list of lists instead of a list of tuples

There is a slight issue with the example above: the list of lists turned into a list of tuples. This is easily fixed, though, by turning the tuples back into lists in a list comprehension:

matrix=[[1,2,3],[4,5,6],[7,8,9]]
print([list(row) for row in zip(*matrix)])

output:

[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
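
Wrapped in a small function (transpose is a hypothetical name, not a builtin), the pattern also works for non-square matrices, and transposing twice gives back the original. Keep in mind that rows of unequal length are cut to the shortest row:

```python
def transpose(matrix):
    """Hypothetical helper: transpose a rectangular list of lists."""
    return [list(column) for column in zip(*matrix)]

matrix = [[1, 2, 3], [4, 5, 6]]
transposed = transpose(matrix)
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
print(transpose(transposed) == matrix)  # True

# Rows of unequal length: the extra element 3 is silently dropped.
ragged = [[1, 2, 3], [4, 5]]
print(transpose(ragged))  # [[1, 4], [2, 5]]
```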

Zip longest

As mentioned above, zip stops at the shortest of its arguments. There is a very similar function in the itertools module, called zip_longest, which behaves like zip but continues until the longest of its arguments is exhausted, padding the missing values with None.

from itertools import zip_longest
print(list(zip_longest(['a', 'b', 'c'], [1])))

output:

[('a', 1), ('b', None), ('c', None)]

You can also set a different padding value instead of None using the fillvalue keyword argument:

from itertools import zip_longest
print(list(zip_longest(['a', 'b', 'c'], [1], fillvalue='X')))

output:

[('a', 1), ('b', 'X'), ('c', 'X')]
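
A fill value is handy when the padding should be neutral for a calculation. The sketch below uses made-up weekly counts and adds two lists of unequal length position by position, treating missing entries as 0. It assumes Python 3, where the function is named zip_longest (in Python 2 it is izip_longest):

```python
from itertools import zip_longest

week_one = [3, 5, 2]
week_two = [4, 1]

# Missing entries count as 0, so every position still gets a total.
totals = [a + b for a, b in zip_longest(week_one, week_two, fillvalue=0)]
print(totals)  # [7, 6, 2]
```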