As you may know, there are four main data collections in Python: a list, tuple, set, and dictionary. It may seem a little overwhelming and complex at first, but a good Python developer needs to have a good grasp of data structures because they are the program's building blocks. This topic will point out the differences and similarities between data collections and help you decide which suits you better.
Properties to look at
Different data collections have different properties. Let’s briefly recall them.
-
How elements are stored: both lists and tuples can store elements in a single row or several rows and columns, so they allow stacking. Sets, on the other hand, store elements only in a single row. Dictionaries store elements in key-value pairs.
-
Duplicates: lists and tuples can have duplicate elements. In dictionaries, keys cannot repeat themselves, while values can be the same for different keys. Finally, sets cannot have two items with the same value.
-
Mutability: except tuples, all other collections are mutable. We can change the stored elements.
-
Order: all data structures excluding sets are ordered, so the elements preserve their order.
Unfortunately, dictionaries became ordered only in Python 3.7. If you are using earlier versions of Python, there is a dictionary subclassOrderedDictthat remembers the order of added entries. You can find it in thecollectionsmodule. -
Indexing: in lists and tuples, we can access the constituent elements through indexing and retrieve indexes of the required elements. Dictionaries and sets cannot be indexed. However, you can use the
list()constructor. It takes an iterable (a set, a dictionary, or a string) as a parameter and creates a list of iterable items.
Below you can find a comparison table that can help with the decision.
|
List |
Tuple |
Set |
Dictionary |
|
|
Mutable |
+ |
- |
+ |
+ |
|
Duplicates |
+ |
+ |
- |
(-) keys (+) values |
|
Ordered |
+ |
+ |
- |
+ |
|
Indexing/Slicing |
+ |
+ |
- |
- |
|
Storage type |
row/column |
row/column |
row |
key-value |
Questions to ask yourself
There are several questions you can ask yourself that will help you make the right decision.
-
Am I going to perform membership tests? For example, you want to write a program that will consider a user input valid only if it is a part of a collection, otherwise it is invalid. In this case, use a set. In general, sets are preferred over lists when you want to do lookups because sets are implemented as hash-tables. They perform membership tests faster, especially when you have a lot of data. Here is an example of when it's a good idea to use a set:
valid_answers = {"y", "n"} answer = None while answer not in valid_answers: answer = input("Continue? [y, n] ") -
Is the order important? If so, use a list. As you know, lists can be indexed, so it is possible to find an element by its position or find the position by the name of the element:
names = ['John', 'Mary', 'Alex', 'Alice'] print(names[1]) # Mary print(names.index('Alice')) # 3 -
But what if the positions are important and the values are fixed? As you may have noticed, the
nameslist above is mutable, we can expand it or remove some elements. What if we do not want that? Then, choose a tuple over a list. Not only are tuples immutable, but they are also iterated over faster than lists with large amounts of data. For example, let’s assume we have a tuple of weekdays, and we want it to stay the same all the time:days = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday') print(days[1]) # Tuesday -
What if I want to extract a value of an element not by its index but by its key? Then, a dictionary is the best choice. Take a mapping between a person’s name and their age:
names_age = {"John": 22, "Mary": 35, "Alex": 16, "Alice": 54} print(names_age['Mary']) # 35
Conclusion
In this topic, we have discussed the various properties of four main Python collections. We also have taken a look at various questions that can help you with choosing a data structure. To sum up, when choosing the right collection, think about which elements you are going to store, how you want to access elements, and whether you want the collection to be mutable. Practice makes perfect. After a little while, you will be able to choose a collection without a doubt.