
How often have you had to write the tedious __init__, __repr__, and other standard class methods? Countless times. If you are bored with repeatedly writing the same thing, know there's a solution: a data class! Data classes were added in Python 3.7 and let you create classes with far less boilerplate. They don't add any new functionality, but they automate routine work and help you get the same result with fewer lines of code. So, let's dive in!

Creating a class

For this topic, ensure you have Python version 3.7 or higher installed. Otherwise, the code won't work.

As we've mentioned, the purpose of a data class is to create classes with less code. To see it in practice, let's first create a class via standard procedure:

class OldClass:
    def __init__(self, name, number):
        self.name = name
        self.number = number

Now, let's do the same with the data class. First, we import it from the dataclasses module, then use the @dataclass decorator and write our class. Here's the code:

from dataclasses import dataclass

@dataclass
class NewClass:
    name: str
    number: int

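We have the same attributes here, name and number, but we don't need self or even def __init__()! We do have to annotate each attribute with a type (str and int in this case): the decorator finds fields through their annotations, and a bare name with no annotation raises a NameError. Python does not check these types at runtime, though; they only serve as hints.

In these examples, OldClass and NewClass are initialized the same way. However, NewClass is much faster to write, and it also gets a few extra methods for free.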
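To see the generated __init__ at work, here is a small sketch that puts both classes side by side. The argument values are arbitrary, and the last object is only there to illustrate that annotations are not enforced at runtime.

```python
from dataclasses import dataclass


class OldClass:
    def __init__(self, name, number):
        self.name = name
        self.number = number


@dataclass
class NewClass:
    name: str
    number: int


old = OldClass("first", 1)
# The generated __init__ takes the fields in declaration order,
# positionally or by keyword.
new = NewClass("first", number=1)

print(old.name, old.number)  # first 1
print(new.name, new.number)  # first 1

# Annotations are hints only: no type check happens here.
odd = NewClass(name=123, number="ten")
print(odd)  # NewClass(name=123, number='ten')
```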

If you want to add default values, you can do the following:

from dataclasses import dataclass

@dataclass
class NewClass:
    name: str = 'Class name'
    number: int = 1

If some attributes have default values and some don't, the attributes without default values must be placed first; otherwise, Python raises a TypeError when the class is defined.
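The following sketch shows defaults in use and what happens when the ordering rule is broken. Course and Broken are hypothetical classes made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Course:  # hypothetical class
    title: str               # no default, so it goes first
    level: str = "beginner"
    hours: int = 10


print(Course("Python"))            # Course(title='Python', level='beginner', hours=10)
print(Course("Python", hours=20))  # Course(title='Python', level='beginner', hours=20)

# A field without a default after a field with a default
# fails as soon as the class is created.
broken_raised = False
try:
    @dataclass
    class Broken:  # hypothetical class with the wrong field order
        level: str = "beginner"
        title: str
except TypeError as error:
    broken_raised = True
    print("TypeError:", error)
```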

With data classes, you don't need to write the __init__ method, since it is implemented automatically. By default, three methods are created automatically in every data class:

  • __init__ to initialize the object;

  • __repr__ to create a string representation of your object (str() and print() use it too, since no separate __str__ is generated);

  • __eq__ to compare class objects to each other.

We'll learn more about them in the next sections.
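One way to check which methods the decorator actually writes into a class is to look at the class namespace, as in the sketch below. Note that __str__ does not show up there: str() and print() fall back to __repr__ through the default implementation inherited from object.

```python
from dataclasses import dataclass


@dataclass
class NewClass:
    name: str
    number: int


# vars() returns the namespace of the class itself, without inherited names.
for method in ("__init__", "__repr__", "__eq__", "__str__"):
    print(method, method in vars(NewClass))
# __init__ True
# __repr__ True
# __eq__ True
# __str__ False

new = NewClass("name", 10)
print(str(new) == repr(new))  # True
```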

String representations

After we've defined a class, create class instances in the usual way:

new = NewClass("name", 10)

As you remember, we haven't written the __repr__ method. Still, we get a neat representation of the class when we use print():

print(new)  # NewClass(name='name', number=10)

You can always redefine this method if you want to print something else. Just do it the usual way. Note that when you write any methods inside a data class, you need to use self like in any other class.

from dataclasses import dataclass

@dataclass
class NewClass:
    name: str
    number: int
        
    def __repr__(self):
        return f"The name is {self.name} and the number is {self.number}."

If we create the object again and print it, here's what we get:

print(new)  # The name is name and the number is 10.

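So, if you define __repr__ yourself, the decorator leaves it alone, and your implementation is used. The same goes for the __eq__ method. A data class has no generated __str__, so defining one simply changes what print() and str() show, while repr() keeps the generated output.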
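As a minimal sketch of that difference, the class below defines only __str__ and keeps the generated __repr__. The format of the string is an arbitrary choice for this example.

```python
from dataclasses import dataclass


@dataclass
class NewClass:
    name: str
    number: int

    def __str__(self):
        return f"{self.name} #{self.number}"


new = NewClass("name", 10)
print(new)        # name #10
print(repr(new))  # NewClass(name='name', number=10)
print([new])      # [NewClass(name='name', number=10)]
```

Containers such as lists display their items with repr(), which is why the last line still shows the generated representation.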

Equality check

The __eq__ method compares class objects to each other. In a standard class definition, you need to write it yourself; otherwise, == only checks whether both sides are the very same object, and you'll end up with something like this:

class OldClass:
    def __init__(self, name, number):
        self.name = name
        self.number = number

old_1 = OldClass('name', 10)
old_2 = OldClass('name', 10)
old_1 == old_2  # False

Now, let's try it with dataclass:

from dataclasses import dataclass

@dataclass
class NewClass:
    name: str
    number: int

new_1 = NewClass('name', 10)
new_2 = NewClass('name', 10)
new_1 == new_2  # True

We haven't written the __eq__ method in either of the examples, but with dataclass, it's implemented automatically: two instances of the same class are equal if all their attributes are equal. If you want a different comparison logic, write your own __eq__ method, which will be used instead of the default one.
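Here is a sketch of two ways to change the comparison logic so that only number matters. ByNumber and ByNumberField are hypothetical names; the second class uses the compare=False option of field() instead of a handwritten method.

```python
from dataclasses import dataclass, field


@dataclass
class ByNumber:  # hypothetical class with a handwritten __eq__
    name: str
    number: int

    def __eq__(self, other):
        if not isinstance(other, ByNumber):
            return NotImplemented
        return self.number == other.number


@dataclass
class ByNumberField:  # hypothetical class that excludes name from comparison
    name: str = field(compare=False)
    number: int = 0


print(ByNumber("a", 10) == ByNumber("b", 10))            # True
print(ByNumberField("a", 10) == ByNumberField("b", 10))  # True
print(ByNumberField("a", 10) == ByNumberField("a", 11))  # False
```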

Comparison and sorting

Now, we know how to determine if two class instances are equal. But what if you wish to compare them using a > operator? For now, it would result in an error:

# we're using the NewClass defined in the previous code snippet

new_1 > new_2  # TypeError: '>' not supported between instances of 'NewClass' and 'NewClass'

However, if you want to implement a comparison by an attribute, it won't take many lines. Do three simple things: specify order=True when using the decorator, add a sort_index attribute, and write a __post_init__ method. Let's compare our class objects by the number attribute. Here's what the code looks like:

from dataclasses import dataclass, field

@dataclass(order=True)
class NewClass:
    sort_index: int = field(init=False)
    name: str
    number: int
    
    def __post_init__(self):
        self.sort_index = self.number

Let's go through this code in more detail. First, we import not just dataclass but field too; it's a function we'll need in the next lines. Then, we specify order=True in the decorator, which generates the comparison methods __lt__, __le__, __gt__, and __ge__. They compare objects field by field in the order the fields are declared, which is why sort_index goes first. We annotate sort_index as int (since the number attribute should be an int) and make it a field. In a data class, field() is just another way to define a class attribute while specifying more information. In this case, we need field() to specify init=False. This means that sort_index is not a parameter of the generated __init__, so we don't pass it when creating an instance. Instead, it is set in the __post_init__ method, which is called at the end of __init__. Inside __post_init__, we set self.sort_index to self.number, specifying that we want to compare our class instances by the number attribute.
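To illustrate why the position of sort_index matters, here is a sketch of what order=True does without it. Plain is a hypothetical class; its instances are compared as if they were the tuples (name, number).

```python
from dataclasses import dataclass


@dataclass(order=True)
class Plain:  # hypothetical class without sort_index
    name: str
    number: int


# The first field decides; later fields only break ties.
print(Plain("a", 10) < Plain("b", 0))  # True, because "a" < "b"
print(Plain("a", 10) < Plain("a", 0))  # False, names tie and 10 > 0
```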

Now, let's create some class objects and try to compare them:

new_1 = NewClass('name', 10)
new_2 = NewClass('name', 0)
new_3 = NewClass('name', 5)

new_1 > new_2  # True
new_2 > new_3  # False

As you can see, it works perfectly! What's more, we can also sort our objects now.

objects_list = [new_1, new_2, new_3]
sorted(objects_list)

Here's the resulting list:

[NewClass(sort_index=0, name='name', number=0),
 NewClass(sort_index=5, name='name', number=5),
 NewClass(sort_index=10, name='name', number=10)]

Note that what we see here are the default string representations. If we want to leave sort_index out of the representation, where it is redundant because it has the same value as number, we can do it the following way:

sort_index: int = field(init=False, repr=False)

When you add repr=False to a field definition, that attribute is not included in the string representation.
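Putting the pieces of this section together, a complete version might look like the sketch below. The names "a", "b", and "c" are arbitrary and only make the objects easier to tell apart in the output.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class NewClass:
    sort_index: int = field(init=False, repr=False)
    name: str
    number: int

    def __post_init__(self):
        # Runs right after the generated __init__ has set name and number.
        self.sort_index = self.number


objects_list = [NewClass("a", 10), NewClass("b", 0), NewClass("c", 5)]

for item in sorted(objects_list):
    print(item)
# NewClass(name='b', number=0)
# NewClass(name='c', number=5)
# NewClass(name='a', number=10)

print(max(objects_list))  # NewClass(name='a', number=10)
```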

Frozen objects

The last feature of the data class we'll look at is creating frozen objects. Let's say you want to prohibit changing the attributes of an object after its creation. Easy! Just add frozen=True to the decorator like this:

@dataclass(frozen=True)
class NewClass:
    name: str
    number: int

Now, let's try to create a class object and then reassign the name attribute.

new = NewClass('name', 10)
new.name = 'another_name'  # FrozenInstanceError: cannot assign to field 'name'

You get an error. Why would you want your objects to be immutable? There can be many reasons, but the main ones are having fewer side effects in your code and not having to think about what happens if this or that attribute gets changed later. So, if you need some read-only class objects, go for frozen data classes!
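Below is a sketch of how frozen objects can be used in practice. It catches the error, builds a changed copy with replace() from the dataclasses module, and uses an object as a dictionary key, which works because frozen data classes with the default equality settings are hashable. The dictionary and its value are made up for the example.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class NewClass:
    name: str
    number: int


new = NewClass("name", 10)

try:
    new.name = "another_name"
except FrozenInstanceError as error:
    print("Cannot modify:", error)

# replace() creates a new object and leaves the original untouched.
renamed = replace(new, name="another_name")
print(new)      # NewClass(name='name', number=10)
print(renamed)  # NewClass(name='another_name', number=10)

# Equal frozen objects have equal hashes, so they find the same key.
lookup = {new: "original"}
print(lookup[NewClass("name", 10)])  # original
```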

Conclusion

In this topic, we've covered the main features of dataclass, a way of writing classes available in Python 3.7+. Let's make a quick recap of the main points:

  • Data classes offer an easier way of creating classes;

  • The functionality that a data class provides is not unique – the same things can be achieved with the regular classes, but they might require more code;

  • String representation is automatically generated in a data class;

  • Equality check is implemented automatically, too; two instances of a data class are considered equal if all their attributes are the same;

  • You can freeze your objects to prohibit assigning new values to the attributes.

Of course, data classes provide many more possibilities; use the official documentation to learn more about that. Now, let's turn to practice!
