You are already familiar with the hashCode() and equals() methods and know why you need to override them in custom classes. Now it's time to learn about their role in classes based on hash tables. We will explore it with the example of HashMap and HashSet. Seeing how these classes behave if you don't override hashCode() and equals() will help you understand why it is so important to do that.
The role of hashCode()
Before we get down to business, let's create a class that we will use for further examples.
class Person {
    private String name;

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "Person{" +
                "name='" + name + '\'' +
                '}';
    }
}
It's pretty simple, with just one field, but that's enough for us. Now let's create a map and add an entry to it.
Map<Person, Integer> map = new HashMap<>();
map.put(new Person("James Gosling"), 1995);
Do you remember how entries are recorded inside the map? As a reminder, each entry is placed into a bucket that is chosen based on the hash code of its key.
If we suppose that the hash code of new Person("James Gosling") is 1010101010, the entry will be located in bucket number 2. Now let's try to access the entry:
Map<Person, Integer> map = new HashMap<>();
map.put(new Person("James Gosling"), 1995);
System.out.println(map.get(new Person("James Gosling"))); // null
If you don't override hashCode(), the default implementation inherited from Object is used. It returns the same value as the System.identityHashCode(Object x) method, which is derived from the object identity (typically described as its memory address) rather than from its field values. In this case, two distinct objects will generally have different hash codes, even if they are logically equal. Since we searched for the entry using a different object, the result is null.
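The behavior described above can be checked with a short sketch. The class name DefaultHashCodeDemo is made up for illustration, and the nested Person is a trimmed copy of the class from this topic with no hashCode() or equals() override.

```java
import java.util.HashMap;
import java.util.Map;

class DefaultHashCodeDemo {
    // Trimmed copy of Person: neither hashCode() nor equals() is overridden
    static class Person {
        private final String name;

        Person(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Person first = new Person("James Gosling");
        Person second = new Person("James Gosling");

        // Without an override, hashCode() matches the identity hash code
        System.out.println(first.hashCode() == System.identityHashCode(first)); // true

        Map<Person, Integer> map = new HashMap<>();
        map.put(first, 1995);

        System.out.println(map.get(first));  // 1995: the very same object
        System.out.println(map.get(second)); // null: a different object
    }
}
```

Looking up the entry with the original reference still works. Only a separately created, logically equal object fails to find it.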
To base the hash code on field values instead, you can use the Objects.hash(Object... values) static method, which calculates the hash code from the values passed to it. IntelliJ IDEA uses exactly this method in the autogenerated hashCode() method. A well-implemented algorithm reduces the number of collisions and a poorly implemented one increases it. For example, you can override hashCode() so that it returns a constant value. In this case, all the elements will be stored in one bucket, depriving us of the most important property thanks to which HashMap is so efficient.
Now, let's see how HashSet will behave if we don't override the hashCode() method. As you know, HashSet is a collection of unique elements, but in this case it won't behave as expected:
Set<Person> set = new HashSet<>();
set.add(new Person("James Gosling"));
set.add(new Person("James Gosling"));
System.out.println(set); // [Person{name='James Gosling'}, Person{name='James Gosling'}]
This Set will store both elements since they have different hash codes and are considered different elements. However, overriding hashCode() alone won't solve the issue: the equals() method is just as important. You'll understand why in the next section.
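As a quick preview, here is a minimal sketch of what happens when only hashCode() is overridden. The class name HashCodeOnlyDemo is hypothetical, and the hashCode() body is an assumed implementation based on Objects.hash.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashCodeOnlyDemo {
    static class Person {
        private final String name;

        Person(String name) {
            this.name = name;
        }

        // Field-based hash code; equals() is deliberately NOT overridden
        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    public static void main(String[] args) {
        Person first = new Person("James Gosling");
        Person second = new Person("James Gosling");

        System.out.println(first.hashCode() == second.hashCode()); // true
        System.out.println(first.equals(second));                  // false

        Set<Person> set = new HashSet<>();
        set.add(first);
        set.add(second);
        System.out.println(set.size()); // 2
    }
}
```

The two objects now share a hash code, yet the set still keeps both of them because the inherited equals() compares references.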
The role of equals()
Imagine we have the same Person class with the overridden hashCode() method, but without an overridden equals(), and the following code:
Map<Person, Integer> map = new HashMap<>();
map.put(new Person("James Gosling"), 1995);
System.out.println(map.get(new Person("James Gosling"))); // null
To understand what result you will get, it is important to know what happens when you search for an entry. First, the hash code of the key is calculated to find out in which bucket the entry is located. Then the corresponding bucket is searched for a key with that hash code and, once one is found, it is checked for equality with the equals() method.
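The lookup steps just described can be sketched with a toy hash table. This is a simplified illustration and not how java.util.HashMap is actually implemented: the ToyHashTable class, its fixed count of 16 buckets, and the Entry helper are all hypothetical, and the real HashMap additionally mixes the hash bits and resizes its table.

```java
import java.util.ArrayList;
import java.util.List;

class ToyHashTable {
    static final int BUCKET_COUNT = 16; // assumed fixed size for the sketch

    // Hypothetical entry type holding one key-value pair
    static final class Entry {
        final Object key;
        final Object value;

        Entry(Object key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry>> buckets = new ArrayList<>();

    ToyHashTable() {
        for (int i = 0; i < BUCKET_COUNT; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: the hash code of the key selects the bucket
    static int bucketIndex(Object key) {
        return Math.floorMod(key.hashCode(), BUCKET_COUNT);
    }

    void put(Object key, Object value) {
        List<Entry> bucket = buckets.get(bucketIndex(key));
        bucket.removeIf(e -> e.key.equals(key)); // replace an existing equal key
        bucket.add(new Entry(key, value));
    }

    // Step 2: equals() picks the matching key inside that bucket
    Object get(Object key) {
        for (Entry e : buckets.get(bucketIndex(key))) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyHashTable table = new ToyHashTable();
        table.put("James Gosling", 1995);

        System.out.println(table.get("James Gosling")); // 1995
        System.out.println(table.get("Ada Lovelace"));  // null
        System.out.println(bucketIndex(1010101010));    // 2
    }
}
```

Both steps are needed for a hit: a wrong hash code sends the search to the wrong bucket, and a failing equals() check rejects the key even in the right bucket.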
In our case, when performing the get operation we will find the correct entry location but the result will be null since equals() checks objects by identity using the == operator by default. In the case above, with both methods overridden, the result would be 1995. That's why it is so important to override this method as well.
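For reference, here is a minimal sketch of Person with both methods overridden. The equals() and hashCode() bodies follow the common IDE-generated pattern and are an assumption, since the topic does not show them; the class name EqualsAndHashCodeDemo is made up as well.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class EqualsAndHashCodeDemo {
    static class Person {
        private final String name;

        Person(String name) {
            this.name = name;
        }

        // Two Person objects are equal when their names are equal
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Person other = (Person) o;
            return Objects.equals(name, other.name);
        }

        // Uses the same field as equals(), so equal objects share a hash code
        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    public static void main(String[] args) {
        Map<Person, Integer> map = new HashMap<>();
        map.put(new Person("James Gosling"), 1995);

        System.out.println(map.get(new Person("James Gosling"))); // 1995
    }
}
```

Note that both methods rely on the same field, so objects that are equal always land in the same bucket.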
Now, let's test the code from the previous section demonstrating the HashSet behavior with overridden hashCode() and equals() methods:
Set<Person> set = new HashSet<>();
set.add(new Person("James Gosling"));
set.add(new Person("James Gosling"));
System.out.println(set); // [Person{name='James Gosling'}]
Everything works fine and as expected. The application will print only one element since HashSet recognized the duplicate element.
Key mutations
Let's explore the issue we will face if the state of the object changes during the execution of the program. Consider this situation where hashCode() and equals() are overridden:
Person james = new Person("James Gosling");
Map<Person, Integer> map = new HashMap<>();
map.put(james, 1995);
james.setName("J. Gosling");
System.out.println(map.get(new Person("James Gosling"))); // null
System.out.println(map.get(new Person("J. Gosling"))); // null
As you know, by default the hash code is calculated based on the object identity, but the overridden hashCode() based on Objects.hash(Object... values) calculates it from the object's field values. So, on the third line, we add an entry that we assume is stored in bucket 2. In the next step, we change the value of the name field, so the key now produces a different hash code that may correspond to another bucket, but the entry remains in its old bucket.
On the last two lines, we try to access the entry, but in both cases the result is null, and here is why. First, searching by new Person("James Gosling"), we find the correct bucket, but the check new Person("James Gosling").equals(new Person("J. Gosling")) returns false. On the next line, the hash code calculation leads to a bucket where we don't have anything stored.
This is why you should avoid changing the state of an object after you have used it as a key. Otherwise, you will lose access to the entry. Since the map still holds an entry that can no longer be looked up, this kind of situation is commonly referred to as a memory leak.
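If a key really has to change, one possible workaround is to remove the entry first, modify the key, and then put the entry back. The sketch below illustrates that idea under assumptions: KeyMutationDemo is a made-up name, and the equals() and hashCode() bodies are assumed IDE-style implementations.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyMutationDemo {
    static class Person {
        private String name;

        Person(String name) {
            this.name = name;
        }

        void setName(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(name, ((Person) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    public static void main(String[] args) {
        Person james = new Person("James Gosling");
        Map<Person, Integer> map = new HashMap<>();
        map.put(james, 1995);

        // Remove the entry while the key still matches its stored hash code
        Integer year = map.remove(james);
        james.setName("J. Gosling");
        // Re-insert so the entry is filed under the new hash code
        map.put(james, year);

        System.out.println(map.get(new Person("J. Gosling")));    // 1995
        System.out.println(map.get(new Person("James Gosling"))); // null
        System.out.println(map.size());                           // 1
    }
}
```

A simpler alternative is to make key classes immutable (final fields, no setters), so that the hash code of a key can never change while it is in the map.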
We'd face a similar problem with HashSet:
Person james = new Person("James Gosling");
Set<Person> set = new HashSet<>();
set.add(james);
james.setName("J. Gosling");
set.add(new Person("J. Gosling"));
System.out.println(set); // [Person{name='J. Gosling'}, Person{name='J. Gosling'}]
This code suffers from the same issue as the HashMap example earlier in this section. We will give you a chance to share your explanation in the comments. If you feel like you know the correct answer and want to share it with other learners, you are welcome to do so.
Conclusion
In this topic, we have explored the behavior of two hash table-based classes, HashMap and HashSet, and the importance of overriding the hashCode() and equals() methods when working with them. In your practice, you will encounter such cases a lot. Make sure you have mastered the topic completely!