In some situations, you need to store pairs of associated objects. For example, when counting the number of words in a text, the first one is a word, and the second one is the number of its occurrences in the text. There is a special type of collection called map to effectively store such pairs of objects. A map is a collection of key-value pairs. Keys are always unique, while values can repeat.

A good example of a map from the real world is an employee record book where employee IDs are keys and employee names are values.

Keys  : Values
-----------------------
Bob   : +1-202-555-0118
James : +1-202-555-0220
Katy  : +1-202-555-0175

Maps have some similarities with sets and arrays;

keys of a map form a set, but each key has an associated value;
keys of a map are similar to indexes of an array, but the keys can have any type including integer numbers, strings and so on.

Due to these reasons, you can encounter some kind of deja vu effect when learning maps.

Next, all our examples will use string and numbers as keys since using custom classes as types of keys have some significant complexity that we need to consider.

The Map interface

The Collections Framework provides the Map<K,V> interface to represent a map as an abstract data type. Here, K is a type of keys, and V is a type of associated values. The Map interface is not a subtype of the Collection interface, but maps are often considered as collections since they are part of the framework.

The interface declares a lot of methods to work with maps. Some of the methods are similar to methods of Collection, while others are unique to maps.

1) Collection-like methods:

int size() returns the number of elements in the map;
boolean isEmpty() returns true if the map does not contain elements and false otherwise;
void clear() removes all elements from the map.

We hope these methods do not need any comments.

2) Keys and values processing:

V put(K key, V value) associates the specified value with the specified key and returns the previously associated value with this key or null;
V get(Object key) returns the value associated with the key, or null otherwise;
V remove(Object key) removes the mapping for a key from the map;
boolean containsKey(Object key) returns true if the map contains the specified key;
boolean containsValue(Object value) returns true if the map contains the specified value.

These methods are similar to the methods of collections, except they process key-value pairs.

3) Advanced methods:

V putIfAbsent(K key, V value) puts the value if the specified key is not already associated with a value (or is mapped to null) and returns null, otherwise, returns the current value;
V getOrDefault(Object key, V defaultValue) returns the value to which the specified key is mapped, or defaultValue if this map contains no mapping for the key.

These methods, together with some others, are often used in real projects.

4) Methods which return other collections:

Set<K> keySet() returns a Set view of the keys contained in this map;
Collection<V> values() returns a Collection view of the values contained in this map;
Set<Map.Entry<K, V>> entrySet() returns a Set view of the entries (associations) contained in this map.

This is not even a complete list of methods since Map is a really huge interface. So reading the documentation can be really helpful when using maps.

To start using a map, you need to instantiate one of its implementations: HashMap, TreeMap, or LinkedHashMap. They use different rules for ordering elements and have some additional methods. There are also immutable maps.

Immutable maps

The simplest way to create a map is to invoke the of method of the Map interface. The method takes zero or any even number of arguments in the format key1, value1, key2, value2, ... and returns an immutable map.

Map<String, String> emptyMap = Map.of();

Map<String, String> friendPhones = Map.of(
        "Bob", "+1-202-555-0118",
        "James", "+1-202-555-0220",
        "Katy", "+1-202-555-0175"
);

Now let's consider some operations that can be applied to immutable maps using our example with employeeBook.

The size of a map equals the number of pairs contained in it.

System.out.println(emptyMap.size());     // 0
System.out.println(friendPhones.size()); // 3

It is possible to get a value from a map by its key:

String bobPhone = friendPhones.get("Bob"); // +1-202-555-0118
String alicePhone = friendPhones.get("Alice"); // null
String phone = friendPhones.getOrDefault("Alex", "Unknown phone"); // Unknown phone

Note that the getOrDefault method provides a simple way to prevent NullPointerException since it avoids null.

It is also possible to check whether a map contains a particular key or value by using containsKey and containsValue methods.

We can directly access the set of keys and collection of values from a map:

System.out.println(employeeBook.keySet()); // [07, 43, 98]
System.out.println(employeeBook.values()); // [Bob, James, Katy]

Since it is immutable, only methods that do not change the elements of this map will work. Others will throw the UnsupportedOperationException exception. If you'd like to put or remove elements, we suggest you use one of HashMap, TreeMap or LinkedHashMap.

HashMap

The HashMap class represents a map backed by a hash table. This implementation provides constant-time performance for get and put methods assuming the hash function distributes the elements uniformly among the buckets.

The following example demonstrates a map of products where the key is the product code, and the value is the name.

Map<Integer, String> products = new HashMap<>();

products.put(1000, "Notebook");
products.put(2000, "Phone");
products.put(3000, "Keyboard");

System.out.println(products); // {2000=Phone, 1000=Notebook, 3000=Keyboard}
System.out.println(products.get(1000)); // Notebook

products.remove(1000);
System.out.println(products.get(1000)); // null

products.putIfAbsent(3000, "Mouse"); // it does not change the current element
System.out.println(products.get(3000)); // Keyboard

This implementation is often used in practice since it is highly-optimized for putting and getting data pairs.

There are different ways to create a map filled with key-value pairs. It's time to take a good look at the hash map: an effective instrument that is probably the most popular implementation of the Map interface. It provides very fast access to its elements and falls within the top ten questions asked at Java job interviews.

How does it work?

The HashMap class represents a map backed by a hash table. To understand what it is, let's start with the internal organization of HashMap.

If you look inside HashMap you will see that it's an array with buckets as its elements. The default size of such an array is 16.

What could be a "bucket" in a programming language? Actually, each bucket is a linked list, meaning there are 16 linked lists for storing key-value pairs. And each key-value pair is an element of a linked list.

Now we need a special structure to store the key-value pairs as well. This structure is a class called Node. It has four fields: the first one is hash, then there's our key and value, and the last one contains a link to the next pair.

Now you know how HashMap looks inside and are ready to learn what is happening when you put an element in it. Any ideas on how a HashMap decides where to put your key-value pair?

Here's what happens: first, it generates the hash code of your key. Next, it generates the index of the bucket, which depends on the calculated hash.
In the formula below n is the number of buckets:

index = hash & (n - 1)

Alternatively, you can calculate the index as follows:

index = hash % n

It will give the same result, but works slower than a bitwise operation. Also, you should know that n is always a power of 2. This is guaranteed by the tableSizeFor method inside the HashMap class.

First, let's think how the % operation works when n is a power of 2 and hash is a positive whole number. For clarity, we'll set n as 4:

As you can see, this mod operation removes all the bits that are more significant or equal than n (here n = 00000100).

On the other hand, what happens in the bit world when you do hash & (n - 1)? Again, you remove more significant bits, because
n - 1 always looks like 00000011 (zeros followed by ones).

Now that you've mastered the theory of the HashMap organization, it's time to consider some examples.

Thanks to the hash function, this implementation provides constant-time performance for get and put methods! Constant-time means O(1) complexity.

Now that you've mastered the theory of the HashMap organization, it's time to consider some examples.

Time to play!

Visual depiction of the characters hashmap

To get a similar debugging view: right click on the map and choose "View as" —> "Object". Then right click on the element and choose "View as" —> "Object". Clicks don't depend on each other.

Map<Integer, String> characters = new HashMap<>();

characters.put(1000, "Cinderella");

Here we've created a HashMap called characters and added Cinderella in it. There are exactly 16 buckets. The class has calculated that the index of the new bucket would be 8.

In the picture, you can see that Cinderella is the first Node of the 8th linked list. As we've said earlier, Node has four fields. In our example, the hash code and the key are the same. For Integer class hash code is the same as its value. Note that every class has different logic for calculating hash code. You can override the method hashCode as well. Also, it's good to know that it is possible to create a pair in which the key would be null. The hash code of the null-key is always zero. Now let's add more characters and see what will happen:

characters.put(2000, "Prince");
characters.put(3000, "Evil stepmother");

// {2000=Prince, 1000=Cinderella, 3000=Evil stepmother}
System.out.println(characters);

Character Prince decided to be in the bucket with index 0. He is also the first Node in his linked list. But a stepmother can't be far away from poor Cinderella! Evil stepmother is also staying in the bucket with the number 8. As you can see, the field next in Cinderella's Node has changed. Now the field has a link to the next character, Evil stepmother. Because the Evil stepmother became the last one in this linked list, her field next has value null.

Let's try to remove an element from the characters HashMap:

characters.remove(3000); // get rid of Evil stepmother

System.out.println(characters.get(3000)); // null

The element Evil stepmother was deleted. Cinderella became the first and the last element of the eighth linked list again. To see the structure of your HashMap in the same way as in the pictures, you can use Debug mode in IntelliJ IDEA.

Since HashMap implements the Map interface, the methods we had discussed in the previous section can also be used. We will only look at some of them.

Getting an element by a key:

System.out.println(characters.get(1000)); // Cinderella

Method putIfAbsent:

characters.putIfAbsent(2000,"Another Prince"); // nothing happens, because there is already a Prince

System.out.println(characters.get(2000)); // Prince

Using containsKey and containsValue to check if there is a concrete key or a value inside of HashMap:

System.out.println(characters.containsKey(1000)); // true
System.out.println(characters.containsValue("Fairy Godmother")); // false

For the interview

Let's consider some interesting moments. Firstly, you, as a software engineer, can choose what number of buckets will be in HashMap. Remember that the initial capacity is always a power of two, even if you input something like "5".

Map<String, String> map = new HashMap<>(32);

Another important point is collisions. There is always a chance of a situation when two distinct keys will generate the same hash code. This situation is called a collision.

map.put("AaAaAa","First");
map.put("BBBBBB","Second");

If the two keys have the same hash code but the keys are different, the second element will be put right after the first one. In this example, keys AaAaAa and BBBBBB have the same hash code, and Node with the value First will have a link to the Node with the value Second.

Method equals is used for checking if the two keys are different. If they are the same, the second pair will replace the first one. If there are a lot of collisions in a HashMap the complexity for get and put methods becomes O(n). Because of that, class HashMap was upgraded. Since Java 8, it provides the collision resolution mechanism. If linked lists are too long then HashMap changes their structure: all linked lists become balanced trees. The new complexity is O(log(n)).

Finally, it's important to note that HashMap can change the number of buckets. If there are too many pairs in your buckets HashMap will change its size to a bigger one. It will transfer all your previous elements to the new version of itself. That, unfortunately, will take some time. And the more elements there are, the longer transferring will take. The moment when HashMap resizes depends on its load factor. The default load factor is 0.75 and you can define a different one in the constructor: HashMap(int initialCapacity, float loadFactor). When the number of entries in HashMap exceeds a certain threshold calculated as current capacity * load factor, HashMap increases its capacity.

More about comparisons

When you add an object to HashMap, the hash code in the Node structure is calculated only once and can't be changed afterwards. Let's consider a simple example: you want to find an object by a key (in other words, perform a get operation):

First, your own hash code of the key will be compared with the hash code in the Node structure.
Only if the hash codes match, the equals method will be executed for the keys for the final decision.

Here comes a tricky moment. You've created a HashMap and filled it:

Map<BigComplexClass, Integer> map = new HashMap<>();
map.put(bigKey, 123); // bigKey is an instance of BigComplexClass

Then, you update the state of bigKey (for example, a field in it). What will happen if you try map.get(bigKey)? You won't find it in a map. And these are the two reasons why: you either miss it because the updated hash code of bigKey doesn't match the hash code in Node, or, if the hash code hasn't changed, the second step with the equals method will give "false".

When to use it?

The short answer to the question in the title is — if you need to store key-value pairs and want the mutable implementation of the Map interface. HashMap class is often used in practice since it is highly optimized for accessing elements. Also, iteration over HashMap is highly efficient.

And what about the disadvantages? Well, nobody's perfect. There is no way to avoid collisions. The size of HashMap may increase so that it will take some time to transfer all the elements.

At last, remember, that HashMap class makes no guarantees as to the order of the map. It also does not guarantee that the order will remain constant over time.