
In the previous topic, we mentioned that garbage collection can be performed at fixed time intervals or when there is no heap memory left. The garbage collector is a rather complex mechanism that consumes a lot of resources. If garbage collection started as soon as any unused object appeared, it would consume more resources than simply keeping the object in memory. Therefore, the JVM runs it only when garbage collection is considered necessary.
We will start by discussing a few ways of finding unnecessary objects, then learn what object generations are and how this mechanism classifies objects. We will also cover the pitfalls of automatic garbage collection and the tools developers can use to monitor the execution of a program.

In this topic, by the JVM we mean the HotSpot JVM, which is used by OpenJDK, the open-source implementation of Java SE.

How to find unused objects

To remove garbage from memory, the GC needs to know which objects are dead. The idea is quite simple and clear, but in fact, it is a rather complex process based on various algorithms. In this section, we will tell you about two approaches to locating unused objects:

  • Reference counting. The idea of this approach is that each object is assigned a field showing the number of references pointing to it. Adding a reference increases its value by 1, and removing a reference decreases its value by 1.

    JVM garbage collector reference counting


    Reference counting lets the collector delete an object as soon as its counter reaches zero, which is one of the advantages of this approach. It also has disadvantages. The application spends memory on the counter field, and its performance drops because every added or removed reference has to update a counter. Another important disadvantage is the difficulty of handling circular references: when two objects refer only to each other, their counters never reach 0, so the collector never reclaims them even though nothing else can reach them.

  • Tracing. This approach is more common than reference counting; in fact, the HotSpot JVM uses only tracing algorithms in its garbage collectors. A tracing algorithm finds referenced objects, marks them, and writes off the rest as garbage. The search starts with objects known as GC roots and follows references from them to build chains of reachable objects. Typical GC roots are local variables and method parameters, active threads, and static variables. In the image below, the blue circles represent chains of live objects, and the white ones represent dead objects, where the connection between a and b is an example of a circular reference.

    JVM dead object garbage collection


    An important advantage over the previous method is the ability to detect circular references. However, the garbage collector has to wait until the algorithm has found all live objects before it can start removing dead ones.
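The difference between the two approaches can be illustrated with a toy object graph. This is only a sketch: `Node`, `refCount`, `pointTo`, `dropRefTo`, and `mark` are hypothetical names invented for this example and have nothing to do with how HotSpot is actually implemented. It assumes Java 9 or later for `List.of`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names exist in the JVM or in any library.
final class TracingDemo {

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        int refCount;                              // number of incoming references

        Node(String name) {
            this.name = name;
        }

        void pointTo(Node target) {
            refs.add(target);
            target.refCount++;
        }

        void dropRefTo(Node target) {
            if (refs.remove(target)) {
                target.refCount--;
            }
        }
    }

    // Mark phase of a tracing collector: gather every node reachable from the roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (live.add(current)) {          // first visit: mark it and follow its references
                pending.addAll(current.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        root.pointTo(a);
        a.pointTo(b);
        b.pointTo(a);       // a and b now refer to each other

        root.dropRefTo(a);  // the cycle is no longer reachable from the root

        Set<Node> live = mark(List.of(root));
        // Both counters are still 1, so reference counting would keep a and b.
        System.out.println("a.refCount = " + a.refCount + ", b.refCount = " + b.refCount);
        // Tracing never reaches a or b, so it treats them as garbage.
        System.out.println("a is live: " + live.contains(a) + ", b is live: " + live.contains(b));
    }
}
```

After the root drops its reference, the counters of `a` and `b` stay at 1 because of the cycle, while the mark phase simply never visits them.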

Simply put, reference counting and tracing are two opposites: the first tracks dead objects and the second tracks live ones. Garbage collection algorithms are built on these two approaches: some rely on one of them, and others combine both.

Memory cleanup: generational hypothesis

The generational mechanism is a common strategy for garbage collectors. It works by dividing objects into groups (generations), and if cleaning one group frees up enough memory, the collector does not clean the others, saving time and resources. As for the JVM, it has both generational and non-generational garbage collectors. For instance, G1 (Garbage First) uses a generational approach, while ZGC was originally non-generational and gained a generational mode only in JDK 21.

This approach divides objects into generations depending on how many garbage collections they have survived. Memory cleanup starts with the youngest generation, where all new objects are allocated. The reason is the generational hypothesis, formed from years of experience and statistics, which states that most objects die young. Therefore, the GC starts collecting where the youngest objects are stored, and objects that survive a collection move to the next generation. Each older generation is cleaned only when cleaning all the younger ones does not free up enough memory. Passing through the generations this way, objects can reach the last one and remain there for as long as they stay alive.
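The promotion logic described above can be sketched as a small simulation. Everything here is an assumption made for illustration: the class `ToyGenerationalHeap`, the `PROMOTION_AGE` threshold of 2, and the idea of passing in the set of live object ids are not taken from any real collector.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy simulation: class, method names, and the threshold are hypothetical.
final class ToyGenerationalHeap {
    // An object is promoted after surviving this many young collections.
    static final int PROMOTION_AGE = 2;

    final Map<String, Integer> young = new HashMap<>(); // object id -> collections survived
    final List<String> old = new ArrayList<>();         // ids of promoted objects

    void allocate(String id) {
        young.put(id, 0); // new objects always start in the young generation
    }

    // Collects only the young generation; `live` holds the ids that are still reachable.
    // Returns the number of objects that were freed.
    int minorCollect(Set<String> live) {
        int freed = 0;
        Iterator<Map.Entry<String, Integer>> it = young.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (!live.contains(entry.getKey())) {
                it.remove();                          // dead object: reclaim it
                freed++;
            } else if (entry.getValue() + 1 >= PROMOTION_AGE) {
                old.add(entry.getKey());              // survived long enough: promote
                it.remove();
            } else {
                entry.setValue(entry.getValue() + 1); // survived: grows one collection older
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap();
        heap.allocate("cache");
        heap.allocate("temp1");
        heap.allocate("temp2");
        int freedFirst = heap.minorCollect(Set.of("cache"));  // temp1 and temp2 die young
        heap.allocate("temp3");
        int freedSecond = heap.minorCollect(Set.of("cache")); // cache survives a second time
        System.out.println(freedFirst + " " + freedSecond + " old=" + heap.old);
        // prints: 2 1 old=[cache]
    }
}
```

Only the long-lived `cache` object reaches the old generation; the short-lived objects are reclaimed without the old generation ever being scanned, which is the saving the generational hypothesis promises.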

JVM object lifetime distribution

In addition to dividing objects into generations, there is another approach to organizing the memory management process – the division of memory into regions, where old and new objects can be stored in the same region. Java has GCs that use only this approach. It is also possible to use both of them in the same collector.

Automated garbage collection pitfalls

While automatic garbage collection makes programming easier and speeds up application development, it also has its downsides. Among them are:

  • Resource consumption (memory, CPU). Automatic garbage collection consumes CPU time and memory. This is one of the reasons why programs in languages with manual memory management, such as C++, can run faster: developers control memory themselves, and the application does not spend resources on a collector.

  • Latency. All implementations of JVM garbage collectors have the so-called stop-the-world mechanism, which requires the application to pause. During such a pause, application threads are suspended, so no new objects are created while garbage collection runs. The longer and more frequent these pauses are, the slower and less responsive the application will be.

  • Memory fragmentation. After dead objects are removed, the memory areas where they were located remain as unused gaps between live objects. Therefore, after garbage collection, surviving objects are moved next to each other to eliminate these gaps (this is called compaction). Of course, this process also affects the performance of the application.
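To make the fragmentation point concrete, here is a minimal sketch of sliding compaction over an array that stands in for the heap. It is a simplification under stated assumptions: a `null` slot represents freed memory, and the `compact` method is a hypothetical helper. Real collectors move objects of different sizes and must also update every reference that points to them.

```java
import java.util.Arrays;

// Toy model: the array plays the role of the heap, null marks a freed slot.
final class CompactionDemo {

    // Slides live entries to the front of the array, preserving their order.
    // Returns the index of the first free slot after compaction.
    static int compact(String[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String obj = heap[i];
                heap[i] = null;     // vacate the old location
                heap[free++] = obj; // place the object in the next free slot
            }
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", null, "b", null, null, "c"};
        int firstFree = compact(heap);
        System.out.println(Arrays.toString(heap) + " firstFree=" + firstFree);
        // prints: [a, b, c, null, null, null] firstFree=3
    }
}
```

After compaction the free space forms one contiguous block, so a new allocation only needs to advance the `firstFree` position instead of searching for a gap that is large enough.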

Java developers have a set of tools to monitor the execution and performance of a program and to gather statistics. Some examples are jstat, which ships with the JDK, Java Mission Control, and third-party applications like VisualVM and JProfiler. It is worth studying these and other tools carefully: they show how often collections run and how long they take, which helps you diagnose memory and latency problems.
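Besides external tools, some of the same statistics can be read from inside a running program through the standard `java.lang.management` API. The sketch below lists the collectors active in the current JVM along with their collection counts and accumulated collection time. The class name `GcStats` and the method `describeCollectors` are made up for this example, and the collector names in the output depend on which GC the JVM was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// GcStats and describeCollectors are hypothetical names; the MXBean calls are standard JDK API.
final class GcStats {

    // Builds one summary line per garbage collector registered in this JVM.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() return -1 if the value is undefined.
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Printing these values periodically gives a rough picture of GC activity. For detailed analysis, the dedicated tools mentioned above are more convenient.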

Conclusion

Although Java is said to carry out garbage collection automatically, developers still have a say in how it happens. First of all, the Java platform provides several GCs, each with its own way of managing memory, and developers can choose the one that suits their needs best. Besides that, there are options for configuring and controlling how the chosen GC manages memory.
