In this topic, you'll learn about Parallel GC, the HotSpot VM garbage collector designed as an improvement on the Serial GC to achieve higher throughput and shorter STW (stop-the-world) pauses. You can activate it with the -XX:+UseParallelGC flag. You will learn about its heap structure, how it differs from the Serial garbage collector, and the flags you can use to configure it.

Heap structure

The Parallel GC, also known as the throughput collector, is a fully STW collector designed for medium to large-size applications. By default, the initial heap size is 1/64th of the physical memory, while the default maximum heap size is 1/4th of it. It uses the same heap structure as Serial GC, where the young generation takes up to 1/3rd of the entire heap by default. Of course, you can adjust these values if the defaults don't suit you. You'll see how to use the corresponding parameters later in the topic.
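To get a feeling for these defaults, here is a minimal sketch of the arithmetic. The class name, the method names, and the 16 GiB machine are hypothetical and only serve the illustration. A real JVM also applies platform-specific limits, so treat the numbers as an approximation.

```java
// Illustrative arithmetic only; the class and method names are hypothetical.
class DefaultHeapSizes {
    // Initial heap: 1/64 of the physical memory.
    static long initialHeap(long physicalBytes) {
        return physicalBytes / 64;
    }

    // Maximum heap: 1/4 of the physical memory.
    static long maxHeap(long physicalBytes) {
        return physicalBytes / 4;
    }

    // Young generation: up to 1/3 of the heap.
    static long maxYoung(long heapBytes) {
        return heapBytes / 3;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long physical = 16L * 1024 * mib; // assumed 16 GiB machine

        System.out.println("initial heap: " + initialHeap(physical) / mib + " MiB");
        System.out.println("max heap:     " + maxHeap(physical) / mib + " MiB");
        System.out.println("max young:    " + maxYoung(maxHeap(physical)) / mib + " MiB");

        // The maximum heap the running JVM actually reports.
        System.out.println("this JVM:     " + Runtime.getRuntime().maxMemory() / mib + " MiB");
    }
}
```

The last line prints the value reported by the JVM you run the example on, so you can compare it with the estimate for your own machine.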

heap

Parallel GC also uses virtual spaces, but they are left out of the diagrams here to keep things simple.

This garbage collector takes a similar approach to the Serial GC: it removes dead objects and copies surviving ones from the young generation to the old one.

heap before garbage collection

heap after garbage collection

The difference is that the Parallel GC can use two or more threads to perform the collection, while the Serial GC uses just one. Let's see how this collector uses parallel threads to reduce garbage collection time.

GC with parallel threads

The main improvement in this collector is the ability to perform garbage collection with several threads in parallel. This makes it faster in many cases on machines with two or more CPU cores. On the other hand, it can perform worse on a single-CPU machine because of the overhead of parallel execution. Below, you can see an example of garbage collection with the Serial GC:

an example of garbage collection with the Serial GC

This is what it looks like in the case of Parallel GC:

an example of garbage collection with the parallel GC

By the way, before Java 7 update 4, Parallel GC had two modes. In the default mode, parallel threads were used only for the young generation:

garbage collection with parallel threads only in a young generation by default

The option that enables parallel garbage collection for the old generation as well was introduced in J2SE 5.0 update 6. It was activated using the -XX:+UseParallelOldGC flag. Starting with Java 7 update 4, this option is enabled by default.

Controlling the number of threads

This collector uses a special formula to choose the number of threads. If the machine has more than 8 hardware threads, the number of GC threads is a fixed fraction of them: approximately 5/8th for large values, which can drop to 5/16th on some platforms. For machines with 8 or fewer hardware threads, Parallel GC uses as many GC threads as there are hardware threads. You can also specify the desired number of threads with the -XX:ParallelGCThreads=n flag.
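The rule can be sketched in a few lines of Java. The sketch assumes the commonly cited HotSpot formula, 8 + (N - 8) * 5 / 8 for N hardware threads above 8, and ignores the platforms that use the 5/16 fraction. The class and method names are hypothetical. Under this formula, a machine with exactly 8 hardware threads still gets 8 GC threads, because the 5/8 fraction applies only to the threads beyond the eighth.

```java
// A sketch of the default GC thread count, assuming the 5/8 formula.
class ParallelGcThreadEstimate {
    static int defaultGcThreads(int hardwareThreads) {
        if (hardwareThreads <= 8) {
            // Small machines: one GC thread per hardware thread.
            return hardwareThreads;
        }
        // Larger machines: 8 threads plus 5/8 of the remaining ones.
        return 8 + (hardwareThreads - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 4, 8, 16, 32, 64}) {
            System.out.println(n + " hardware threads -> about "
                    + defaultGcThreads(n) + " GC threads");
        }

        // The same estimate for the machine running this example.
        int here = Runtime.getRuntime().availableProcessors();
        System.out.println("this machine (" + here + ") -> about "
                + defaultGcThreads(here) + " GC threads");
    }
}
```

If the estimate doesn't suit your workload, -XX:ParallelGCThreads=n overrides it.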

However, using more threads has a drawback too. Each thread performing garbage collection in the young generation reserves a section of the tenured generation to copy objects into.

reserving a section in a tenured generation to copy objects

This leads to fragmentation of the tenured space, and the more threads you use, the higher the chance of running into fragmentation issues.

Automatic tuning of metrics

Now that you understand the internals of Parallel GC, it's time to learn about the flags you can use to configure its memory management. You have probably used some of them before when working with Serial GC or even newer collectors. Among such flags are:

  • -XX:MinHeapFreeRatio=n: sets the minimum percentage of free space the heap must have after garbage collection.

  • -XX:MaxHeapFreeRatio=n: sets the maximum percentage of free space the heap may have after garbage collection.

  • -XX:NewSize=n: sets the minimum size of the young generation.

  • -XX:MaxNewSize=n: sets the maximum size of the young generation.

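  • -XX:NewRatio=n: sets the ratio between the old and the young generation. For example, with n = 2 the old generation is twice as large as the young one, so the young generation takes 1/3rd of the heap.

  • -XX:SurvivorRatio=n: sets the ratio between Eden and one Survivor space. For example, with n = 8 each Survivor space is 1/8th the size of Eden.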
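The two ratio flags are easier to read with numbers. The sketch below assumes that NewRatio=n means the old generation is n times the size of the young one, and that SurvivorRatio=n means Eden is n times the size of one Survivor space. The class name, method names, and heap size are hypothetical, and the collector may still resize these areas at run time.

```java
// Illustrative split of a heap by NewRatio and SurvivorRatio.
class GenerationSizes {
    // young : old = 1 : newRatio, so young is heap / (newRatio + 1).
    static long young(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    // The young generation holds Eden plus two Survivor spaces,
    // and Eden is survivorRatio times one Survivor space.
    static long survivor(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    static long eden(long young, int survivorRatio) {
        return young - 2 * survivor(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMiB = 1200;    // assumed heap size in MiB
        int newRatio = 2;       // as in -XX:NewRatio=2
        int survivorRatio = 8;  // as in -XX:SurvivorRatio=8

        long youngMiB = young(heapMiB, newRatio);
        System.out.println("young:    " + youngMiB + " MiB");
        System.out.println("old:      " + (heapMiB - youngMiB) + " MiB");
        System.out.println("eden:     " + eden(youngMiB, survivorRatio) + " MiB");
        System.out.println("survivor: " + survivor(youngMiB, survivorRatio) + " MiB each");
    }
}
```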

These parameters set fixed values for specific heap segments. Another approach is to specify the desired behavior of the collector and let it adjust the above-mentioned or other parameters to achieve that behavior. These behavior parameters (goals) are:

  • Maximum Garbage Collection Pause Time. This goal is the longest stop-the-world pause you are willing to accept, and you set it with the -XX:MaxGCPauseMillis=n flag. By default, there is no pause time goal. If you define one, the collector adjusts the sizes of the memory segments in an attempt to keep pauses below the specified limit.

  • Throughput. This goal defines the ratio between the time spent running the application and the time spent on garbage collection, which is 99/1 by default. You can change it with the -XX:GCTimeRatio=n flag, which sets the target share of total time spent on garbage collection to 1 / (1 + n). For example, according to the official documentation, -XX:GCTimeRatio=19 sets a goal of 1/20th, or 1 / (1 + 19) * 100% = 5%, of the total time for garbage collection. Remember that if garbage collection takes more than 98% of the total time and recovers less than 2% of the heap, you will get an OutOfMemoryError.

  • Footprint. This goal concerns the heap size. You set the maximum with the -Xmx<n> flag, and the collector shrinks the heap when the other goals are met.

These three goals have a priority order, where the maximum GC pause time has the highest priority. Once the collector meets this goal, it works on throughput, and only after that does it try to reduce the footprint.
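The throughput arithmetic can be checked with a short sketch. It only evaluates the 1 / (1 + n) formula for a few values of -XX:GCTimeRatio. The class and method names are hypothetical.

```java
// Converts a GCTimeRatio value into the share of time allowed for GC.
class GcTimeRatioGoal {
    // Target share of total time spent in garbage collection: 1 / (1 + n).
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int n : new int[] {99, 19, 9, 4}) {
            System.out.printf("-XX:GCTimeRatio=%d -> %.1f%% of total time for GC%n",
                    n, 100 * gcTimeFraction(n));
        }
    }
}
```

The default value of 99 corresponds to 1% of the total time, which matches the 99/1 ratio.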

Exploring GC logs

There is one last thing for you to know before you finish this topic. Suppose you need to take a look at the logs of this collector. For that, let's use this code:

    public class Main {
        public static void main(String[] args) {
            for (int i = 1; i < 50_000_000; i++) {
                // Allocate about 8 MB per iteration and fill it with data
                long[] arr = new long[1_000_000];

                for (int j = 0; j < arr.length; j++) {
                    arr[j] = j;
                }

                // Drop the only reference and request a collection
                arr = null;
                System.gc();
            }
        }
    }

Now, let's run the compiled class with the following command:

java -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:MaxGCPauseMillis=150 -Xlog:gc* Main

Below is an excerpt from the log that covers two garbage collections: a young one and a full one.

[118.083s][info][gc,start       ] GC(10880) Pause Young (System.gc())
[118.084s][info][gc,heap        ] GC(10880) PSYoungGen: 11140K(38400K)->128K(38400K) Eden: 11140K(33280K)->0K(33280K) From: 0K(5120K)->128K(5120K)
[118.084s][info][gc,heap        ] GC(10880) ParOldGen: 3625K(87552K)->3625K(87552K)
[118.084s][info][gc,metaspace   ] GC(10880) Metaspace: 8027K(8256K)->8027K(8256K) NonClass: 7170K(7296K)->7170K(7296K) Class: 856K(960K)->856K(960K)
[118.084s][info][gc             ] GC(10880) Pause Young (System.gc()) 14M->3M(123M) 1.382ms
[118.085s][info][gc,cpu         ] GC(10880) User=0.00s Sys=0.00s Real=0.00s
[118.085s][info][gc,start       ] GC(10881) Pause Full (System.gc())
[118.085s][info][gc,phases,start] GC(10881) Marking Phase
[118.092s][info][gc,phases      ] GC(10881) Marking Phase 6.410ms
[118.092s][info][gc,phases,start] GC(10881) Summary Phase
[118.092s][info][gc,phases      ] GC(10881) Summary Phase 0.260ms
[118.093s][info][gc,phases,start] GC(10881) Adjust Roots
[118.096s][info][gc,phases      ] GC(10881) Adjust Roots 3.213ms
[118.096s][info][gc,phases,start] GC(10881) Compaction Phase
[118.101s][info][gc,phases      ] GC(10881) Compaction Phase 4.602ms
[118.101s][info][gc,phases,start] GC(10881) Post Compact
[118.102s][info][gc,phases      ] GC(10881) Post Compact 0.578ms
[118.102s][info][gc,heap        ] GC(10881) PSYoungGen: 128K(38400K)->0K(38400K) Eden: 0K(33280K)->0K(33280K) From: 128K(5120K)->0K(5120K)
[118.103s][info][gc,heap        ] GC(10881) ParOldGen: 3625K(87552K)->3624K(87552K)
[118.103s][info][gc,metaspace   ] GC(10881) Metaspace: 8027K(8256K)->8027K(8256K) NonClass: 7170K(7296K)->7170K(7296K) Class: 856K(960K)->856K(960K)
[118.103s][info][gc             ] GC(10881) Pause Full (System.gc()) 3M->3M(123M) 18.015ms
[118.103s][info][gc,cpu         ] GC(10881) User=0.05s Sys=0.00s Real=0.02s

You've likely met such a GC log if you explored logs when learning about the Serial GC. Each line carries tags that describe what the message is about, followed by the main information on the corresponding phase.
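If you want to process such logs in code, the summary lines are regular enough for a small parser. The sketch below extracts the pause duration from a line tagged with gc only. The regular expression is an assumption based on the excerpt above and not an official log grammar, and the class and method names are hypothetical.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts the pause time from summary lines of the log excerpt.
class GcPauseParser {
    // Groups: 1 = GC id, 2 = kind, 3 = cause, 4-6 = heap before/after/total, 7 = ms.
    private static final Pattern SUMMARY = Pattern.compile(
            "GC\\((\\d+)\\) Pause (Young|Full) \\((.*)\\) "
            + "(\\d+)M->(\\d+)M\\((\\d+)M\\) ([\\d.]+)ms");

    // Returns the pause in milliseconds, or empty if the line is not a summary.
    static OptionalDouble pauseMillis(String line) {
        Matcher m = SUMMARY.matcher(line);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(7)))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        String[] lines = {
            "[118.083s][info][gc,start       ] GC(10880) Pause Young (System.gc())",
            "[118.084s][info][gc             ] GC(10880) Pause Young (System.gc()) 14M->3M(123M) 1.382ms",
            "[118.103s][info][gc             ] GC(10881) Pause Full (System.gc()) 3M->3M(123M) 18.015ms"
        };
        for (String line : lines) {
            // Lines without heap sizes and a duration are skipped.
            pauseMillis(line).ifPresent(ms -> System.out.println("pause: " + ms + " ms"));
        }
    }
}
```

The start line is ignored because it carries no heap sizes or duration, so only the two summary lines produce output.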

Conclusion

You've covered one more topic concerning garbage collection! You learned about Parallel GC, its heap structure, and how it uses parallel threads to perform garbage collection, unlike the single-threaded Serial GC. Another important takeaway is understanding different memory management metrics and how to configure them. However, don't forget that garbage collection tuning requires serious investigation. Otherwise, it can cause unexpected behavior in your application.
