Garbage collector

In server-side JavaScript development, Node.js has emerged as a versatile tool. One of the key challenges in managing memory within a runtime environment like Node.js is dealing with dynamically allocated memory that is no longer in use. This is where the concept of garbage collection becomes crucial. This topic explores the fundamentals of garbage collection in Node.js, shedding light on how it works, its impact on application performance, and the strategies employed by the V8 engine to manage memory.

Introduction to memory management

Node.js employs Google's V8 engine, which is responsible for executing programs written in JavaScript. V8 compiles JavaScript code into native code and then executes it. Throughout execution, it oversees the allocation and deallocation of memory as required by our program. Consequently, any discussion of memory management in Node.js is largely a discussion of V8.

Efficient memory management is imperative: if a program keeps creating objects and never releases them, it eventually exhausts the allocated heap memory and crashes with an out-of-memory error. Memory leaks, where objects that are no longer needed remain reachable, are a common cause. Efficient memory management involves eliminating objects that won't be needed later in the program's execution, thus freeing up space in the heap memory.

Various programming languages adopt different strategies for memory management. Languages like C and C++ implement manual memory management, wherein programmers must specify when to allocate and free memory. This provides more control but makes mistakes easy. Conversely, Java and JavaScript (as run by Node.js) use automated memory management: allocation and deallocation are handled by the runtime, which employs a garbage collector (GC) to free up memory as needed, a concept we'll explore in detail.

Memory allocation scheme

In most running programs, stack and heap memory are utilized for managing execution. The V8 engine represents a running program through a space allocated in memory known as the Resident Set, akin to the Java Virtual Machine. This memory is segmented into three parts:

  • Code: Contains the actual code executed line by line.
  • Stack: Encompasses primitive type variables and pointers pointing to the heap.
  • Heap: Reserved space for storing reference types such as Objects, Strings, and Closures.

Heap memory in Node.js
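
To see these regions in a running process, Node.js exposes process.memoryUsage(), which reports the resident set size and the heap figures in bytes. The sketch below is only an illustration; the helper name memorySnapshot and the megabyte rounding are our own choices, not part of Node.js.

```javascript
// Report the memory regions of the current Node.js process in megabytes.
// memorySnapshot is a hypothetical helper name used for illustration.
function memorySnapshot() {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  const toMB = (bytes) => Number((bytes / 1024 / 1024).toFixed(2));
  return {
    residentSetMB: toMB(rss),     // whole process: code, stack, and heap
    heapTotalMB: toMB(heapTotal), // heap currently reserved by V8
    heapUsedMB: toMB(heapUsed),   // heap currently occupied by objects
  };
}

console.log(memorySnapshot());
```

The exact numbers vary between runs and Node.js versions, so they are best compared over time rather than read as absolute values.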

Basics of garbage collector

Garbage collection is the process of automatically identifying and reclaiming memory that is no longer in use by the program. In simpler terms, it helps in cleaning up memory occupied by objects that are no longer accessible or needed, preventing memory leaks and optimizing overall performance.

Node.js uses the V8 JavaScript engine, which incorporates a generational garbage collector. V8 divides the heap into two main regions, the young generation and the old generation, and collects each with a different algorithm, as we'll see later in this topic.

A Garbage Collector has several crucial tasks to perform periodically, including:

  • Identifying Live/Dead Objects (Marking)
  • Reusing memory occupied by dead objects (Sweeping)
  • Compacting unused/free memory (Compacting)

These tasks may run one after another or be interleaved as needed. The conventional approach pauses JavaScript execution to perform them on the main thread.

Before delving into Marking, Sweeping, and Compaction, let's first explore the different classes into which objects are categorized.

Generational layout

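The heap is divided into a young generation, itself split into nursery and intermediate sub-generations, and an old generation. Objects therefore fall into three categories: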

  • Nursery: The initial placement of objects after creation.
  • Intermediate: Objects transferred from nursery to intermediate after surviving one Garbage Collection cycle.
  • Old: Objects moved to this generation if they survive another GC cycle in the intermediate generation.

The generational hypothesis in GC posits that most objects die young, meaning most of them become unreachable shortly after creation and are reclaimed while still in the young generation. You can learn more about this on the V8 blog.
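
The generations can be inspected at runtime through the built-in v8 module. The sketch below lists the heap spaces V8 reports; the young generation appears as new_space and the old generation as old_space. The helper name heapSpaces and the kilobyte rounding are illustrative assumptions, and the full set of spaces differs between Node.js versions.

```javascript
const v8 = require('node:v8');

// List each V8 heap space with its used and reserved size in kilobytes.
// heapSpaces is a hypothetical helper name used for illustration.
function heapSpaces() {
  return v8.getHeapSpaceStatistics().map((space) => ({
    name: space.space_name,
    usedKB: Math.round(space.space_used_size / 1024),
    sizeKB: Math.round(space.space_size / 1024),
  }));
}

console.table(heapSpaces());
```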

Now that we understand the young generation (nursery and intermediate) and the old generation, let's delve into the algorithm and its application. The algorithm can be applied to both generations or just the young generation, leading to two types of garbage collection:

  • Minor GC (Scavenger Garbage Collection): Takes place in the young generation, removing most objects that die young, ensuring quick and efficient memory freeing.
  • Major GC (Full Mark-Compact Garbage Collection): Frees memory from the entire heap, involving marking, sweeping, and compaction.
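
Both kinds of collection can be observed from within a program through the perf_hooks module, which emits a performance entry for each GC run. The following is a minimal sketch, assuming Node.js 16 or later for entry.detail.kind (older versions expose entry.kind instead); the helper gcKindName and its labels are our own. Which collections actually occur during a run is up to V8, so the output is not deterministic.

```javascript
const { PerformanceObserver, constants } = require('node:perf_hooks');

// Translate the numeric GC kind reported by perf_hooks into a label.
// gcKindName is a hypothetical helper name used for illustration.
function gcKindName(kind) {
  switch (kind) {
    case constants.NODE_PERFORMANCE_GC_MINOR: return 'minor (Scavenger)';
    case constants.NODE_PERFORMANCE_GC_MAJOR: return 'major (Mark-Compact)';
    case constants.NODE_PERFORMANCE_GC_INCREMENTAL: return 'incremental marking';
    case constants.NODE_PERFORMANCE_GC_WEAKCB: return 'weak callbacks';
    default: return 'unknown';
  }
}

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    // Node.js 16+ reports the kind under entry.detail; older versions use entry.kind.
    const kind = entry.detail ? entry.detail.kind : entry.kind;
    console.log(`${gcKindName(kind)} GC took ${entry.duration.toFixed(2)} ms`);
  }
});
observer.observe({ entryTypes: ['gc'] });

// Allocate many short-lived objects to give the collector some work.
let last = null;
for (let i = 0; i < 1e6; i++) {
  last = { id: i, payload: new Array(10).fill(i) };
}

// Stop observing shortly afterwards so the process can exit cleanly.
setTimeout(() => observer.disconnect(), 100);
```

On a typical run, most of the reported entries are minor collections, which matches the generational hypothesis.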

Types of garbage collection

Major Garbage Collector (Full Mark-Compact)

This collector retrieves garbage from the entire heap memory and consists of three stages:

  1. Marking: The crucial first step is identifying which objects are still alive, so that everything else can be collected. This is achieved by starting with a set of known pointers, the root set, which includes objects from the global execution context. The collector recursively follows these pointers, marking every object it encounters as reachable, until there are no more pointers to follow. Objects left unmarked are considered dead.
  2. Sweeping: In this process, the space left by dead objects is added to a data structure maintained by the GC, known as the free list. After marking is completed, the contiguous gaps left by unreachable objects are added to the appropriate free list, organized by the size of the memory block for quick lookup. When memory is later allocated for new objects, the free list helps find a suitable chunk.
  3. Compaction: Live objects on heavily fragmented pages are copied to other pages, so that the space they leave behind can be reused as a single large chunk of memory. Because copying live objects is expensive, this step is applied only to highly fragmented pages and is skipped otherwise; sweeping alone is usually cheaper.

Major GC is used when a larger amount of space needs to be freed, but it is less frequently used due to the generational hypothesis suggesting that most objects die young.
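
The marking and sweeping stages can be illustrated with a toy model. This is not V8's implementation: the heap here is a plain Map, references are string ids, and the names createHeap, mark, and sweep are hypothetical. Compaction is left out for brevity.

```javascript
// Toy heap: every object has an id and a list of ids it references.
function createHeap(objects) {
  return new Map(objects.map(({ id, refs }) => [id, { id, refs, marked: false }]));
}

// Marking: follow references from the roots and flag everything reachable.
function mark(heap, roots) {
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = heap.get(pending.pop());
    if (!obj || obj.marked) continue; // skip unknown ids and already visited objects
    obj.marked = true;
    pending.push(...obj.refs);
  }
}

// Sweeping: unmarked objects are dead; their slots go to the free list.
function sweep(heap) {
  const freeList = [];
  for (const [id, obj] of heap) {
    if (obj.marked) {
      obj.marked = false; // reset for the next cycle
    } else {
      heap.delete(id);
      freeList.push(id);
    }
  }
  return freeList;
}

const heap = createHeap([
  { id: 'a', refs: ['b'] },
  { id: 'b', refs: [] },
  { id: 'c', refs: ['d'] }, // c and d reference each other, but no root reaches them
  { id: 'd', refs: ['c'] },
]);
mark(heap, ['a']);
const freed = sweep(heap);
console.log(freed); // [ 'c', 'd' ]
```

Note that c and d are reclaimed even though they point at each other: what matters is reachability from the roots, not whether an object is referenced at all.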

Minor Garbage Collector (Scavenger)

This collector focuses on collecting garbage from the young generation. In Scavenger, each living object is migrated to a new page using a semi-space approach. This method divides the space, leaving half of it empty (to-space) where objects will be copied, and the source space is known as from-space. The Garbage Collection process involves three steps:

  1. Marking: Similar to the marking step in Major GC, where objects are marked using an initial set of pointers (root set) and additional roots known as old-to-new references. These references are pointers in the old space that refer to objects in the young generation. Using these references along with global and local variables, every reference into the young generation can be found without tracing through the entire old generation.
  2. Evacuating: Surviving objects move from their pages in from-space to contiguous memory in to-space, which eliminates fragmentation. After all surviving objects have been transferred, one GC cycle is complete and the two spaces swap roles: to-space becomes from-space and vice versa. Objects that survive a second cycle are evacuated to the old generation instead of to-space, which keeps the young generation from filling up.
  3. Pointer updating: The final step updates pointers that still reference the old locations of evacuated objects. Each evacuated object leaves a forwarding address in its original location, pointing to the new copy. That forwarding address is then used to redirect the original pointers to the new address.
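
The semi-space idea can be sketched with two arrays standing in for from-space and to-space. This toy version is an assumption-laden simplification, not V8's Scavenger: objects refer to each other by array index, survivors are discovered and copied in a single pass, promotion to the old generation is ignored, and the names scavenge, evacuate, and forwardedTo are hypothetical.

```javascript
// Toy scavenge: objects live in an array (from-space) and refer to each
// other by index. Survivors are copied into a fresh array (to-space).
function scavenge(fromSpace, roots) {
  const toSpace = [];

  // Evacuate one object, leaving a forwarding address in its old slot.
  function evacuate(index) {
    const old = fromSpace[index];
    if (old.forwardedTo === undefined) {
      old.forwardedTo = toSpace.length;
      toSpace.push({ value: old.value, refs: [...old.refs] });
    }
    return old.forwardedTo;
  }

  // Roots are redirected first, then the copied objects are scanned in order.
  const newRoots = roots.map((index) => evacuate(index));
  for (let scan = 0; scan < toSpace.length; scan++) {
    // Pointer updating: each ref still holds a from-space index at this point.
    toSpace[scan].refs = toSpace[scan].refs.map((index) => evacuate(index));
  }
  return { toSpace, roots: newRoots };
}

const fromSpace = [
  { value: 'garbage', refs: [] },
  { value: 'config', refs: [3] },
  { value: 'temp', refs: [1] },   // unreachable, even though it points at a live object
  { value: 'logger', refs: [1] }, // forms a cycle with config
];
const result = scavenge(fromSpace, [1]);
console.log(result.toSpace);
```

After the call, to-space holds only config and logger, packed next to each other, and their references point at the new indices rather than the old ones.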

Usage

One practical consideration involves managing object lifetimes effectively. Developers should be mindful of when and how objects are created and become unreachable, as this knowledge can minimize the impact of garbage collection on application responsiveness. In scenarios where frequent and unnecessary object creation could lead to increased memory churn, developers can adopt practices that mitigate these effects.

Optimizations and best practices tailored to the garbage collection process can further enhance application performance. Strategic techniques, such as reducing the frequency of garbage collection cycles and optimizing memory usage, contribute to a more responsive application. Developers can choose data structures and design patterns that align with the garbage collector's workings.
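
One way to keep object lifetimes under control is to cap how much a long-lived data structure may retain. The sketch below shows a hypothetical BoundedCache class (the name and the eviction policy are our own assumptions): once the limit is reached, the oldest entry is dropped, so its value becomes unreachable and can be reclaimed by the garbage collector.

```javascript
// A size-capped cache: once the limit is hit, the oldest entry is dropped
// so the garbage collector can reclaim it.
class BoundedCache {
  constructor(limit) {
    this.limit = limit;
    this.entries = new Map(); // a Map iterates in insertion order
  }

  set(key, value) {
    // Re-inserting an existing key moves it to the newest position.
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.limit) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  get(key) {
    return this.entries.get(key);
  }
}

const cache = new BoundedCache(2);
cache.set('a', { big: new Array(1000).fill(0) });
cache.set('b', { big: new Array(1000).fill(0) });
cache.set('c', { big: new Array(1000).fill(0) });
console.log(cache.get('a'));     // undefined: 'a' was evicted and is now collectable
console.log(cache.entries.size); // 2
```

An unbounded Map used as a cache is a common source of memory leaks in long-running servers, because every entry stays reachable for as long as the cache itself does.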

In essence, practical considerations for garbage collection in Node.js involve a holistic approach to memory management. While developers do not need to directly interact with the garbage collector, aligning code practices with garbage collection characteristics allows them to create more responsive and efficient Node.js applications.

Conclusion

Memory is a crucial consideration during program execution, given its limited availability. Garbage collectors play a vital role in making memory available and preventing heap exhaustion. V8 uses two of them: the minor GC, also known as the Scavenger, which operates in the young generation, and the major GC, also known as Full Mark-Compact, which clears unused memory from both generations. Although garbage collectors run automatically, knowing how they work helps us write code that cooperates with them and appreciate the engineering behind them.
