the generic dev

just another blog

An Attempt to Explain Memory Management in Node.js

15 Nov 2016

In this article, I will try to explain how Node.js garbage collection works, what happens in the background when you write code and how memory is freed up for you.

Please note that this article assumes that V8 (the default JavaScript engine in Node.js) is being used.

Memory Management in Node.js Applications

Every application needs memory to work properly. Memory management provides ways to dynamically allocate memory chunks for programs when they request it, and free them when they are no longer needed — so that they can be reused.

Application-level memory management can be manual or automatic. Automatic memory management usually involves a garbage collector.

The following code snippet shows how memory can be allocated in C, using manual memory management:

manual memory management

In manual memory management, it is the responsibility of the developer to free up the unused memory portions. Managing our memory this way can introduce several major bugs to our applications:

  • Memory leaks occur when the used memory space is never freed up.
  • Wild/dangling pointers appear when an object is deleted, but the pointer is reused. Serious security issues can be introduced when other data structures are overwritten or sensitive information is read.

Luckily for us, Node.js comes with a garbage collector, and we don’t need to manually manage memory allocation.

Garbage Collector to the rescue

At first sight, garbage collection should be doing what the name suggests: finding and throwing away the garbage. In reality it does exactly the opposite. Garbage collection tracks down all the objects that are still in use and treats the rest as garbage. Bearing this in mind, let's dig into how this process of automated memory reclamation, called ‘Garbage Collection’, is implemented for Node.js.

The GC knows that an object is no longer in use when it can no longer be reached from the root by following references.
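To make the idea of reachability more concrete, here is a toy sketch of the "find what is still reachable" step. It is only an illustration and not how V8 is actually implemented: the graph, the node names and the `findReachable` helper are all made up for this example.

```javascript
// Toy object graph: each entry lists the nodes it references.
// The node names and the `root` entry are hypothetical.
const graph = {
  root: ['a', 'b'],
  a: ['c'],
  b: [],
  c: [],
  d: ['e'], // d and e reference each other, but nothing reachable points to them
  e: ['d']
}

// Walk the graph from the root and collect everything that can be reached.
function findReachable (graph, rootName) {
  const reachable = new Set()
  const pending = [rootName]
  while (pending.length > 0) {
    const name = pending.pop()
    if (reachable.has(name)) continue
    reachable.add(name)
    pending.push(...graph[name])
  }
  return reachable
}

const live = findReachable(graph, 'root')
// Everything that was not reached is garbage.
const garbage = Object.keys(graph).filter(name => !live.has(name))
console.log(garbage) // [ 'd', 'e' ]
```

Note that `d` and `e` still reference each other, yet they count as garbage because no path leads to them from the root.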

Memory before the garbage collection

The following diagram shows what the memory can look like if you have objects with references to each other, and some objects that are not referenced by anything reachable. These are the objects that can be collected by a garbage collector run.

objects in red are unreachable and hence can be collected

Memory after the garbage collection

only live objects stay after GC

Once the garbage collector has run, the unreachable objects (marked in red in the first diagram) are deleted, and the memory space is freed up.

Using a garbage collector has several advantages:

  • it prevents wild/dangling pointers bugs,
  • it won’t try to free up space that was already freed up,
  • it will protect you from some types of memory leaks.

Of course, using a garbage collector doesn’t solve all of your problems, and it’s not a silver bullet for memory management. Let’s take a look at things that you should keep in mind!

Things to Keep in Mind When Using a Garbage Collector

  • performance impact: in order to decide what can be freed up, the GC consumes computing power
  • unpredictable stalls: a collection can pause the application, although modern GC implementations try to avoid long “stop-the-world” collections

Node.js Garbage Collection & Memory Management in Practice

The easiest way of learning is by doing — so I am going to show you what happens in the memory with different code snippets.

The Stack

The stack contains local variables and pointers to objects on the heap or pointers defining the control flow of the application.
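The snippet for this section is missing from this copy of the post, so here is a minimal sketch of what such an example could look like. The function name `add` and the values passed in are my assumptions; only the parameter names `a` and `b` come from the text.

```javascript
// Hypothetical example: `add` is an assumed name.
function add (a, b) {
  // a and b are local, primitive values, so in this simplified
  // model they live on the stack for the duration of the call
  return a + b
}

const sum = add(4, 5)
console.log(sum) // 9
```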

In the above example, both a and b will be placed on the stack.

The Heap

The heap is dedicated to storing reference types, like strings or objects.
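The snippet for this section is also missing, so the following is only a sketch of what it might have looked like. The `Car` constructor is mentioned in the text, but its `opts` parameter, the `name` property and the variable name `LightningMcQueen` are assumptions.

```javascript
// Hypothetical constructor: only the name `Car` comes from the text.
function Car (opts) {
  this.name = opts.name
}

// The variable holds a reference; the object it points to is allocated on the heap.
const LightningMcQueen = new Car({ name: 'Lightning McQueen' })
console.log(LightningMcQueen.name) // Lightning McQueen
```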

The Car object created in the above snippet is placed on the heap.

After this, the memory would look something like this:

Let’s add more cars, and see what our memory would look like!
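Adding more cars could look roughly like this. Again, this is a sketch: I am assuming the variable names `LightningMcQueen`, `SallyCarrera` and `Mater` from the labels lm, sc and m used in the diagram, and the `Car` constructor is the same hypothetical one as before.

```javascript
// Hypothetical constructor, repeated so the snippet runs on its own.
function Car (opts) {
  this.name = opts.name
}

// Three variables reachable from the root, each referencing its own heap object.
const LightningMcQueen = new Car({ name: 'Lightning McQueen' })
const SallyCarrera = new Car({ name: 'Sally Carrera' })
const Mater = new Car({ name: 'Mater' })

console.log([LightningMcQueen, SallyCarrera, Mater].map(car => car.name))
```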

Now, along with lm, the heap also contains sc and m

If the GC ran now, nothing could be freed up, as the root has a reference to every object.

Let’s make it a little bit more interesting, and add some parts to our cars!
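One way to give the cars some parts is sketched below. The `Engine` constructor and the `power` values are made up for illustration; the point is only that each car object now holds a reference to another heap object.

```javascript
// Hypothetical part: `Engine` and `power` are assumed names.
function Engine (power) {
  this.power = power
}

// Hypothetical constructor: each car references its own Engine object.
function Car (opts) {
  this.name = opts.name
  this.engine = new Engine(opts.power)
}

const Mater = new Car({ name: 'Mater', power: 140 })

// root -> Mater -> Mater.engine: the engine is reachable through the car
console.log(Mater.engine.power) // 140
```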

What would happen if we no longer use Mater, but redefine it and assign some other value, like Mater = undefined?
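In code, that reassignment could look like the sketch below. The `Car` constructor is the same hypothetical one used earlier; only `Mater = undefined` comes from the text. Note that nothing here forces a collection, it only removes the last reference.

```javascript
// Hypothetical constructor, repeated so the snippet runs on its own.
function Car (opts) {
  this.name = opts.name
}

// `let` is needed because the variable gets reassigned below.
let Mater = new Car({ name: 'Mater' })

// Drop the only reference to the original object.
Mater = undefined

console.log(Mater) // undefined
```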

As a result, the original Mater object cannot be reached from the root object, so on the next garbage collector run it will be freed up:

Now that we understand the basics of the garbage collector’s expected behaviour, let’s take a look at how it is implemented in V8!

Garbage Collection Methods

I will cover the internals of V8’s garbage collection methods in detail in a follow-up article, but here are the most important things we’ll need to know:

New Space and Old Space

The heap has two main segments, the New Space and the Old Space. The New Space is where new allocations happen; it is fast to collect garbage here, and it has a size of ~1–8 MB. Objects living in the New Space are called the Young Generation.

The Old Space is where objects that survived the collector in the New Space are promoted to; they are called the Old Generation. Allocation in the Old Space is fast, but collection is expensive, so it is performed infrequently.
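If you want to peek at these segments on your own machine, Node's built-in `v8` module exposes `v8.getHeapSpaceStatistics()`. The sketch below just lists whatever spaces your V8 version reports; the exact set of spaces and their sizes vary between Node versions, so treat the output as informational.

```javascript
const v8 = require('v8')

// List each heap space V8 reports, with its current size in megabytes.
const spaces = v8.getHeapSpaceStatistics().map(space => ({
  name: space.space_name,
  sizeMB: (space.space_size / 1024 / 1024).toFixed(2),
  usedMB: (space.space_used_size / 1024 / 1024).toFixed(2)
}))

console.table(spaces)
```

On the Node versions I am aware of, the list includes entries for the New Space and the Old Space alongside a few other spaces.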

Young Generation

Usually, ~20% of the Young Generation survives into the Old Generation. Collection in the Old Space only starts once the space is close to being exhausted. V8 uses two different collection algorithms for the two generations.

Scavenge and Mark-Sweep collection

Scavenge collection is fast and runs on the Young Generation, while the slower Mark-Sweep collection runs on the Old Generation.

More details in a follow up post :)

A Real-Life Example — The Meteor Case-Study

In 2013, the creators of Meteor announced their findings about a memory leak they ran into. The problematic code snippet was the following:
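The snippet itself is missing from this copy of the post. What follows is my reconstruction of the pattern, not a verbatim copy: `replaceThing` and `originalThing` are named in the quote below, while `theThing`, `unused`, `longStr` and `someMethod` are assumed names. In the reported case the function ran repeatedly on a timer; I use a short loop here so the sketch terminates.

```javascript
let theThing = null

function replaceThing () {
  const originalThing = theThing

  // Never called, but it closes over originalThing.
  const unused = function () {
    if (originalThing) console.log('hi')
  }

  theThing = {
    longStr: new Array(1000000).join('*'),
    // Does not use originalThing itself, yet it is created in the
    // same scope as `unused` and so shares its lexical environment.
    someMethod: function () {
      console.log('someMessage')
    }
  }
}

// Stand-in for the timer: replace the object a few times.
for (let i = 0; i < 5; i++) replaceThing()
console.log(theThing.longStr.length) // 999999
```

The reported problem is that each new `theThing` can keep the previous one alive through the shared environment, so the chain of old objects keeps growing.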

Well, the typical way that closures are implemented is that every function object has a link to a dictionary-style object representing its lexical scope. If both functions defined inside replaceThing actually used originalThing, it would be important that they both get the same object, even if originalThing gets assigned to over and over, so both functions share the same lexical environment. Now, Chrome’s V8 JavaScript engine is apparently smart enough to keep variables out of the lexical environment if they aren’t used by any closures - from the Meteor blog.