Optimizing Jupiter garbage collector

The first version of the Jupiter’s garbage collector was very simple. When a new object needed to be allocated, I just called the new operator, and when it was time to be collected delete. The gc phase triggered when a fixed amount of objects were allocated. As you can imagine, this have a terrible performance.

Easy optimizations

First I added a pool of objects that is preallocated at the start of the virtual machine with a relatively large number (at the moment 16000). This way most of the time I don’t need to ask for new memory, I just get it from the pool. Also short lived objects are returned quickly to the pool.

This alone increased the performance by a ~130% in some cases where a lot of garbage is generated.

Also, I had a std::list to keep track of the allocated objects, but std::list has bery bad performance. Initialy I used this container because in the sweep phase I needed to remove objects from this list, and removing an element in the middle of a std::vector is very non-performant. I solved this using two std::vectors alternatively, a from vector and a to vector. When an object is created is pushed to from. In the sweep phase I iterate the from vector, if the objects is marked ( it survives ) is pushed to the to vector, else the object is released into the pool. When the sweep phase finish, I clear the from vector and swap the from and to (now new objects are pushed to the to vector). In the next collection cycle the roles of from and to are inverted.

This increased the performance by another ~130%

Generational GC

The mark phase is the most critical (and slow) phase of the collection. To keep it simple for the moment I just used two generations: young and tenured that are signaled by a Boolean in all objects. When an object survives a collection cycle is marked as tenured.

Having objects separated in two groups (generations) allows me to divide the mark phase in two: A minor collection and a mayor collection. In the minor collection I just mark & sweep young objects, while in the mayor collection I mark & sweep all. Right now I do a mayor collection after 10 minor collections are executed.

This the increased the performance a bit ( between 50% - 70% ). It works specially well when big arrays are used. The array usually survives the first collection cycle, so in the next 10 collection cycles the array and all of its elements are skip.

You can check more about advantages of generational GC’s here: Generational GC in Wikipedia

Future work

Right now the memory for the objects is very fragmented, although I suspect the standard allocation library does a good job allocating a lot of similar objects (probably with memory pools), I’d like to experiment with copying GC strategies and allocating objects from a big continuous chunk of memory.

Thanks for reading!