
How to write friendlier code for the Garbage Collector and gain a performance boost


By Cohen Shwartz Oren

Learn how to create objects in a way that diminishes the GC performance cost.

Introduction

Memory management is still an important issue, even now, in the world of
managed programming languages where the .NET Framework encapsulates it
entirely. In my humble opinion, it is mandatory for developers to understand the
memory management process in a managed environment. The old school of
developers knew much more about memory management than the novices do,
mainly because they had to. The .NET Framework manages memory for us. It is
great! It saves us the headaches caused by problems like leaks and memory
overwrites. The problem starts when a developer gets lazy and forgets about the
issue altogether. Some developers mistakenly think that they cannot influence the
.NET Framework's memory management process, but the truth is that the current
.NET Framework still needs the developer's help to perform better. The memory
management process in .NET is quite transparent and well encapsulated. It almost
looks like there is nothing the developer can do to harm or improve it, but in fact
it performs well only if the developer knows how to use it correctly.

Disclaimer

I am not an official expert on the Garbage Collection mechanism. The material I
present here comes from books, articles, and field experience. Please also note
that nobody outside the CLR group at Microsoft knows exactly how the GC really
works. Even the papers officially released by Microsoft hide some of the
information, mainly for copyright reasons.

Why write this article?

I thought it would be nice to share with the CodeProject community some
knowledge about working correctly with memory allocation in managed code. At
the very least, I hope the article will trigger the reader's interest in this important
topic. The article's purpose is to shed some light on managed memory
management. The reader will find tips on how to create and use memory better,
explanations of the main differences between managed and unmanaged memory
management, and finally, a glance at the future of memory management in the
.NET Framework.

Who should read this article?

Anyone who writes managed code in any of the .NET languages and is keen on
writing better performing code. I expect the reader to have some background
knowledge about the GC.

What you will not find in the article


There are many good articles about the .NET GC that describe in detail the
algorithms and mechanisms of managed memory management (I have added
some good links at the end of the article). This article focuses only on the
performance cost of allocation and how to diminish it.

Garbage Collector assumptions

It is important to read the following section before you continue, because the rest
of the article leans on this information. Designing a memory management
mechanism requires a set of assumptions about memory usage. These
assumptions eventually translate into a set of rules.

Here are the published rules that the .NET GC complies with:

• Objects are allocated contiguously.
• The heap is divided into several parts called generations.
• Referenced (i.e., still used) objects are moved from a lower generation to a
higher one during a collection.
• Objects in the lowest generation are the youngest.
• Recently created objects tend to have a short life.
• Objects in the highest generation are the oldest and are also known as the
survivors.
• The older an object gets, the more likely it is to still be needed, and it is
assumed to live longer.
• The age of the objects within each generation is roughly the same.
• There are no gaps between objects (due to compacting).
• The order of the objects in memory corresponds to the order of their
creation.
• The garbage collection engine determines the best time to perform a
collection. The GC runs in response to allocation, and it starts collecting
only if there is not enough space.
• A given generation is collected only when the relevant heap portion does
not have enough free space.
• The GC moves on to collect the next generation only if there is still a lack
of memory after collecting the previous one.
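
The generational behavior described above can be observed directly from code. Here is a minimal C# sketch (the class and variable names are mine, for illustration only) that uses GC.MaxGeneration, GC.GetGeneration, and GC.CollectionCount to inspect the generations at run time:

using System;

class GenerationsProbe
{
    static void Main()
    {
        // The desktop CLR currently uses three generations: 0, 1, and 2.
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);

        // A freshly allocated object starts in the lowest (youngest) generation.
        object o = new object();
        Console.WriteLine("New object is in generation: " + GC.GetGeneration(o));

        // Lower generations are collected far more often than higher ones.
        Console.WriteLine("Gen 0 collections so far: " + GC.CollectionCount(0));
        Console.WriteLine("Gen 2 collections so far: " + GC.CollectionCount(2));
    }
}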

The main differences between managed and unmanaged code in terms of memory management and performance

This section is important for the experienced developer who is used to developing
in unmanaged languages like C/C++ or, god forbid, Assembly, because the rules
of thumb for creating objects on the heap in managed and unmanaged
environments are simply inverted. In an unmanaged environment, the cost of an
allocation is negligible as long as free memory is available (with little or no
fragmentation). When the memory is fragmented, a search for free space is
required, and that is very costly.

In an unmanaged environment, there is no direct connection between the amount
of memory you are trying to allocate and the state of the memory. Therefore,
when the memory is heavily fragmented, allocating 1 KB or 1 MB may take
approximately the same time. In a managed environment, the cost depends on
the size of the allocation. In the normal case, where there is enough available
memory, the managed memory manager allocates space sequentially, so there is
no need to spend time searching for free space. The performance cost comes
when you run out of memory. Then the GC is activated and starts clearing,
compacting, and restructuring the memory, which is extremely costly.
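
To get a feel for how cheap sequential managed allocation is, here is a rough sketch (the loop count and object size are arbitrary illustration values) that allocates many small, short lived objects and reports how long it took and how many generation 0 collections were triggered along the way:

using System;
using System.Diagnostics;

class AllocationCostSketch
{
    static void Main()
    {
        int gen0Before = GC.CollectionCount(0);
        Stopwatch sw = Stopwatch.StartNew();

        // Each allocation is normally just a pointer bump in generation 0;
        // the real cost appears only when the GC has to collect and compact.
        for (int i = 0; i < 1000000; i++)
        {
            byte[] temp = new byte[32];
            temp[0] = (byte)i;
        }

        sw.Stop();
        Console.WriteLine("1,000,000 small allocations: " + sw.ElapsedMilliseconds + " ms");
        Console.WriteLine("Gen 0 collections triggered: " + (GC.CollectionCount(0) - gen0Before));
    }
}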

The following factors affect how hard the GC will work:

• How many objects you allocate.
• The size of the objects.
• The lifetime of the objects.

How to write friendlier code for the GC

• Implement a destructor only when needed

An object with a destructor is marked by the GC as a finalizable object. There is a
real performance cost, in terms of GC operation, for these kinds of objects:
finalizable objects take longer to allocate and to reclaim.

o You should use a destructor mainly if you use P/Invoke.
o Gather unmanaged resources, like handles, in one object (see the sketch after this list).
o Avoid using finalizable objects in big arrays.
o Avoid referencing such objects from regular objects; otherwise, you will
make them live longer.
• Avoid calling the GC.Collect method

Calling the method without parameters causes a collection of all generations; it is
the same as calling the method with GC.MaxGeneration as the parameter.

It is a good rule of thumb to count to 10 before calling this method. On second
thought, counting to 1000 is even better. Pay attention to the fact that during the
collection, the process' threads are practically suspended. Do you need more
information than that?

• Prefer creating small amounts of memory for long lived objects

Higher generations are collected rarely, for optimization purposes. Large long
lived objects fill the higher generations quickly, forcing those expensive
collections to run sooner and hurting performance.

• Allocate only the exact amount of required memory

The smaller the object is, the less we pay in terms of performance, because
reclaiming the memory space will be faster and the process threads' suspension
time will be shorter.

• Avoid references to temporary objects that might mistakenly survive

If you constantly allocate small temporary objects that live for a short period of
time and then die, that is fine. You just need to make sure that these objects are
not referenced later. When the time comes and the system runs out of memory,
the GC will smoothly clean them up, avoiding the need to move them to a higher
generation and run the extremely costly compacting operation.

• Avoid using pools

Pools are an obsolete feature in the managed environment. We used pools in the
past, in the unmanaged environment, to avoid allocation during the process
lifetime, to reuse objects, and to assure that memory would be available to the
process for its whole execution time by allocating it up front. In the managed
environment, allocation is very fast (when there is available space), and it is
better to allocate only when required.

• Avoid middle range object allocation

As stated above, short lived objects are good for performance because we don't
'pay' the compaction price. A long lived object is fine only if it is really needed
throughout the execution time. Long lived objects survive GC rounds; they are
marked as survivors, and the GC skips reclaiming them and deals with the
temporary, short lived objects instead. By that, we improve the performance.

• Avoid heavy objects

Know that heavy objects (larger than about 85 KB) never get compacted. They
are allocated in a special area of the heap, the large object heap. The reason for
this is that moving them around in memory would put too much load on the CPU.
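
As a small illustration of the first tip (and of gathering unmanaged resources, like handles, in one object), here is a hedged C# sketch. The wrapper class and its use of Marshal.AllocHGlobal are my own example rather than code from this article; the point is that only this small wrapper is finalizable, while the regular objects that use it need no destructor at all:

using System;
using System.Runtime.InteropServices;

// A small wrapper that gathers one unmanaged resource. Only this class pays
// the finalizable-object cost; regular objects just hold a reference to it.
class NativeBuffer : IDisposable
{
    private IntPtr buffer;                       // the unmanaged resource

    public NativeBuffer(int size)
    {
        buffer = Marshal.AllocHGlobal(size);     // unmanaged allocation
    }

    public void Dispose()
    {
        Free();
        GC.SuppressFinalize(this);               // the GC no longer needs to finalize us
    }

    ~NativeBuffer()                              // safety net only; Dispose is the normal path
    {
        Free();
    }

    private void Free()
    {
        if (buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(buffer);
            buffer = IntPtr.Zero;
        }
    }
}

Callers would normally put such a wrapper in a using block so that Dispose runs deterministically and the finalizer never has to.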

Finalizable object cost

First of all, you should know that there is no real destructor in .NET languages. If
there were destructors, it would contradict the concept of having a memory
management mechanism like the .NET GC. The CLR team actually wanted to
avoid the common destructor syntax altogether. Initially, the way to clean up
unmanaged resources was to implement the Finalize method, but because it
required developers to add some boilerplate code to the method, Microsoft
decided to generate it itself. The C# compiler, for example, injects the developer's
destructor implementation into the Finalize method. So if you look at your code
via the ILDASM tool, you will find that your destructor is not there.

The GC handles finalizable objects differently. When such an object is allocated, a
pointer to it is placed in the Finalization queue. During the collection process, the
GC checks, for each object that is a candidate for deletion, whether there is a
pointer to it in the Finalization queue. If there is, it removes the pointer from that
queue and places it in the Freachable queue.

Later on, a dedicated CLR thread runs over this queue and uses each pointer to
call the object's Finalize method.
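
For illustration, the translation looks roughly like this. The class is a made-up example, and the generated method is shown as a comment because C# will not let you override Finalize directly:

// What you write in C#:
class A
{
    ~A()
    {
        // cleanup code goes here
    }
}

// Roughly what the compiler emits instead (this is what ILDASM shows):
//
//   protected override void Finalize()
//   {
//       try     { /* your cleanup code */ }
//       finally { base.Finalize(); }
//   }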

Object aliveness and its impact on GC performance

The life of an object is the period of time during which the object survives in the
managed heap. A good rule of thumb is that the age of an object should
correspond to its necessity. This means that it is completely OK to have a long
lived, elderly object, but only if it is really required for all that time. For example,
when you use MS Paint to draw a diagram, the previously drawn shapes are
supposed to be kept in memory for a long time. The older objects will reach the
highest generation. The GC checks the higher generations at a significantly lower
rate. The reason is quite logical: if an object has survived for a long time, there is
a good chance that it will be required by the process in the future. This rule marks
the elderly objects as bad candidates for reclaiming. This is also why it is a bad
idea to call GC.Collect: it causes the GC to run over all the generations, including
the higher ones, and that is a waste of time, mainly because the process' threads
are suspended during the operation.

To summarize, let's put it like this. Long lived objects do not necessarily harm
performance in terms of memory management, with one exception: they must
not be 'heavy' in terms of bytes. If they are, there can be problems, because the
highest generation will run out of space and the GC will have to check it and try
to reclaim memory. Short lived, temporary objects are also fine. Young objects
that get in and out of generation 0 are wonderful. Since the cost of allocating
objects is negligible, this behavior does not significantly hurt performance. In
fact, this is the optimal case for the GC. Clearing generation 0 is also very fast
and does not have a significant impact on performance.

Middle range objects are the problem. What are they? Why and how do they mess
up the performance?

Middle range objects live long enough to be moved out of generation 0 and might
even reach generation 2 (the highest in the current .NET Framework version).
These objects fall in between the definitions of short lived and long lived objects.
The problem is that, right after they arrive in the higher generation, they are no
longer needed. Since the GC checks the higher generations at a lower frequency,
they might stay there for a long period of time, and when those higher
generations finally do have to be collected, we pay the performance cost of heavy
compaction operations.

To illustrate this cost, take a look at the following example:

Object A is allocated and placed in generation 0. After a certain period of time,
generation 0 is garbage collected. Object A is still required at that stage, so it
moves to generation 1. Up to this point, there is no problem with this scenario. A
short time after generation 0 is collected, the object becomes useless, but it will
stay in generation 1 until that generation is collected, which does not happen
often. Now, let's say that a generation 0 collection is triggered and there is still
not enough memory. In this case, the GC also collects generation 1. Object A,
which is no longer required, will be deleted, and the generation 1 space will be
compacted.

In some ways, the middle range object problem is a side effect of the successful
optimization of the generations mechanism. We win by avoiding frequent checks
of elderly objects, but lose in the case of middle range objects. These objects
arrive in the long term area only to die there. They are mistakenly treated as
elderly objects while they do not really behave as such.
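
Here is a hedged C# sketch of that middle range pattern, using GC.GetGeneration and an explicit GC.Collect purely for demonstration (calling GC.Collect like this is exactly what this article advises against in production code):

using System;

class MiddleRangeSketch
{
    static void Main()
    {
        object a = new object();
        Console.WriteLine(GC.GetGeneration(a));   // typically 0: a brand new object

        GC.Collect(0);                            // a is still referenced, so it survives
        Console.WriteLine(GC.GetGeneration(a));   // ...and gets promoted to generation 1

        a = null;                                 // now it is garbage, but it will sit in
                                                  // generation 1 until that generation is
                                                  // collected, which happens rarely
    }
}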

What can we do to diminish the problem of middle range objects?

Cache it!
If you have objects that go through the following cycle: created, alive for a few
minutes, then dead, you should consider caching them. Instead of creating them
(they enter generation 0), no longer needing them after a short period of time
(when they are already in generation 1 or 2), and then creating new ones and
repeating the cycle over and over, it is a good idea to cache them: instead of
disposing of an instance and creating a new one, just change a state flag that
indicates that the object is available.

For example, we create an instance of class A (it is allocated in generation 0):

A obj = new A();

After a few minutes, the instance has been promoted to generation 1. Then we
don't need it any more:

obj = null;

The object will stay in generation 1 although it is dead. A few seconds later, we
create a new instance of class A, and so on. To improve this code, we can cache
the first instance and, in the second snippet, instead of assigning the reference to
null, just change the object's state to mark it as available. By that, we return it to
the cache instead of disposing of it. In the long run, this diminishes the middle
range problem.
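
Below is a minimal C# sketch of that idea, assuming a hypothetical reusable class A with a Reset method that clears its per-use state; the cache class itself is my own illustration, not code from the article:

using System;
using System.Collections.Generic;

class A
{
    public void Reset()
    {
        // clear any per-use state so the instance can be handed out again
    }
}

class ACache
{
    private readonly Stack<A> available = new Stack<A>();

    // Hand out a cached instance if one is free, otherwise create a new one.
    public A Acquire()
    {
        return available.Count > 0 ? available.Pop() : new A();
    }

    // Instead of dropping the reference (and leaving a dead object in
    // generation 1 or 2), mark the instance as available and keep it.
    public void Release(A obj)
    {
        obj.Reset();
        available.Push(obj);
    }
}

The cache itself is long lived, which is exactly the behavior the generational assumptions reward, while the instances it hands out never have to die in a higher generation.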

A glance at the future

There is a good chance that CLR and GC performance will improve in the future.
The jump from Assembly language to C and C++ initially brought performance
issues, in favor of Assembly. Nowadays, the performance difference between the
two on Windows is negligible. It looks like the folks in Redmond are working hard
to close the performance gap between managed and unmanaged code in the
same way.

I have been working with Microsoft technologies for the past seven years. From
what I have seen, meeting deadlines and time-to-market considerations are top
priorities for Microsoft's business. This kind of business philosophy, without
criticizing it, sometimes causes features and functionality to be deferred to the
next delivery wave, such as the next .NET Framework version.

It is reasonable to say that Microsoft wants to stay the master of its domain and
works hard to optimize the .NET Framework. For instance, right now the .NET
collection library performs better than STL, ASP.NET is better than ASP, and
ADO.NET brought significant improvements over the legacy ADO that leans on
COM.

Taking memory management away from developers helps not only with avoiding
memory leaks but also in other areas. For example, using raw pointers to manage
memory causes exceptions or mishaps when memory is overwritten or no longer
exists. The experts at Microsoft will keep improving this management and testing
it for us programmers. The CLR and GC can look for trends and fragmentation,
check the memory state, and so on. In terms of dynamic tuning, the GC will need
to become more dynamic: decreasing and increasing the size of generations,
changing the number of generations, adapting different types of algorithms to a
given process' activity, and even assigning different algorithms per generation. As
we have seen, the generations algorithm has its drawbacks.

Currently, the CLR has two modes for the Garbage Collector: one for workstations
and one for servers. The GC in server mode makes use of more RAM and more
CPUs, and runs faster. Assigning bigger heap space to processes and using the
CPUs more sensibly reduces the performance hit. As such resources become
cheaper and cheaper, the CLR can make the most of them.
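
Assuming .NET 2.0 or later, a process can check at run time which of the two modes it is using; the mode itself is normally chosen in the application's configuration file rather than in code. A minimal sketch:

using System;
using System.Runtime;

class GcModeCheck
{
    static void Main()
    {
        // The mode is selected per process, typically via the application's
        // configuration file (a <gcServer enabled="true"/> element under <runtime>).
        Console.WriteLine("Server GC: " + GCSettings.IsServerGC);
    }
}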

Links

• Garbage Collection Part 1: Automatic Memory Management in the Microsoft .NET Framework
Author: Jeffrey Richter
Publishing date: From the November 2000 issue of MSDN Magazine.

• Garbage Collection Part 2: Automatic Memory Management in the Microsoft .NET Framework
Author: Jeffrey Richter
Publishing date: From the December 2000 issue of MSDN Magazine.

• The Programming Life: Experiences in Development, Memory Management in .NET
Author: George P Alexander
Publishing date: Posted 12/6/2005.

• Marissa's Guide to the .NET Garbage Collector
Author: John Gomez
Publishing date: Aug. 11, 2003.

• Garbage Collection in .NET
Author: Amit Kukreja
Publishing date: CodeProject, 24 Apr 2002.