Tuesday, December 4, 2012

.NET Memory Management Primer


I thought of compiling everything we learned about .NET memory management while investigating a suspected memory leak and performance issue in a smart client application we had been developing for some time. This post is the result of discussions, study of web resources including MSDN, and running profilers such as the Red Gate ANTS Memory Profiler. You may already know some or all of these concepts, but my idea is to keep this a simple primer where you can refresh your knowledge and, if possible, learn something new.

3 .NET App Memory Problems

With a managed environment like .NET, you have very little to do, good or bad, in terms of memory management, because the lion's share of the responsibility is taken by the Garbage Collector (GC), a component of the Common Language Runtime (CLR). The initial goal in designing the GC was to completely abstract memory management away from developers. We must congratulate the GC team at Microsoft, because they almost achieved that, apart from a few issues. I think they have already done what is humanly possible, and much further optimization of the GC may not be possible.

It looks like there are only three memory-related problems you may encounter in your .NET application:

  1. Large object heap memory fragmentation
  2. Memory leak in the managed code
  3. Memory leak in the unmanaged code

Before going deeper into these problems, we need to know how the .NET Framework abstracts memory management from your application.

How does memory management work in .NET?

Before allocating an object in memory, the CLR makes one important determination: whether the object is large or small. Any object of 85,000 bytes or more (roughly 85 KB) is a large object and is allocated in the large object heap. A small object is allocated in the generation-based small object heap. The generations in the small object heap are

  • Generation 0 (holds the youngest objects)
  • Generation 1 (holds the older objects)
  • Generation 2 (holds the oldest objects)
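
The split between the two heaps can be observed with GC.GetGeneration. The following is a minimal sketch, not a definitive test: the class and method names are made up for illustration, and it relies on the runtime reporting large object heap allocations as the oldest generation, which is how the CLR typically behaves.

```csharp
using System;

public static class HeapPlacementDemo
{
    // Returns the generation the GC reports for a freshly allocated
    // byte array of the given length. (Hypothetical helper.)
    public static int GenerationOfNewArray(int length)
    {
        byte[] buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }

    public static void Main()
    {
        // A small array starts its life in Gen 0 of the small object heap.
        Console.WriteLine("1,024 byte array: generation " + GenerationOfNewArray(1024));

        // An array of 85,000 bytes or more goes to the large object heap,
        // which GC.GetGeneration reports as the oldest generation.
        Console.WriteLine("100,000 byte array: generation " + GenerationOfNewArray(100000));
        Console.WriteLine("GC.MaxGeneration: " + GC.MaxGeneration);
    }
}
```

On a typical desktop CLR the small array is reported in generation 0 and the large one in generation 2.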

The ideal scenario is that your app allocates objects in Gen 0 and dereferences (abandons) them as soon as it is done using them.

Large Object Heap

As already mentioned, large objects are allocated in the large object heap by the CLR. The important differences between the large and small object heaps are

  • Unlike the small object heap, it has no generations of its own; it is collected together with Gen 2
  • Unlike the small object heap, 'compacting' doesn't occur by default during garbage collection, because moving large objects is a very costly process

What is compacting?
Compacting means sliding the surviving objects together after a collection so that the free space forms one contiguous block. You can imagine a heap as a container in which objects are allocated one on top of the other. In the large object heap, if an object in the middle is collected by the GC, the space it held is not compacted. For example, assume objects A, B and C are allocated (A at the bottom, B in between and C at the top), with some unused space on top of C. If B is collected when the GC runs, the space held by B remains unused, so there is a gap between A and C. Such gaps between objects are called large object heap fragmentation, and it is arguably the most severe .NET memory problem. Since there is no compacting, a later allocation can reuse a gap only if the new object fits into it. Some limited fragmentation may be acceptable, but with heavy fragmentation there can be out-of-memory exceptions even when the heap has enough free space in total, because no single gap is large enough for the requested allocation

How to avoid large object heap fragmentation?
  • Better architecture
  • Avoiding large objects as far as possible
  • A kind of LIFO (Last-in-first-out) strategy while allocating/de-referencing objects
  • Forceful app or sub-app closure (not a good one though)
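
One way to avoid large objects is to process data in chunks through a single reusable buffer that stays below the 85,000-byte threshold, instead of loading a whole payload into one big array. The sketch below is only an illustration of that idea; the class name, the method name and the chunk size of 81,920 bytes are assumptions chosen for the example.

```csharp
using System;
using System.IO;

public static class ChunkedCopyDemo
{
    // 81,920 bytes stays below the 85,000-byte large object threshold.
    private const int ChunkSize = 81920;

    // Copies source to destination through one reusable small buffer
    // instead of reading the whole payload into a single large array.
    public static long CopyInChunks(Stream source, Stream destination)
    {
        byte[] buffer = new byte[ChunkSize]; // allocated once, reused per chunk
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        // The in-memory payload is itself a large array and only stands in
        // for a file or network stream, which is where chunking pays off.
        using (var source = new MemoryStream(new byte[1000000]))
        {
            long copied = CopyInChunks(source, Stream.Null);
            Console.WriteLine("Copied " + copied + " bytes");
        }
    }
}
```

The working buffer lives in the small object heap, so repeated copies do not add new blocks to the large object heap.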

Small Object Heap

Small objects are allocated in the generations of the small object heap. An object is first allocated in Gen 0 by default. When Gen 0 is full, a GC is triggered, and the following happens

  • Objects that have no references are collected and their memory is freed.
  • Objects that the app still references are considered 'live' by the GC and promoted to Gen 1
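
The promotion of a surviving object can be watched with GC.GetGeneration. This is a small sketch with a made-up class name; GC.Collect() is called here purely for demonstration, and the exact generations reported can vary with the runtime and GC settings.

```csharp
using System;

public static class PromotionDemo
{
    // Returns the generation of one live object before and after
    // each of two forced collections.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int[] generations = new int[3];

        generations[0] = GC.GetGeneration(survivor);
        GC.Collect(); // forced only for the sake of the demo
        generations[1] = GC.GetGeneration(survivor);
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // the reference stays live up to this point
        return generations;
    }

    public static void Main()
    {
        int[] g = TrackGenerations();
        Console.WriteLine("Generations observed: " + g[0] + ", " + g[1] + ", " + g[2]);
    }
}
```

Because the object is still referenced, each collection typically moves it one generation up, so the usual result is 0, then 1, then 2.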

Hence Gen 0 is completely free and ready for fresh allocations at the end of the GC process. The GC process works the same way for Gen 1 and Gen 2: survivors of Gen 1 are promoted to Gen 2, and survivors of Gen 2 stay in Gen 2. This is where a memory leak can occur

How does a memory leak occur?
If an object that the app no longer needs still has a reference to it, the GC considers it 'live' and promotes it to Gen 1 (and eventually Gen 2) instead of collecting it. This is called a memory leak

GC Root

GC roots are locations that are always accessible to the application, such as static variables and local variables of currently running methods. The CLR determines whether an object is live by checking whether any GC root has a direct or indirect reference to it. If an object is not referenced by any GC root, it is 'not reachable' by the application and is garbage. The GC can therefore collect it the next time a collection runs
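
The effect of a root can be sketched with a static field and a WeakReference, which observes an object without keeping it alive. The names below are hypothetical, and the unrooted case depends on the runtime actually collecting the object during the forced collections, which is typical but not guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GcRootDemo
{
    // A static field acts as a GC root for whatever it references.
    private static object s_root;

    // Allocates in a separate, non-inlined method so that no local
    // variable of the caller keeps the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Allocate(bool rooted)
    {
        object obj = new object();
        if (rooted)
        {
            s_root = obj;
        }
        return new WeakReference(obj); // tracks obj without rooting it
    }

    // Reports whether the object is still alive after forced collections.
    public static bool SurvivesCollection(bool rooted)
    {
        WeakReference tracker = Allocate(rooted);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        bool alive = tracker.IsAlive;
        s_root = null; // release the root so the demo can be repeated
        return alive;
    }

    public static void Main()
    {
        Console.WriteLine("Rooted object alive:   " + SurvivesCollection(true));
        Console.WriteLine("Unrooted object alive: " + SurvivesCollection(false));
    }
}
```

The rooted object always survives because the static field references it; the unrooted one is normally gone after the collection.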

What is a .NET developer supposed to do to avoid a memory leak?

The answer is almost nothing. The CLR is intelligent enough to identify whether an object is live or not. But a developer can still blunder, for example by subscribing an object to a long-lived event such as a timer's 'Elapsed' event and never unsubscribing, or by storing objects created in such a handler in a long-lived collection. In that case the objects remain reachable, are always considered 'live' by the CLR, and are collected only when the application is closed.
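
The event handler case can be sketched as follows. This is an illustrative example with made-up class names, assuming a System.Timers.Timer held in a static field as the long-lived event source; the timer is never started because only the subscription matters here.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Timers;

public sealed class Subscriber
{
    public void OnElapsed(object sender, ElapsedEventArgs e) { }
}

public static class TimerLeakDemo
{
    // A long-lived timer, reachable from a GC root through the static field.
    private static readonly System.Timers.Timer s_timer = new System.Timers.Timer(1000);

    // Subscribes a new object in a separate, non-inlined method so that
    // only the event (if anything) keeps it reachable afterwards.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Subscribe(bool unsubscribe)
    {
        var subscriber = new Subscriber();
        s_timer.Elapsed += subscriber.OnElapsed; // the timer now references subscriber
        if (unsubscribe)
        {
            s_timer.Elapsed -= subscriber.OnElapsed; // removes that reference
        }
        return new WeakReference(subscriber);
    }

    // Reports whether the subscriber is still alive after forced collections.
    public static bool SubscriberSurvives(bool unsubscribe)
    {
        WeakReference tracker = Subscribe(unsubscribe);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return tracker.IsAlive;
    }

    public static void Main()
    {
        Console.WriteLine("Still subscribed, alive: " + SubscriberSurvives(false));
        Console.WriteLine("Unsubscribed, alive:     " + SubscriberSurvives(true));
    }
}
```

While the subscription exists, the timer's delegate keeps the subscriber reachable, so it cannot be collected. After unsubscribing, it normally becomes eligible for collection.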

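You can also make your code 'GC friendly' by setting references to null once you know for certain that you are done with an object. This matters mainly for static variables and fields of long-lived objects; local variables go away on their own when the method returns. That way the GC can reclaim the memory sooner, which helps your app too.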

Other Learnings

  • Avoid finalizers
  • Avoid forcing the GC process by calling GC.Collect(). It should be used only in extraordinary situations. Also, GC.Collect() without any parameter triggers a full collection of all generations, including the large object heap
  • Avoid large objects
  • Be careful with event handlers: unsubscribe them when they are no longer needed, and avoid accumulating objects in long-lived fields or collections inside handlers
  • Make your code GC friendly by setting references (especially static variables and long-lived fields) to null when you are done with the object. This is not a must, but it can help the GC reclaim memory sooner
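
The difference between GC.Collect() and a collection limited to Gen 0 can be seen with GC.CollectionCount. The sketch below uses a hypothetical helper that counts the collections per generation while an action runs; forcing collections like this is for demonstration only.

```csharp
using System;

public static class CollectCountDemo
{
    // Returns how many collections each generation went through
    // while the supplied action ran. (Hypothetical helper.)
    public static int[] CountCollections(Action action)
    {
        int generations = GC.MaxGeneration + 1;
        int[] before = new int[generations];
        for (int gen = 0; gen < generations; gen++)
        {
            before[gen] = GC.CollectionCount(gen);
        }

        action();

        int[] delta = new int[generations];
        for (int gen = 0; gen < generations; gen++)
        {
            delta[gen] = GC.CollectionCount(gen) - before[gen];
        }
        return delta;
    }

    public static void Main()
    {
        // Parameterless call: a full collection that touches every generation.
        int[] full = CountCollections(() => GC.Collect());
        Console.WriteLine("GC.Collect():  " + string.Join(", ", full));

        // Explicit generation: only Gen 0 is requested.
        int[] gen0Only = CountCollections(() => GC.Collect(0));
        Console.WriteLine("GC.Collect(0): " + string.Join(", ", gen0Only));
    }
}
```

After the parameterless call every generation's counter goes up, while GC.Collect(0) normally raises only the Gen 0 counter.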
