
Inside .NET
Introduction
The .NET architecture addresses an important need - language interoperability. As more than one
language can be converted to the same intermediate language (MSIL - Microsoft Intermediate
Language), the code compiled through two different high-level languages can interact at an
intermediate code level. This gives you the ability to work in different languages depending on your
requirements, and still benefit from the unified platform to which the code for any .NET compliant
language is compiled. To understand the internals of .NET, it is essential to keep this core idea in mind.
Also, understanding the internal details of .NET is essential to exploit the advanced features like
runtime code generation and reflection. Visual Studio .NET is shipped with many tools, such as an
assembly linker, that are best understood and used only when the programmer has a good
understanding of the internals of .NET. When you understand how the runtime executes your code, the structure of the executable files that your compiler generates, the garbage collection process, and so on, you are naturally in a better position to program and make the best use of .NET. The objective of this case study
is to present you with the internal details of .NET so as to put you in a better position to make use of
various tools (shipped with .NET) and techniques (for programming in .NET).

System Requirements
It is preferable that the reader has access to a compiler for one of the .NET languages, preferably C#, as the programming examples are provided in C#. Programmers with a good understanding of a .NET language are in a better position to understand the internals of .NET. (The full .NET SDK can be downloaded by following this link: http://msdn.microsoft.com/downloads/default.asp?url=/downloads/sample.asp?url=/msdn-files/027/000/976/msdncompositedoc.xml&frame=true.)

An Overview of .NET

What does the .NET Framework Provide?


The .NET Framework is aimed at providing various services to the applications and serves as a runtime
environment for execution of applications. The core philosophy of .NET is simple: provide a valuable,
common set of services for applications that can be used irrespective of the source language. .NET
supports a variety of source languages - from object oriented languages to procedural/structured
languages. The language compilers compile the code to a mix of intermediate language code (MSIL) and metadata (data describing the types), targeting the .NET platform. The .NET runtime takes care of converting the code to native code for the platform and executing it. Thus, the primary objective of .NET is to provide proper interoperability and communication between applications. Before .NET, this was achieved using internal APIs, RPC (Remote Procedure Calls), OLE (Object Linking and Embedding), and ActiveX controls. .NET integrates many such technologies and services that previously existed separately and provides a unified platform.
The .NET Framework provides a number of services for deploying different kinds of applications reliably and simply (easy deployment in the form of assemblies), and it supports backward compatibility, so that new versions of existing components and DLLs can be installed without compatibility problems (referred to as versioning). This environment facilitates security and safety by means of automatic memory management, runtime support of components, and code verification. The support for serialization embraces a wide variety of pre-existing industry open standards such as HTTP, XML, and SOAP rather than introducing new ones.

Common Language Runtime (CLR)


The Common Language Runtime plays a very significant role in .NET - it manages the execution of the
code and also provides many services to the code (such as garbage collection and interoperability with
COM components). Logically, it can be considered to have the following parts:
- Common Type System (CTS) and Microsoft Intermediate Language (MSIL)
- Common Language Specification (CLS)
- Virtual Execution System (VES) and the Just-In-Time (JIT) compiler
CLR is the execution engine that loads, compiles, and executes the MSIL code by converting it into
native code. It interacts with the operating system to provide the execution services for the .NET
Framework, which in turn provides services to the applications that are run on .NET. The runtime plays
many roles and provides several services: it ensures code reusability, provides automatic memory
management services to avoid memory leaks, verifies types and parameters to guarantee that the code
will not cause any security breaches, ensures consistent cross language inheritance and exception
handling, manages threads and processes, and supports developer services such as debugging (using a built-in stack-walking facility in the runtime) and profiling.

Common Type System (CTS)


The .NET platform makes it easy for code written in one language to be used in any other .NET
language (meaning that components written in one CLS compliant language could be used by
applications written in any other CLS compliant language). This interoperability could not be achieved
if every language had its own set of data types associated with it. So, the .NET Framework supports a
Common Type System (CTS) which provides a rich set of data types so that various high-level
languages can compile the source code to .NET making use of a common set of data types (so that the
components can be used easily from some other source language). In essence, CTS provides a set of
common types to ensure that components are interoperable regardless of the source language in which
the component was developed.
CTS also strictly enforces type safety - it specifies rules for type visibility, object lifetime, and access to the members in assemblies (assemblies are discussed a bit later). An object created in one language can even be passed as a parameter to a method written in another language without any problem. Language interoperability enables code reuse and improves the productivity of application development.
Let me give a few examples. Not all languages support unsigned types - for example, Visual Basic doesn't support unsigned types, whereas C++ does. When a component is developed in C++ making use of unsigned types, it is difficult (and bug-prone) to make use of that component from Visual Basic (as VB doesn't support unsigned types). CTS is designed to overcome exactly these kinds of problems: the CLS-compliant subset of the type system excludes unsigned types (except the byte type, which is unsigned by default). Other issues regarding the use of types are also resolved. For example, the sizes of value types are predefined and remain the same across languages. Consider the following example:
// in C#.NET
public double dbl;
// or in VB.NET
Public dbl As Double
Here, a double variable is declared in both C#.NET and VB.NET - the variable dbl resolves to the
System.Double type which occupies 8 bytes. In other words, whatever the source language, if a data
type supported by CTS is used, then the component can be used without any problem in another source
language.

CLS Compliance
In order to achieve cross-language interoperability, the languages targeting .NET should follow a set of
rules and guidelines for creating components that can be used by the code in some other source
language. Unless the code conforms to such a minimum set of rules and guidelines, the code may not
make use of the services that the .NET Framework provides. It is also not possible to ensure that the
code can be used properly in some other source language if the rules are not followed properly.
For example, a few programming languages support operator overloading, but many languages do not. Languages such as C# support operator overloading, whereas Visual J# doesn't. A component that makes use of operator overloading in its public interface may not be usable from a language where that feature is not supported (using the feature purely internally is not a problem). This is a significant issue, as achieving language interoperability is the main objective of .NET.
To overcome such problems, there is a common set of rules and guidelines defined to facilitate language interoperability. A component that conforms to this set of rules and guidelines can be used from any .NET language and is referred to as a CLS-compliant (Common Language Specification compliant) component.
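To see CLS compliance checking in action, you can ask the C# compiler to verify your public surface. Here is a minimal sketch (the Calculator type is illustrative, not from the text above); with CLSCompliantAttribute applied to the assembly, the compiler warns about publicly exposed members that use non-CLS-compliant types:

// Sketch: letting the compiler flag non-CLS-compliant members.
// ClsCheck.cs (file and type names are illustrative)
using System;

[assembly: CLSCompliant(true)]

public class Calculator
{
    // The compiler warns here: uint is not CLS-compliant, so
    // languages without unsigned types could not call this method.
    public uint Add(uint a, uint b)
    {
        return a + b;
    }

    // No warning: long is part of the CLS subset.
    public long Add(long a, long b)
    {
        return a + b;
    }
}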

The Execution Model


To understand the internals of .NET, we need to have a very good idea of how program source code finally gets executed in .NET. Program code may be written in any source language, and to execute it we need a compiler for that language targeting the .NET platform (such as the VB.NET and Eiffel.NET compilers). The language compiler compiles the source code to an intermediate format as assemblies (which contain MSIL and metadata). Ultimately, the code gets executed in the .NET Framework - this is irrespective of the source language in which the code is written.
When the code needs to be executed, the runtime takes control and verifies the content (for security reasons, and to make sure that the code is type-safe). Finally, the intermediate code is converted into native code and executed. The code can make use of the rich class library and the services available in the .NET Framework.
Thus the intermediate files (where the compiled code is stored) play an important role in .NET technology, and they are covered in the following section.

PE Files and Assemblies


In .NET, the unit of deployment is the PE (Portable Executable) file - a predefined binary standard (also referred to as the Common Object File Format - COFF). It is made up of a collection of modules, exported types, and resources, and is put together as an .exe or .dll file. It is very useful for versioning, deployment, sharing, etc. One or more such modules are grouped together into a unit known as an assembly.

PE Files
A Portable Executable (PE) file contains MSIL code for .NET languages instead of assembly language code (such as native x86 machine code). A PE file consists of a CLR header followed by metadata and MSIL code. The CLR header is a small block of information that contains data useful to the runtime, such as the major and minor version numbers, the module's entry point, whether the module is a CUI or a GUI (Character or Graphical User Interface) executable, the sizes and offsets of certain metadata tables contained in the module, etc.
The metadata is a block of binary data that consists of several tables of information used to locate and load classes, resolve field references and method invocations, lay out objects in memory, etc. The objective of the metadata is to make components self-describing: you don't need an external source to find out information about a component - all the information necessary for using the component is stored in the component itself. The metadata is followed by the MSIL code, which is the actual executable code, similar to assembly language code.
The PE file format (which follows the COFF file format) is flexible. For example, PE files are capable of storing code that executes under the full control of .NET (managed code) as well as native code whose execution is not fully under the control of the .NET Framework.

Managed and Unmanaged Code


The code that targets the .NET runtime is called managed code. Managed code consists of MSIL and metadata - it is executed under the control of the .NET runtime. Thus managed code is JIT compiled and it enjoys .NET services like garbage collection. Unmanaged code, in contrast, is executed directly on the platform: it requires no JIT compilation, nor is it verified for type safety. Unmanaged code is non-portable and it cannot make use of .NET services such as garbage collection. You have to explicitly take care of releasing the resources used by unmanaged code. Why do you
need unmanaged code? There are places where low-level, platform specific services are needed. Since the .NET runtime provides a safe and secure environment in which the underlying platform specific details are not directly exposed to applications, you cannot write managed code that obtains low-level services directly from the operating system. Also, legacy code (such as COM components) isn't designed to execute under .NET and exists as native code - such code can still be used from .NET as unmanaged code. Thus the support for unmanaged code in .NET is of great importance.
Here is an example of how unsafe code can be made use of in performance critical applications. Array copy can be a costly operation - particularly when the array is big. The Windows kernel32 DLL provides a low-level service for copying memory. You can make use of this CopyMemory function to copy huge arrays and see a significant improvement in performance (however, you sacrifice the portability of the code, as it directly makes use of low-level facilities).
The TestClass.cs program illustrates how to make use of unmanaged code. The QueryPerformanceCounter function is used to get profiling information for calculating the time taken by the array copy operation. The program also makes use of the MessageBox API in user32.dll to present the results to the user. The DllImport attribute is used to indicate that the declared method is provided in a DLL that can be loaded and called at runtime. In the UnsafeArrayCopy method, the fixed statement is used to make sure that the garbage collector doesn't move the array objects while the CopyMemory function is executing.
// Program that illustrates how to use unmanaged code
// TestClass.cs
// compile with: csc /unsafe TestClass.cs

using System;
using System.Runtime.InteropServices;

class TestClass
{
    [DllImport("Kernel32")]
    private static extern bool QueryPerformanceCounter(ref long count);

    // kernel32 exports this function as RtlMoveMemory;
    // CopyMemory is its Win32 macro name
    [DllImport("Kernel32", EntryPoint="RtlMoveMemory")]
    unsafe private static extern void CopyMemory(void* dest, void* src, int length);

    [DllImport("User32")]
    public static extern int MessageBox(int hdl, string msg, string caption, int type);

    public static long[] UnsafeArrayCopy(long[] src)
    {
        long[] dst = new long[src.Length];
        unsafe
        {
            // array copy using native code; fixed pins the arrays so the
            // GC cannot move them while CopyMemory executes
            fixed (long* srcPtr = src, dstPtr = dst)
            {
                // copy the whole array, not just one element
                CopyMemory(dstPtr, srcPtr, src.Length * sizeof(long));
            }
        }
        return dst;
    }

    public static long[] ArrayCopy(long[] src)
    {
        long[] dst = new long[src.Length];
        // array copy using a looping construct instead of native code
        for (int i = 0; i < src.Length; i++)
            dst[i] = src[i];
        return dst;
    }

    public static long Counter()
    {
        long count = 0;
        QueryPerformanceCounter(ref count);
        return count;
    }

    public static void Main(string[] args)
    {
        long countStart, countEnd;
        long[] srcArr = new long[short.MaxValue];

        countStart = TestClass.Counter();
        long[] dstArr1 = TestClass.ArrayCopy(srcArr);
        countEnd = TestClass.Counter();
        MessageBox(0, "Counter value for Ordinary array copy is: " +
                   (countEnd - countStart), "Timing Result", 0);

        countStart = TestClass.Counter();
        long[] dstArr2 = TestClass.UnsafeArrayCopy(srcArr);
        countEnd = TestClass.Counter();
        MessageBox(0, "Counter value for Unsafe array copy is: " +
                   (countEnd - countStart), "Timing Result", 0);
    }
}
// Output from a sample run of the program
// Counter value for Ordinary array copy is: 39371
// Counter value for Unsafe array copy is: 5127
Assemblies
Assemblies are made up of one or more modules or files, and form a unit of deployment that is a named and versioned collection of exported types and resources. Assemblies play a major role in versioning and security. An important characteristic of assemblies is that they are self-describing, which is achieved by means of the metadata they contain. The metadata helps in resolving and loading types (classes), helps the runtime arrange object layout in memory, and resolves method invocations - thus it makes the types self-describing. Because all of this information travels inside the assembly itself, an external mechanism like a registry is not needed to keep track of application information.
There are two types of assemblies: private and shared. A private assembly is used inside a single application, whereas a shared assembly is meant for use by one or more applications. Assemblies are similar to JAR files in Java (in Java, all classes and interfaces are compiled to an intermediate format known as class files - JAR files are used to hold those class files as a single unit).
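Because an assembly describes itself, its identity can be queried at runtime without consulting any registry. Here is a small sketch using the System.Reflection.Assembly class (the output will, of course, depend on your assembly):

// Sketch: reading an assembly's self-describing identity at runtime.
// AssemblyInfoDemo.cs
using System;
using System.Reflection;

class AssemblyInfoDemo
{
    public static void Main()
    {
        // The executing assembly reports its own name, version,
        // culture, and public key token from its manifest.
        Assembly asm = Assembly.GetExecutingAssembly();
        Console.WriteLine("Full name: " + asm.FullName);

        // List the modules that make up this assembly.
        foreach (Module m in asm.GetModules())
            Console.WriteLine("Module: " + m.Name);
    }
}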

MSIL and Metadata


MSIL is CPU-independent assembly-like code that is generated by any of the .NET supported languages. MSIL consists of various instructions for creating and initializing objects, calling virtual functions, handling exceptions, and manipulating object references, strings, and arrays. The metadata is generated at compile time and carries information about the component to the runtime. Metadata also contains a list of the references and resources associated with the component (called the manifest).
MSIL is designed so that it can accommodate a wide range of languages. One of the important differences between MSIL and Java bytecodes (bytecode is the intermediate language used in Java technology) is that MSIL is type-neutral. For example, iconst_0 is a bytecode that means "push integer value 0 on the runtime stack" - the type information is kept in the bytecode itself. The corresponding MSIL instruction just says to push four bytes, so no type information is carried by the instruction.
It should be noted that MSIL is not interpreted. Interpretation leads to low performance, as the code needs to be translated each and every time it is executed. The alternative approach is to compile the intermediate code into native code before executing it. This makes the runtime environment more complex, but improves performance. There are two popular approaches for converting intermediate code to native code: Just-In-Time (JIT) compilation and Pre-Just-In-Time (PreJIT) compilation.

JIT and PreJIT


A Just-In-Time (JIT) compiler compiles MSIL code to native code when a method is called, and caches the compiled native code in memory. The cached native code is invoked when the same method is called again. So the translation is done only once, the first time the method is called, and the compiled native code is used for subsequent invocations. Since, in general, 90% of the time is spent executing only 10% of the code, this approach can reap rich rewards. JIT compilers are also referred to as "load-and-go" compilers, as they don't write the target code (native code) into a file.
Java uses this approach of compilation on demand extensively, and many of the commercial JVMs are JITters. Note that JVMs are allowed to follow either the interpretive approach or JIT compilation, so there JIT is an optional feature for performance improvement. For .NET, the use of JIT is mandatory, and thus all MSIL code is JIT compiled to native code before execution.
Another approach for converting the intermediate code to native code is "Ahead-Of-Time" compilation. When the assembly (containing the MSIL and metadata) is loaded into memory, all the code gets compiled into native code, which is then called whenever required. The advantage of this approach is that all the code is compiled before any method is called. Ahead-Of-Time compilation therefore doesn't suffer from the per-call overhead of JIT compilation, and hence it is also known as PreJIT. In this way, PreJIT can effectively speed up performance, as the code is ready for execution in the form of native code.
It seems that Ahead-Of-Time compilation (PreJIT) is a very good alternative to JIT, but that is not the end of the story. The execution engine (CLR) still needs the metadata about the classes, for example to support dynamic features like reflection. If no metadata were maintained, then reflection could not be supported, which would be a serious disadvantage. Moreover, if the environment or security settings change, then the code needs to be compiled again. Thus, PreJIT is not always the best approach: by default, a JIT compiler is used in .NET.
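You can get a feel for the one-time JIT cost by timing the first and second calls to the same method. The sketch below reuses the QueryPerformanceCounter technique from the earlier programs; the exact numbers are machine dependent, but the first call typically takes noticeably longer because it includes JIT compilation:

// Sketch: observing the one-time JIT compilation cost.
// JitDemo.cs
using System;
using System.Runtime.InteropServices;

class JitDemo
{
    [DllImport("Kernel32")]
    private static extern bool QueryPerformanceCounter(ref long count);

    static long Sum()
    {
        long total = 0;
        for (int i = 0; i < 1000; i++)
            total += i;
        return total;
    }

    public static void Main()
    {
        long start = 0, end = 0;

        // First call: the MSIL for Sum is JIT compiled, then executed.
        QueryPerformanceCounter(ref start);
        Sum();
        QueryPerformanceCounter(ref end);
        Console.WriteLine("First call (includes JIT): " + (end - start));

        // Second call: the cached native code runs directly.
        QueryPerformanceCounter(ref start);
        Sum();
        QueryPerformanceCounter(ref end);
        Console.WriteLine("Second call (native only): " + (end - start));
    }
}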

Debugging with ildasm tool


The intermediate language disassembler (ildasm) tool is useful for dissecting assembly files and seeing what is stored inside them. What is the use of learning what is stored inside the files? There are times when you may need to look under the hood to see what is actually happening in the Common Language Runtime (CLR) and understand how a program works. At advanced levels, this information can be crucial in debugging applications.
Here is an interesting problem that I faced when I was developing an application in C#. Below is a much-simplified version of the actual problem - still, it is effective in illustrating how the ildasm tool can come in handy for debugging in difficult situations.
I needed to make use of the GetInfo class in the SomeProject namespace - the code was written by somebody else, and I didn't have access to the source code. It had a static method GetValue that returns an object. I wanted to make use of that method in my program. The documentation of GetValue hinted that the method typecasts some integral value to object and returns it. Since the source code was not available, I logically arranged the available information as follows (simplified version):
namespace SomeProject
{
    public class GetInfo
    {
        // other members

        public static object GetValue()
        {
            // some implementation details here that are not known to
            // us; the documentation says that the method typecasts
            // some integral value to 'object' and returns it
        }
    }
}

Now, with my program Test.cs, I tried to make use of GetInfo.GetValue. The program terminated with the runtime throwing an InvalidCastException.
// program for demonstrating the use of the IL disassembler (ildasm)
// for debugging
// Test.cs

using System;
using SomeProject;

class Test
{
    public static void Main()
    {
        int i = (int) GetInfo.GetValue();
        Console.WriteLine("value of i is {0}", i);
    }
}
// program fails:
// An unhandled exception of type 'System.InvalidCastException'
// occurred in MyApplication.exe
// Additional information: Specified cast is not valid.

Why did the cast fail? I was left with no clues - fortunately, ildasm was there to rescue me. The figure below is a screenshot of what I got when disassembling the simple GetInfo.cs program:

The code disassembled using ildasm can be mystifying - it is similar to the assembly code for microprocessors. You need not understand the full code - only enough to find clues for debugging. However, here I will explain the meaning of the whole listing.
The generated PE file contains many details needed for proper execution of the code by the .NET runtime. It has a manifest and the details of the GetInfo class (which include the metadata and the MSIL code), as shown in the screenshot. Though the program doesn't have any explicit constructors, the compiler provides one by default. The constructor is named .ctor, as it is a special method (it is invoked automatically). Where is the compiler-provided destructor? Remember that there are no destructors in .NET; overriding the Finalize method provides similar functionality. The screenshot also shows the GetValue method in its compiled intermediate form.
The names starting with "." are assembler directives (they are not executable instructions but provide information to the assembler). You can modify the code obtained from the disassembler and assemble it (using the ilasm tool) to produce an executable file that can be run by the .NET runtime.
The disassembled GetValue method contains rich information that is useful to the runtime:

.method - this directive indicates that GetValue is a method
public - the method's access specifier; it is publicly accessible outside the class
hidebysig - indicates that this method hides any method of the base class (if any) with the same signature
static - the method is a static method (not an instance method)
object - the return type of the GetValue method
GetValue() - the name and arguments (here, it has no arguments) of the method
cil managed - the code is managed code (as opposed to unmanaged code)
.maxstack - the size of the evaluation stack assumed to be virtually present for executing the IL code
.locals - the number of, and information about, the local variables in the method; here the variable srt is of short type (i.e. System.Int16), and there is also a temporary variable used for the return value (local variables are indexed from zero)
IL_xxxx - the label names for the IL instructions

Now, let us come to the actual MSIL instructions. (Note that the runtime doesn't execute this code as it is shown here - it is converted to native code by the JIT first, and only the native code is executed.)

ldc.i4.s - "load a 4-byte integer constant (short form)"; this instruction pushes the integer constant 10 onto the stack
stloc.0 - "store the stack-top value in the variable at location zero"; here the integer constant 10 is popped from the stack and stored in the short variable named srt (the variable at location 0)
ldloc.0 - "load the value of the variable at location 0 onto the stack top"; here, the value of the srt variable is pushed onto the stack
box - "box the value on the stack top to object type"; here the short value is converted to object type and the reference is now available on the top of the stack
stloc.1 - pops the object reference from the stack and stores it in the temporary variable that is to be returned from the method
ldloc.1 - the value of the temporary variable is pushed onto the stack
ret - "return control from the method"

Thus, from the MSIL code, you can figure out that the source code could have been written like this
(obviously, I have simplified the original problem to explain the idea easily):
public static object GetValue()
{
    short srt = 10;
    return ((object) srt);
}

Much of the information in the disassembly is not directly useful for debugging. However, it is now clear why our program threw an exception. The clue is in the disassembled code: a short variable is typecast and returned from the GetValue method. Since we tried to unbox it as an int (and not a short), an InvalidCastException was thrown (remember that unboxing has to be done to the exact type from which the boxing was done - failure to do so results in this exception).
So the solution? Modify the code in Test.cs to either of the following:
int i = (int) (short) GetInfo.GetValue();
// or as
short i = (short) GetInfo.GetValue();
Since the value is unboxed to the proper type (here, short) from which the boxing was done, the program works fine.
As you can see, by using the ildasm tool, you can get low-level information that not only helps in
understanding the internal details but is also useful for practical purposes such as debugging the code.

Object Model
The object model refers to the way objects are allocated, arranged, and managed at runtime. The
Common Type System (CTS) supports two kinds of types: value and reference types. These two types
differ drastically in the way they are allocated, accessed, and released.

Object Allocation and Deallocation


A Garbage Collector (GC) manages the object allocation and re-collection in .NET. The runtime
manages a heap (referred to as the 'managed heap') to allocate objects of reference types (value types
are allocated on the stack itself, and that is discussed a bit later). When you need to create a new object
on the heap (using a new expression in program code), the GC looks for free space in the heap and
returns a pointer to that free location. The programmer doesn't have much control over the release of
objects and it is automatically taken care of by the GC. Thus the programmer is relieved of the tedious
process of managing the memory manually. Garbage collection is discussed in the next section of this
case study.

Value and Reference Types


Value and reference types differ drastically in the way they are supported and implemented internally -
understanding those differences is important for proper use of these two types.
When there is a value type in a method, it is allocated on the stack (or allocated inline inside a struct)
and released automatically as the method-frame is deleted (a method-frame or stack-frame is the image
of a method created on the stack when a method is invoked). The method-frame is deleted as the
method returns control to its calling method. Value types are allocated in the method-frame and
accessed directly from the program code (without any indirect reference or handle).
For reference types, objects are always allocated in the managed heap area. A handle is also allocated, and that handle in turn points to the object allocated in the heap. The handle is useful during the compaction phase of garbage collection (discussed under the section 'Garbage Collection' later): when the heap object is moved to another location, it is enough to change the address stored in the handle. Since the program code accesses the object through the handle, it continues to work properly, as all accesses to the object are made only through the handle. However, allocating and maintaining a handle for each heap object has both space and time overheads: the space for the handle has to be allocated in the heap, and each access to the object involves an extra level of indirection (i.e. it goes through the handle instead of accessing the object directly).
Thus, using value types is often more efficient than using reference types. So, whenever possible, make use of value types, and for small classes (for example, a Point class representing Cartesian co-ordinates) it is better to go for structures, as structs are value types. Since value types are allocated on the stack, when a struct is copied to another, a copy of the whole struct is made, whereas when a reference type is assigned to another, only the reference is copied and both references then point to the same object (referred to as 'deep copy' and 'shallow copy' respectively). A reference type can hold null when it doesn't point to any object in the heap. Value types cannot hold null (as there is no reference available to hold the null value); it is only possible to set the struct members to their default values. The following sketch shows the difference in copy semantics.
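Here is a minimal sketch demonstrating the two copy behaviors (the Point types are illustrative):

// Sketch: copy semantics of value types vs. reference types.
// CopySemantics.cs
using System;

struct PointStruct { public int x; }   // value type
class PointClass { public int x; }     // reference type

class CopyDemo
{
    public static void Main()
    {
        PointStruct s1 = new PointStruct();
        PointStruct s2 = s1;      // the whole struct is copied
        s2.x = 10;
        Console.WriteLine(s1.x);  // prints 0 - s1 is unaffected

        PointClass c1 = new PointClass();
        PointClass c2 = c1;       // only the reference is copied
        c2.x = 10;
        Console.WriteLine(c1.x);  // prints 10 - both refer to one object
    }
}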

Boxing Operations and their Cost


.NET supports a common base class (System.Object) from which all types are derived. However, not all objects are managed in the same way, as we learnt from the differences in the way value and reference types are treated by the runtime. Thus, a conversion from a value type to a reference type (or the other way around) is not simple - it is a costly operation. Let us see how such conversions are handled internally and why the process is costly.
The conversion from a value type to a reference type is referred to as 'boxing' (as a wrapper object is created internally) and the conversion in the other direction is called 'unboxing'. It is always possible to convert a value type to a reference type, and hence an implicit conversion is enough. But going from a reference type to a value type, the heap object may or may not have a valid conversion to the value type - so an explicit conversion is needed. For example:
// implicit conversion from System.Int32 (value type) to System.Object
// (reference type); this conversion can never fail,
// so an implicit conversion is enough
object obj = 10;

// obj may refer to any other reference type, such as a string,
// so an explicit cast is required for unboxing
int i = obj;       // error - explicit cast required

int j = (int) obj; // OK, as the explicit cast is present; still, it may
                   // throw an InvalidCastException at runtime if obj
                   // refers to some other type, such as a string

Boxing and unboxing have corresponding MSIL instructions, 'box' and 'unbox'. When the 'box' instruction is executed, three operations take place:
- an object is allocated on the heap, pointed to by a handle from the stack frame
- the value stored in the value type is copied into the heap object
- the handle is returned
When an 'unbox' instruction is executed, these operations take place:
- if a valid conversion is available from the heap object (reference type) to the value type, the conversion is made
- if the conversion fails, an exception (InvalidCastException) is thrown
- the handle and the heap object become garbage (if no other references refer to that object) and the garbage collector may collect them later
As you can see, boxing involves allocation and initialization of heap memory, and unboxing leaves heap memory to be garbage collected. Such allocation, initialization, and garbage collection of heap memory are all costly operations that can affect the performance of the program. Since boxed objects tend to be small, boxing also contributes to fragmenting the memory into small pieces - this may force the garbage collector thread to run more often. Thus, repeated (and unnecessary) boxing and unboxing conversions can have an adverse impact on performance. Whenever possible, avoid boxing and unboxing operations.
Let me give an example of reducing the number of boxing and unboxing operations. If you use struct types extensively with collection classes, it can harm the performance of the code. Since structs are value types, they have to be boxed (and later unboxed to get the original struct back) to be stored in collection classes (as the Boxing.cs program shows, with profiling code). Note that, in general, making use of structs actually improves performance, as the extensive heap allocation and garbage collection overhead is not there for structs. The point here is to avoid using structs where many boxing (and unboxing) operations would result (such as using them extensively with collection classes), as frequent boxing/unboxing can hurt the performance of the code. Thus, a careful decision needs to be made about whether a struct or a class should be used for small objects - otherwise performance will suffer (as you can see from the sample output of the Boxing.cs program).
// Program that illustrates how extensive boxing (and unboxing) operations
// can adversely affect the performance of the code
// Boxing.cs

using System;
using System.Collections;
using System.Runtime.InteropServices;

// Point as a structure (value type)
struct PointStruct
{
    int x;
    int y;
    // other members
}

// Point as a class (reference type)
class PointClass
{
    int x;
    int y;
    // other members
}

class TestClass
{
    [DllImport("Kernel32")]
    private static extern bool QueryPerformanceCounter(ref long count);

    public static void StackForStructs()
    {
        Stack stk = new Stack();
        for (int i = 0; i < byte.MaxValue; i++) // for some number of times
            stk.Push(new PointStruct());
        // note that boxing is done here, since a struct value is pushed
    }

    public static void StackForClass()
    {
        Stack stk = new Stack();
        for (int i = 0; i < byte.MaxValue; i++) // for some number of times
            stk.Push(new PointClass());
        // note that no boxing is done here, as a class object is pushed
    }

    public static long Counter()
    {
        long count = 0;
        QueryPerformanceCounter(ref count);
        return count;
    }

    public static void Main(string[] args)
    {
        long countStart, countEnd;

        countStart = TestClass.Counter();
        TestClass.StackForStructs();
        countEnd = TestClass.Counter();
        Console.WriteLine("Counter value for stack pushes with boxing is: " +
                          (countEnd - countStart));

        countStart = TestClass.Counter();
        TestClass.StackForClass();
        countEnd = TestClass.Counter();
        Console.WriteLine("Counter value for stack pushes without boxing is: " +
                          (countEnd - countStart));
    }
}

// Counter value for stack pushes with boxing is: 44299


// Counter value for stack pushes without boxing is: 2321

In the Boxing.cs program, we use PointStruct (a value type) and PointClass (a reference type). Boxing needs to be done to store value types in a collection class, whereas reference type objects don't need boxing - that is the basic idea behind the program.
The code makes use of the Stack collection class to create and push PointStruct objects repeatedly. Similarly, PointClass objects are created and their references are pushed onto the stack. The Main method keeps track of the time required to perform the operations (similar to the code used in the TestClass.cs program). Once the sample has run, you can see the extent to which performance suffers because of repeated boxing operations (for pushing the value type objects onto the stack). A sketch of a boxing-free alternative follows.
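If profiling shows that such boxing is a bottleneck, one workaround is a strongly typed stack that stores the structs directly in an array, so no boxing occurs. This is a sketch under the assumption that you control the collection type (PointStructStack is illustrative, not a framework class):

// Sketch: a minimal strongly typed stack for PointStruct that
// avoids boxing by storing the values directly in an array.
// (Illustrative only; .NET of this era has no generics.)
class PointStructStack
{
    private PointStruct[] items = new PointStruct[16];
    private int count = 0;

    public void Push(PointStruct p)
    {
        if (count == items.Length)
        {
            // Grow the backing array when full.
            PointStruct[] bigger = new PointStruct[items.Length * 2];
            items.CopyTo(bigger, 0);
            items = bigger;
        }
        items[count++] = p;   // copied by value - no heap allocation
    }

    public PointStruct Pop()
    {
        return items[--count];
    }
}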

Garbage Collection

Why understand garbage collection?


In conventional programming languages like C (and C++), the programmer has to take care of allocating and freeing memory. Manual memory management can lead to problems such as dangling references (using a reference after the object is released) and memory leaks (objects that are no longer accessible from program code but were never freed). Garbage collection (GC) eliminates these problems by automatically re-collecting 'useless' objects, and lets us concentrate on solving the problem at hand instead of worrying about managing memory. In .NET, GC works for all managed objects, and the GC runs when the managed heap is full or more memory is required for allocation.
As GC automates releasing memory, why should we learn how GC works? It is true that we no longer need to worry about memory management. However, GC is only for memory; we are still responsible for properly releasing other resources such as file handles and database connections. Also, without a proper understanding of how garbage collection works, we would not be able to programmatically control GC behavior (discussed a bit later in this section).

Reference Counting GC
Advanced programmers may be aware of reference counting - a garbage collection algorithm which is also a useful programming technique. Each allocated object has a counter (called a 'reference counter') associated with it. Whenever the object is assigned to some reference, its reference count is increased by one (the reference count indicates the number of references pointing to the object). Whenever a reference goes out of scope, is assigned null, or points to some other object, the reference count is decreased by one. When the reference count for an object becomes zero, the object is ready for re-collection and can be re-allocated whenever required.
There are many practical uses for reference counting; for example, you can maintain a pool of objects (called 'object pooling') for object reuse, as the sketch below shows. Initially you obtain a certain fixed number of objects, and whenever required, you allocate objects from that pool. When your use of an object is over, you return it to the object pool for later reuse. Thus you can minimize unnecessary (and frequent) allocation and release of objects, improving the overall performance of the program. You can make use of this technique irrespective of the source language in which you program. This algorithm is not used by the .NET runtime for garbage collection - rather, a 'generational garbage collection algorithm' is used, which we will discuss next.
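First, here is a minimal sketch of the object-pooling technique just described (the Connection type and the pool class are hypothetical, used only for illustration):

// Sketch: a minimal object pool built on a Stack.
// ConnectionPool.cs (illustrative - not a framework class)
using System.Collections;

class Connection { /* some expensive-to-create object */ }

class ConnectionPool
{
    private Stack free = new Stack();

    public ConnectionPool(int size)
    {
        // Pre-allocate a fixed number of objects up front.
        for (int i = 0; i < size; i++)
            free.Push(new Connection());
    }

    public Connection Acquire()
    {
        // Reuse a pooled object if one is available.
        return free.Count > 0 ? (Connection) free.Pop()
                              : new Connection();
    }

    public void Release(Connection c)
    {
        // Return the object for later reuse instead of
        // leaving it for the garbage collector.
        free.Push(c);
    }
}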

Generational GC
The Microsoft implementation of .NET has a generational garbage collector that runs as a low-priority thread. The garbage collector starts from the roots - which include global and static objects, and references held in CPU registers - and recursively traverses all the objects that are reachable from them. Metadata information is used to determine which members of an object refer to other objects. By this process, the collector identifies all the objects that are allocated and still in use. All the objects that are not reachable by the application (i.e. they are garbage) can be re-collected from the managed heap.
Separating the useful objects (referred to as 'live objects') from the garbage comes with another problem - the heap memory becomes fragmented with a mix of live objects and garbage. A technique called 'compaction' moves all the live objects to a common part of the heap. Once the heap memory is no longer fragmented, it is in a better position to satisfy the application's requests to allocate new objects.
As you can see, this memory re-collection process can take a lot of time - in particular, locating all the live objects to separate them from the garbage, and copying live objects (for compaction), can take up much of the processor time. To avoid unnecessarily traversing and copying all the objects, the scope and lifetime of the objects also need to be considered. Empirical data shows that recently allocated objects (and temporary objects) can be expected to become garbage soon, whereas global and static objects remain alive for a long time. This important observation is taken into account by dividing the managed heap into three 'generations'.
A generation is a part of the heap, with each generation relating to the lifetime of its objects. Newly allocated objects are placed in generation 0, and when garbage collection needs to run, re-collection is done only on this generation. The objects that survive are marked as generation 1, and another part of the heap becomes generation 0, from which new objects are allocated. The collector first collects generation 0, and only if that does not free enough space does it also collect generation 1. The objects that are live after a collection of generation 1 are moved to generation 2. Thus, not all the heap memory is traversed and compacted by the garbage collector - the GC runs on only a part of the heap. The garbage collection process thus achieves better performance by separating the heap into generations based on how long objects 'live'. You can observe this promotion through generations programmatically, as the sketch below shows.
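The GC.GetGeneration method reports which generation an object currently belongs to, so promotion can be observed directly. A small sketch (GC.Collect is forced here purely for demonstration; the exact numbers can vary):

// Sketch: observing generational promotion with GC.GetGeneration.
// Generations.cs
using System;

class GenerationDemo
{
    public static void Main()
    {
        object obj = new object();

        // A freshly allocated object starts in generation 0.
        Console.WriteLine("After allocation: gen " + GC.GetGeneration(obj));

        // Surviving a collection typically promotes it to generation 1.
        GC.Collect();
        Console.WriteLine("After one collection: gen " + GC.GetGeneration(obj));

        // Surviving another typically promotes it to generation 2.
        GC.Collect();
        Console.WriteLine("After two collections: gen " + GC.GetGeneration(obj));
    }
}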

Resource Finalization
Remember that garbage collection is only for memory, not for external resources that an object may hold, such as file handles and database connections. Before an object is garbage collected, such resources need to be released properly. Since explicit memory deallocation is not supported in .NET (garbage collection takes care of it), it is not possible to support C++-like destructors in .NET. How, then, do you release external resources? The code for releasing such resources can be placed in an override of the Finalize method of the System.Object class. Then how is it possible that there are destructors in C#.NET, you may ask? The compiler internally converts C# destructors into Finalize methods.
If an object that is identified as garbage overrides the Finalize method, then the method needs to be called before that garbage object's memory is returned for reallocation. Internally, the runtime places an entry for such an object into a list known as the 'finalization list'. After GC occurs, the finalization thread wakes up and calls all the Finalize methods before the objects are cleaned up. Thus, the support for finalization makes the garbage collection process a little complicated and is an overhead to the GC process:
- a separate thread is needed for executing Finalize methods
- creation and re-collection of the objects takes a little more time: an entry needs to be made in the finalization list, and since an additional cycle is required to collect objects on the finalization list, both creation and re-collection of objects with a Finalize method take more time
- it is not assured that the Finalize methods will indeed be called by the runtime; in some cases (such as abnormal program termination), the methods may not be called at all
- the garbage collector needs to maintain the finalization list
- the order in which finalization methods are called cannot be predicted
As you can see, there are many practical problems in using the Finalize method for releasing external resources. External resources are generally available in limited numbers (such as the limits on the number of open files, database handles, etc.), so it is essential that such resources be released immediately after use. However, you have a problem here: you cannot predict when the garbage collector will call the Finalize methods (this is called 'non-deterministic finalization' in technical jargon). Deterministic finalization (i.e. releasing the resources immediately after use) can be achieved by implementing the IDisposable.Dispose method.

Deterministic Finalization
The Dispose method of the IDisposable interface can be used for programmatically releasing external resources. Even if you forget (or purposely choose not) to call the Dispose method explicitly, the runtime will take care of calling the destructor (internally, the Finalize method) to release the resources.
The program Dispose.cs illustrates how to implement the Dispose method. TextReader is a class useful for reading textual input, and here it is used to read text from a file. So, the file handle is the resource here, and it should be properly returned to the operating system as soon as its use is over. A constructor is used to open the file. The text file is assumed to contain some information, and a line of text is read in the foo method.
Now, it is time to implement the Dispose method. Remember the following facts:
- the Dispose method is called explicitly from the program
- if the programmer forgets to call the Dispose method, the destructor should release the resource
To satisfy these two requirements, it is wise to have a separate method to release the resources and call it both from the Dispose method and from the destructor. It is an error to release the same resource twice, so since the Dispose method is called explicitly to release the resources, you have to take care that the destructor is not called again to release the same resources. How do you disable the destructor call? It is achieved by calling the GC.SuppressFinalize method. This method removes the object from the garbage collector's finalization list, so the Finalize method will not be called for the object when it is garbage collected (the use of the GC.SuppressFinalize method is discussed a bit later in this section).
The driver code inside the Main method tests whether we have properly implemented deterministic finalization using the Dispose method. It checks by calling the Dispose method explicitly for one object and by allowing the destructor to release the resource automatically for another object.
// program for demonstrating deterministic finalization with Dispose method
// Dispose.cs
using System;
using System.IO;

public class Test : IDisposable
{
    TextReader tr;

    Test()
    {
        tr = new StreamReader(@"D:\GANESH\TEXT.TXT");
        // open other resources here
    }

    void foo()
    {
        string str = tr.ReadLine();
        Console.WriteLine(str);
    }

    public void CloseFile()
    {
        tr.Close();
        // close other resources here
    }

    public void Dispose()
    {
        Console.WriteLine("Inside the Dispose Method");
        CloseFile();
        // since Dispose has been called, disable the Finalize method
        GC.SuppressFinalize(this);
        // removes this object from the finalization list
    }

    ~Test()
    {
        Console.WriteLine("Inside the Destructor");
        CloseFile();
    }

    public static void Main()
    {
        Test t = new Test();
        t.foo();
        t.Dispose();
        // call Dispose explicitly to release resources;
        // prints the contents of the file
        // and after that the message "Inside the Dispose Method"

        t = new Test();
        t.foo();
        // don't call Dispose explicitly...
        // the destructor is invoked automatically to release the resources;
        // prints the contents of the file
        // and after that the message "Inside the Destructor"
    }
}
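As an aside (standard C#, though not used in the program above): the using statement gives you deterministic disposal without an explicit call - at the end of the block, the compiler emits a call to Dispose, even if an exception is thrown. The first part of Main could be written as:

// equivalent to creating the object, calling foo, and then calling
// Dispose in a finally block
using (Test t = new Test())
{
    t.foo();
}   // t.Dispose() is called automatically here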

Programmatically controlling the GC behavior


There are situations in which it is necessary to control the GC's behavior, and the System.GC class provides facilities for doing so. For example, in environments where resources are constrained, it may be necessary to invoke the GC explicitly so that garbage can be collected and the memory made available for reallocation. You can use the System.GC.Collect method to get the garbage collector thread scheduled to run. (Note that it is only scheduled to run, not necessarily run immediately, as the garbage collector is a thread and has to cooperate with the other threads of the application.)
System.GC.SuppressFinalize removes an object's entry from the garbage collector's finalization list (the entry placed there when the object was created). By calling this method, you can avoid unnecessary calls to the Finalize method. For one of its uses, in implementing the Dispose method, see "Deterministic Finalization", earlier in this section.
// program for demonstrating the use of GC.SuppressFinalize method
// Finalize.cs

using System;

public class Test
{
    void foo()
    {
        GC.SuppressFinalize(this);
        // removes this object from the finalization list
    }

    ~Test()
    {
        Console.WriteLine("Inside the Destructor");
    }

    public static void Main()
    {
        (new Test()).foo();
        // for this Test object, the destructor will not be called

        new Test();
        // for this Test object, the destructor is called;
        // prints the message "Inside the Destructor" on the screen
    }
}
In this example, the foo method calls GC.SuppressFinalize to instruct the garbage collector to remove that object's entry from the finalization list. So, the destructor (which is internally a Finalize method) will not be called for this object. If you don't call the foo method, the destructor for the object will be called. The Main method illustrates this by creating two objects and calling the foo method on only one of them.

Harnessing .NET
Are there any practical uses in learning the internals of .NET? Yes, there are. You can make the best use of many tools and facilities supported in .NET if you have a good understanding of its internals. The important advanced tools (all shipped with Microsoft Visual Studio .NET and usually located in "\Program Files\Microsoft.Net\FrameworkSDK\Bin") that you can make best use of when you have knowledge about .NET internals are:
- the assembly linker (al.exe), which helps to link various assemblies into a single assembly
- the intermediate language assembler (ilasm.exe), for assembling MSIL code
- the intermediate language disassembler (ildasm.exe), for disassembling and viewing MSIL code and examining metadata (debugging using the ildasm tool is covered under the section "Debugging with ildasm tool")
- the native image generator (ngen.exe), which supports PreJIT (refer to the section "JIT and PreJIT")
Coming to facilities, the important advanced facilities supported by .NET include:
- the reflection facility (the System.Reflection namespace), used to load, examine, and execute assemblies dynamically. You can also create dynamic assemblies using the classes in the System.Reflection.Emit namespace - creating, loading, and executing assemblies, all dynamically (see the sketch below). Note that assemblies created by compiling with a language compiler are referred to as static assemblies.
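As a taste of dynamic assemblies, here is a minimal System.Reflection.Emit sketch that builds a type with one method at runtime and invokes it (all the names used are illustrative):

// Sketch: creating and executing a dynamic assembly with
// System.Reflection.Emit. All names here are illustrative.
using System;
using System.Reflection;
using System.Reflection.Emit;

class EmitDemo
{
    public static void Main()
    {
        // Define an in-memory (dynamic) assembly and module.
        AssemblyName name = new AssemblyName();
        name.Name = "DynamicAsm";
        AssemblyBuilder asm = AppDomain.CurrentDomain.DefineDynamicAssembly(
            name, AssemblyBuilderAccess.Run);
        ModuleBuilder mod = asm.DefineDynamicModule("DynamicMod");

        // Define a type with one static method: int GetAnswer().
        TypeBuilder tb = mod.DefineType("DynamicType", TypeAttributes.Public);
        MethodBuilder mb = tb.DefineMethod("GetAnswer",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), Type.EmptyTypes);

        // Emit the MSIL body: push 42 and return it.
        ILGenerator il = mb.GetILGenerator();
        il.Emit(OpCodes.Ldc_I4, 42);
        il.Emit(OpCodes.Ret);

        // Bake the type and invoke the generated method.
        Type t = tb.CreateType();
        object result = t.GetMethod("GetAnswer").Invoke(null, null);
        Console.WriteLine("Dynamic method returned: " + result);
    }
}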
This section covers the reflection facility to show how knowledge about the internal details of .NET
can be useful in practice. When you understand the internals of .NET, you are in a better position to
understand and use reflection. This is because reflection involves low-level facilities such as the
internal details of assemblies, types, and metadata.

Using Reflection
Reflection is a very powerful and flexible feature available in .NET. Reflection provides the ability for a program to analyze itself, manipulate code dynamically, and even create new types and use them at runtime (referred to as execution-time code generation). Reflection is extremely useful for system tools like code analyzers and debuggers.
Attributes are a very useful feature supported in C# that is not generally available in other languages. By using reflection, you can view the custom attributes used in a class. The idea is to load the class at runtime, analyze its contents by getting information about the class, get the attribute information stored in various parts of the code, and display it on the screen. Without such a facility, we would have to resort to reading documentation, which can be tedious in practice. With reflection, we can readily examine any assembly and see what custom attributes are used in the code.
The following C# code illustrates how a custom attribute type can be created and used. It shows how to employ reflection to get information about a class and the attribute information stored in it. An attribute for managing code-maintenance information is a widely used and practical example of custom attributes.
// Code for creating a new custom attribute named CodeMaintenanceAttribute
// CodeMaintenanceAttribute.cs

using System;

[AttributeUsage(AttributeTargets.Method)]
class CodeMaintenanceAttribute : Attribute
{
    public CodeMaintenanceAttribute(string author, string modifiedDate,
                                    string comment)
    {
        this.author = author;
        this.modifiedDate = modifiedDate;
        this.comment = comment;
    }

    private string author;
    private string modifiedDate;
    private string comment;

    public string Author
    {
        get { return author; }
        set { author = value; }
    }

    public string ModifiedDate
    {
        get { return modifiedDate; }
        set { modifiedDate = value; }
    }

    public string Comment
    {
        get { return comment; }
        set { comment = value; }
    }
}

// Code for making use of the custom attribute. Also, code using reflection
// to examine and display such custom attribute information
// Reflection.cs

using System;
using System.Reflection;

class ReflectionClass
{
    [CodeMaintenance("John", "3-3-2002", "Updated this thing")]
    public void SomeMethod()
    {
    }

    [CodeMaintenance("Rajiv", "4-3-2002", "Updated that thing")]
    public void SomeAnotherMethod()
    {
    }

    public static void Main()
    {
        foreach (MethodInfo mtd in typeof(ReflectionClass).GetMethods())
        {
            foreach (Attribute attr in mtd.GetCustomAttributes(true))
            {
                // skip any attribute that is not a CodeMaintenanceAttribute
                CodeMaintenanceAttribute cma = attr as CodeMaintenanceAttribute;
                if (cma == null)
                    continue;
                Console.WriteLine("Code Maintenance info for method : " + mtd);
                Console.WriteLine("Author of the code change : " + cma.Author);
                Console.WriteLine("Modified on : " + cma.ModifiedDate);
                Console.WriteLine("Comment on modification : " + cma.Comment);
            }
        }
    }
}

// Sample output
// Code Maintenance info for method : Void SomeMethod()
// Author of the code change : John
// Modified on : 3-3-2002
// Comment on modification : Updated this thing

// Code Maintenance info for method : Void SomeAnotherMethod()


// Author of the code change : Rajiv
// Modified on : 4-3-2002
// Comment on modification : Updated that thing

Now let's look at how the code works. The program CodeMaintenanceAttribute.cs creates a custom attribute to hold code-maintenance information (the author name, the date when the change was made, and any comments on the change). It is written following the standard conventions for writing a custom attribute class.
The program Reflection.cs has two instance methods that make use of the CodeMaintenanceAttribute. The Main method is used to find the attribute information.
The GetMethods method requires a type to operate on - here it is the ReflectionClass type that is to be examined. The typeof operator returns a Type object that is used for getting the method information (you can also obtain the runtime type information of a ReflectionClass instance using the GetType method that every object inherits from System.Object). The MethodInfo class (in the System.Reflection namespace) and the Attribute class (in the System namespace) are capable of holding information about the methods of a class and its custom attributes respectively. GetCustomAttributes is a method of the MethodInfo class which is used to get information about the custom attributes (if any are available). The remaining code is simple - it just displays the attribute information on the screen.
As you can see from this coding example, reflection is a very powerful mechanism to make use of.

Case Study Review


The design of the .NET Framework is based on the concept of language interoperability. The CLR is the execution engine of .NET, and the CTS serves as the glue for supporting a wide variety of languages. The runtime supports Just-In-Time compilation, converting the intermediate code to native code for faster execution.
The language compilers generate PE files and assemblies irrespective of the source language, and the program is represented as stack-based MSIL code plus metadata describing the program to the runtime. JIT and PreJIT are the two major ways of converting the intermediate code to native code for execution. You can make use of the ildasm tool for debugging applications and also to learn how the CLR works.
Object allocation is in the control of the application, whereas deallocation is taken care of by the garbage collector. Value types are allocated on the stack and reference types on the heap - this causes the major difference in the performance of the two kinds of types. Boxing and unboxing are costly operations, as they involve creation, initialization, and garbage collection of heap objects.
The garbage collector is a low-priority thread that runs to re-collect unreferenced objects. The .NET GC follows the concept of generations for efficient collection of objects. Resource finalization complicates the GC process; for releasing external resources promptly after use, deterministic finalization in the form of the Dispose method should be used. You can programmatically control the GC to some extent using the facilities provided in the System.GC class.
There are many practical benefits to understanding the internals of .NET. You can use the many tools provided with .NET, and facilities such as reflection, in a better way when you have some knowledge of the internals. Reflection is a powerful capability and has invaluable use in writing system tools for .NET.
This case study provided a programmer's introduction to the .NET Common Language Runtime (CLR) and the internals of .NET in general, and showed how such knowledge is valuable in programming for the .NET Framework.

Any Limitations or Further Work


This case study can serve as a starting point for learning the use of advanced tools shipped with .NET.
It is left for the reader to learn advanced programming aspects such as dynamic code emission - this
case study provides the base for learning it.

All rights reserved. Copyright Jan 2004.
