.NET Framework Developer's Guide

Compiling MSIL to Native Code
Before you can run Microsoft intermediate language (MSIL), it must be compiled against the common language runtime to native code for the target machine architecture. The .NET Framework provides two ways to perform this conversion:

• A .NET Framework just-in-time (JIT) compiler.
• The .NET Framework Native Image Generator (Ngen.exe).

Compilation by the Just-in-Time Compiler
JIT compilation converts MSIL to native code on demand at application run time, when the contents of an assembly are loaded and executed. Because the common language runtime supplies a JIT compiler for each supported CPU architecture, developers can build a set of MSIL assemblies that can be JIT-compiled and run on different computers with different machine architectures. However, your managed code will run only on a specific operating system if it calls platform-specific native APIs, or a platform-specific class library.

JIT compilation takes into account the fact that some code might never get called during execution. Rather than using time and memory to convert all the MSIL in a portable executable (PE) file to native code, it converts the MSIL as needed during execution and stores the resulting native code in memory so that it is accessible for subsequent calls in the context of that process. The loader creates and attaches a stub to each method in a type when the type is loaded and initialized. When a method is called for the first time, the stub passes control to the JIT compiler, which converts the MSIL for that method into native code and modifies the stub to point directly to the generated native code. Subsequent calls to the JIT-compiled method therefore proceed directly to the native code.

Install-Time Code Generation Using Ngen.exe
Because the JIT compiler converts an assembly's MSIL to native code when individual methods defined in that assembly are called, it necessarily involves a performance hit at run time. In most cases, that performance hit is acceptable. More importantly, the code generated by the JIT compiler is bound to the process that triggered the compilation. It cannot be shared across multiple processes.
To allow the generated code to be shared across multiple invocations of an application or across multiple processes that share a set of assemblies, the common language runtime supports an ahead-of-time compilation mode. This ahead-of-time compilation mode uses the Native Image Generator (Ngen.exe) to convert MSIL assemblies to native code much like the JIT compiler does. However, the operation of Ngen.exe differs from that of the JIT compiler in three ways:

• It performs the conversion from MSIL to native code before rather than while running the application.
• It compiles an entire assembly at a time, rather than a method at a time.
• It persists the generated code in the Native Image Cache as a file on disk.

Code Verification
As part of compiling MSIL to native code, the MSIL code must pass a verification process unless an administrator has established a security policy that allows the code to bypass verification. Verification examines MSIL and metadata to find out whether the code is type safe, which means that it only accesses the memory locations it is authorized to access. Type safety helps isolate objects from each other and therefore helps protect them from inadvertent or malicious corruption. It also provides assurance that security restrictions on code can be reliably enforced.

The runtime relies on the fact that the following statements are true for code that is verifiably type safe:

• A reference to a type is strictly compatible with the type being referenced.
• Only appropriately defined operations are invoked on an object.
• Identities are what they claim to be.

During the verification process, MSIL code is examined in an attempt to confirm that the code can access memory locations and call methods only through properly defined types. For example, code cannot allow an object's fields to be accessed in a manner that allows memory locations to be overrun. Additionally, verification inspects code to determine whether the MSIL has been correctly generated, because incorrect MSIL can lead to a violation of the type safety rules. The verification process passes a well-defined set of type-safe code, and it passes only code that is type safe. However, some type-safe code might not pass verification because of some limitations of the verification process, and some languages, by design, do not produce verifiably type-safe code. If type-safe code is required by the security policy but the code does not pass verification, an exception is thrown when the code is run.

.NET Memory Management and Garbage Collection
C and C++ programs have traditionally been prone to memory leaks because developers had to manually allocate and free memory. In Microsoft® .NET, it is not necessary to do this because .NET uses garbage collection to automatically reclaim unused memory. This makes memory usage safer and more efficient. The garbage collector (GC) uses references to keep track of objects that occupy blocks of memory. When an object is set to null or is no longer in scope, the GC marks the object as reclaimable. The GC can return the blocks of memory referenced by these reclaimable objects to the operating system. The performance benefit of a GC arises from deferring the collection of objects, as well as from performing a large number of object collections at once. GCs tend to use more memory than typical memory management routines, such as those used by the Windows-based operating systems.

The GC in .NET uses the Microsoft Win32® VirtualAlloc() application programming interface (API) to reserve a block of memory for its heap. A .NET managed heap is a large, contiguous region of virtual memory. The GC first reserves virtual memory, and then commits the memory as the managed heap grows. The GC keeps track of the next available address at the end of the managed heap and places the next allocation request at this location. Thus, all .NET managed memory allocations are placed in the managed heap one after another. This vastly improves allocation time because it isn't necessary for the GC to search through a free list or a linked list of memory blocks for an appropriately sized free block, as normal heap managers do. Over time, holes begin to form in the managed heap as objects are deleted. When garbage collection occurs, the GC compacts the heap and fills the holes by moving allocations using a straight memory copy. Figure 2.1 shows how this works.

Figure 2.1. How the garbage collector compacts the heap

For more details on the .NET garbage collection mechanism, see the following references:

"Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework", by Jeffrey Richter, MSDN Magazine, November 2000.

"Garbage Collection—Part 2: Automatic Memory Management in the Microsoft .NET Framework", by Jeffrey Richter, MSDN Magazine, December 2000.

Chapter 19, "Automatic Memory Management (Garbage Collection)" in Applied Microsoft .NET Framework Programming by Jeffrey Richter (Microsoft Press, 2002).

The GC improves memory management performance by dividing objects into generations based on age. When a collection occurs, objects in the youngest generation are collected. If this does not free enough memory, successively older generations can also be collected. The use of generations means that the GC only has to work with a subset of the allocated objects at any one time. The GC currently uses three generations, numbered 0, 1, and 2. Allocated objects start out belonging to generation 0. Collections can have a depth of 0, 1, or 2. All objects that exist after a collection with a depth of 0 are promoted to generation 1. Objects that exist after a collection with a depth of 1, which will collect both generation 0 and 1, move into generation 2. Figure 2.2 shows how the migration between generations occurs.

Figure 2.2. Migration between generations during multiple collections

Over time, the higher generations are filled with the oldest objects. These higher generations should be more stable and require fewer collections; therefore, fewer memory copies occur in the higher generations. Collection for a specific generation occurs when the memory threshold for that generation is hit. In the implementation of .NET Version 1.0, the initial thresholds for generations 0, 1, and 2 are 256 kilobytes (KB), 2 megabytes (MB), and 10 MB, respectively. Note that the GC can adjust these thresholds dynamically based on an application's patterns of allocation. Objects larger than 85 KB are automatically placed in the large object heap, which is discussed later in this chapter.
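The promotion of surviving objects described above can be observed with the GC class. The following is a minimal sketch, not a definitive test: the class and method names (GenerationDemo, Observe) are made up for illustration, and the exact generation reported after each forced collection depends on the runtime, so no particular output is claimed.

```csharp
using System;

public static class GenerationDemo
{
    // Reports the generation of one small object when it is first allocated,
    // after a depth-0 collection, and after a depth-1 collection.
    public static int[] Observe()
    {
        object sample = new object();

        int whenAllocated = GC.GetGeneration(sample);

        GC.Collect(0);                       // collect generation 0 only
        int afterDepth0 = GC.GetGeneration(sample);

        GC.Collect(1);                       // collect generations 0 and 1
        int afterDepth1 = GC.GetGeneration(sample);

        GC.KeepAlive(sample);                // keep the object rooted until here
        return new[] { whenAllocated, afterDepth0, afterDepth1 };
    }
}
```

Calling GenerationDemo.Observe() from a console application and printing the three values, together with GC.MaxGeneration, is one way to watch an object migrate. Forcing collections like this is for demonstration only; production code normally leaves the timing to the GC.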

The GC uses object references to determine whether or not a specific block of memory in the managed heap can be collected. Unlike other GC implementations, there is not a heap flag on each allocated block indicating whether or not the block can be collected. For each application, the GC maintains a tree of references that tracks the objects referenced by the application. Figure 2.3 shows this tree.

Figure 2.3. Root reference tree

The GC considers an object to be rooted if the object has at least one parent object that holds a reference to it. Every application in .NET has a set of roots, which includes global and static objects, as well as associated thread stacks and dynamically instantiated objects. Before performing a garbage collection, the GC starts from the roots and works downward to build a tree of all variable references. The GC builds a master list of all live objects, and then walks the managed heap looking for objects that are not in this live object list.

This would appear to be an expensive way of determining whether or not an object is alive, compared with using a simple flag in a memory block header or a reference counter, but it does ensure complete accuracy. For example, an object reference counter could be mistakenly over-referenced or under-referenced, and a heap flag could be mistakenly set as deleted when there are live references to the memory block. The managed heap avoids these issues by enumerating all live objects and building a list of all referenced objects before collection. As a bonus, this method also handles circular memory reference issues.

If there is a live reference to an object, that object is said to be strongly rooted. .NET also introduces the notion of weakly rooted references. A weak reference provides a way for programmers to indicate to the GC that they want to be able to access an object, but they don't want to prevent the object from being collected. Such an object is available until it is collected by the GC. For example, you could allocate a large object, and rather than fully deleting and collecting the object, you could hold onto it for possible reuse, as long as there is no memory pressure to clean up the managed heap. Thus, weak references behave somewhat like a cache.
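The cache-like use of weak references described above might look like the following sketch. It relies on the System.WeakReference class; the class and method names (WeakCacheDemo, GetOrRebuild, TargetAvailableWhileRooted) are hypothetical and exist only for this illustration.

```csharp
using System;

public static class WeakCacheDemo
{
    // Returns the cached buffer if the GC has not collected it yet;
    // otherwise allocates a new one and stores it back in the weak reference.
    public static byte[] GetOrRebuild(WeakReference weak, int size)
    {
        byte[] cached = weak.Target as byte[];   // null once the target is collected
        if (cached == null)
        {
            cached = new byte[size];
            weak.Target = cached;
        }
        return cached;
    }

    // While a strong reference exists, the weak reference's target stays available.
    public static bool TargetAvailableWhileRooted()
    {
        byte[] buffer = new byte[1024];
        WeakReference weak = new WeakReference(buffer);
        bool available = weak.IsAlive && weak.Target != null;
        GC.KeepAlive(buffer);                    // strong reference held until here
        return available;
    }
}
```

Note that the caller copies Target into a local variable before using it. That local is a strong reference, so the object cannot be collected between the null check and its use.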

Large Object Heap
The .NET memory manager places all allocations of 85,000 bytes or larger into a separate heap called the large object heap. This heap consists of a series of virtual memory blocks that are independent from the main managed heap. Using a separate heap for larger objects makes garbage collection of the main managed heap more efficient because collection requires moving memory, and moving large blocks of memory is expensive. However, the large object heap is never compacted; this is something you must consider when you make large memory allocations in .NET. For example, if you allocate 1 MB of memory to a single block, the large object heap expands to 1 MB in size. When you free this object, the large object heap does not decommit the virtual memory, so the heap stays at 1 MB in size. If you allocate another 500-KB block later, the new block is allocated within the 1 MB block of memory belonging to the large object heap. During the process lifetime, the large object heap always grows to hold all the large block allocations currently referenced, but never shrinks when objects are released, even if a garbage collection occurs. Figure 2.4 shows an example of a large object heap.

Figure 2.4. Large object heap
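One way to see the size threshold in action is to ask the GC which generation it reports for a freshly allocated array. This is a small sketch under the assumption, which matches Microsoft's GC documentation, that large-object-heap allocations are reported as belonging to the highest generation. The names LargeObjectDemo and GenerationOf are invented for the example.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given size and returns the generation
    // the GC reports for it immediately after allocation.
    public static int GenerationOf(int byteCount)
    {
        byte[] block = new byte[byteCount];
        int generation = GC.GetGeneration(block);
        GC.KeepAlive(block);   // keep the array rooted until the query is done
        return generation;
    }
}
```

Comparing LargeObjectDemo.GenerationOf(1000) with LargeObjectDemo.GenerationOf(1000000) in a console application shows the difference between an ordinary allocation and one that lands in the large object heap.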

Binding Policy in .NET
by Mike Gunderloy
03/17/2003

So you're ready to deploy version 1.1 of your library, and you'd like it to replace version 1.0 for existing applications. Or perhaps something else has globally upgraded to 1.1, and you need to downgrade it for a particular application where 1.1 is causing problems. Handling these issues for .NET applications is the job of runtime binding policy. In this article, I'll explain the basics of runtime binding policy, and show you how you can customize the process for your own applications.

.NET applications use two different types of assemblies: private assemblies and shared assemblies. Private assemblies are identified by their name and are deployed for the use of only a single application. Shared assemblies are identified by a strong name (a type of digital identity that includes the name of the assembly, its version number, its culture identity, and a public key token).

When you build an assembly, information about all other assemblies that it refers to is stored in the assembly manifest. The manifest, however, does not store the exact path of the assembly because this path might differ on the computer where the assembly is deployed. At runtime, when a class is referenced, the Common Language Runtime (CLR) reads the assembly manifest, retrieves the identification information for the referenced assembly, and then attempts to locate the referenced assembly. The mechanism used by the CLR to locate a private assembly is different from that used for a shared assembly. It's easy for the CLR to tell the difference, because public key tokens are only stored in the manifest for shared assemblies.
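To see the identity information that a manifest records for each referenced assembly, you can enumerate the references through reflection. The sketch below uses Assembly.GetReferencedAssemblies and AssemblyName from System.Reflection. The helper names (ManifestDemo, Describe, ReferencesOf) are my own, and the output format is just one readable choice.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ManifestDemo
{
    // Formats the four parts of an assembly identity: name, version,
    // culture, and public key token ("null" when there is no token).
    public static string Describe(AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken();
        string tokenText = (token == null || token.Length == 0)
            ? "null"
            : BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();
        string culture = string.IsNullOrEmpty(name.CultureName) ? "neutral" : name.CultureName;
        return name.Name + ", Version=" + name.Version
            + ", Culture=" + culture + ", PublicKeyToken=" + tokenText;
    }

    // Lists the identity of every assembly referenced in the given assembly's manifest.
    public static List<string> ReferencesOf(Assembly assembly)
    {
        var lines = new List<string>();
        foreach (AssemblyName reference in assembly.GetReferencedAssemblies())
        {
            lines.Add(Describe(reference));
        }
        return lines;
    }
}
```

For example, printing each entry of ManifestDemo.ReferencesOf(Assembly.GetExecutingAssembly()) shows which references carry a public key token and which do not. No path appears anywhere in the output, which is why the CLR needs the probing rules described next.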

Binding Policy for Private Assemblies
Here are the steps that the CLR takes when you call code from a private assembly:

1. The CLR uses the manifest to determine the name of the requested assembly.
2. The CLR checks to see whether the requested assembly has already been loaded. If it has, then the CLR binds to the loaded copy and stops searching.
3. The CLR checks your application's configuration file to see whether it contains any path hints. Path hints are stored in the <probing> element, as in this example:

   <?xml version="1.0"?>
   <configuration>
     <runtime>
       <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
         <probing privatePath="bin\path1;bin\path2" />
       </assemblyBinding>
     </runtime>
   </configuration>

   This particular example adds the bin\path1 and bin\path2 folders to the list that the CLR checks for private assemblies.
4. The next step depends on whether the referenced assembly is for a particular culture. Either way, the CLR checks a list of locations for the assembly, but the list differs. If there is no culture information, then the search order is as follows:

   o ApplicationBase\AssemblyName.dll
   o ApplicationBase\AssemblyName\AssemblyName.dll
   o ApplicationBase\PrivatePath1\AssemblyName.dll
   o ApplicationBase\PrivatePath1\AssemblyName\AssemblyName.dll
   o ApplicationBase\PrivatePath2\AssemblyName.dll
   o ApplicationBase\PrivatePath2\AssemblyName\AssemblyName.dll
   o ApplicationBase\AssemblyName.exe
   o ApplicationBase\AssemblyName\AssemblyName.exe
   o ApplicationBase\PrivatePath1\AssemblyName.exe
   o ApplicationBase\PrivatePath1\AssemblyName\AssemblyName.exe
   o ApplicationBase\PrivatePath2\AssemblyName.exe
   o ApplicationBase\PrivatePath2\AssemblyName\AssemblyName.exe

   If there is culture information for the target assembly, then the search list is a bit different:

   o ApplicationBase\Culture\AssemblyName.dll
   o ApplicationBase\Culture\AssemblyName\AssemblyName.dll
   o ApplicationBase\PrivatePath1\Culture\AssemblyName.dll
   o ApplicationBase\PrivatePath1\Culture\AssemblyName\AssemblyName.dll
   o ApplicationBase\PrivatePath2\Culture\AssemblyName.dll
   o ApplicationBase\PrivatePath2\Culture\AssemblyName\AssemblyName.dll
   o ApplicationBase\Culture\AssemblyName.exe
   o ApplicationBase\Culture\AssemblyName\AssemblyName.exe
   o ApplicationBase\PrivatePath1\Culture\AssemblyName.exe
   o ApplicationBase\PrivatePath1\Culture\AssemblyName\AssemblyName.exe
   o ApplicationBase\PrivatePath2\Culture\AssemblyName.exe
   o ApplicationBase\PrivatePath2\Culture\AssemblyName\AssemblyName.exe

   Here, ApplicationBase is the directory in which the requesting application is installed, AssemblyName is the name of the assembly to locate, Culture is the culture code for the target assembly, and PrivatePath1 and PrivatePath2 are the hints provided in the <probing> element of the application configuration file. Of course, if there are more than two hints, each one is searched in turn. As soon as the CLR finds a matching assembly, it binds to the assembly and stops searching.
5. If none of these locations yields a copy of the assembly, then the binding process fails and your application won't run.
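The two search lists above follow a regular pattern, which the following sketch reproduces. It is an illustration of the order listed in this article, not the CLR's actual probing code. ProbingSketch and ProbePaths are hypothetical names, and the backslash separator is hard-coded to match the paths shown.

```csharp
using System;
using System.Collections.Generic;

public static class ProbingSketch
{
    // Builds the candidate locations in the order listed in the article:
    // all .dll candidates first, then all .exe candidates; within each,
    // the application base comes before each privatePath hint.
    // Pass null or "" for culture when the assembly is culture-neutral.
    public static List<string> ProbePaths(
        string applicationBase, string assemblyName, string culture, string[] privatePaths)
    {
        var roots = new List<string> { applicationBase };
        foreach (string hint in privatePaths)
        {
            roots.Add(applicationBase + "\\" + hint);
        }

        var candidates = new List<string>();
        foreach (string extension in new[] { ".dll", ".exe" })
        {
            foreach (string root in roots)
            {
                string directory = string.IsNullOrEmpty(culture) ? root : root + "\\" + culture;
                candidates.Add(directory + "\\" + assemblyName + extension);
                candidates.Add(directory + "\\" + assemblyName + "\\" + assemblyName + extension);
            }
        }
        return candidates;
    }
}
```

With the privatePath value from the configuration example split on semicolons, ProbePaths(@"C:\App", "MyLib", null, new[] { @"bin\path1", @"bin\path2" }) yields twelve candidates, matching the twelve entries of the culture-neutral list. The application directory and assembly name here are made-up values.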

Binding Policy for Shared Assemblies
Here are the steps that the CLR takes when you call code from a shared assembly:

1. The CLR determines the correct version of the assembly to load by examining the applicable configuration files. I'll explain this process in the next section of this article.
2. The CLR checks to see whether the requested assembly has already been loaded. If it has, then the CLR binds to the loaded copy and stops searching.
3. The CLR then checks for the requested assembly in the Global Assembly Cache (GAC). If the assembly is in the GAC, then the CLR uses that copy and stops searching.
4. The CLR next looks for a <codeBase> element in the application's configuration file. If one is present, it checks that path for the assembly, loading it if found.
5. If the requested assembly hasn't been located yet, the CLR proceeds to search for it as if it were a private assembly, following the rules from the previous section.

Determining the Proper Version
Here's where it gets fun. Remember that I started the article by discussing scenarios in which you might like to customize the binding process? You can do this with a series of configuration files that allow you to tell an application to use a particular version of a shared assembly, even if it was compiled with a different version. These binding policies are applied at three levels:

1. Application Policy Resolution
2. Publisher Policy Resolution
3. Administrator Policy Resolution

In the application policy resolution stage, the CLR checks for a <bindingRedirect> tag in the application's configuration file, similar to this example:

<?xml version="1.0"?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="MyCalledAssembly" publicKeyToken="0556152c9715d60f" />
        <bindingRedirect oldVersion="" newVersion="" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>

This file tells the CLR to load the version of MyCalledAssembly named in newVersion, even though the application was compiled to use the version named in oldVersion.

In the publisher policy resolution stage, the CLR checks for a policy file distributed with the assembly itself. A publisher policy file starts as an XML file, very similar to an application configuration file:

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="MyCalledAssembly" publicKeyToken="0556152c9715d60f" />
        <bindingRedirect oldVersion="" newVersion="" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>

This particular publisher policy file tells the CLR to use the version of the assembly named in newVersion to satisfy requests for the version named in oldVersion. Before a publisher policy file can take effect, it must be compiled, using the al.exe tool:

al /link:policy.1.0.MyCalledAssembly.config /out:policy.1.0.MyCalledAssembly.dll /keyfile:..\..\MyKeyFile.snk
The compiled publisher policy file can then be installed in the GAC along with the assembly to which it refers. Publisher policy files generally override application policy. However, you can force your application to ignore a publisher policy file by adding <publisherPolicy apply="no"/> to your application configuration file. The final stage in applying policy rules is administrator policy resolution. The administrator of a computer can specify system-wide binding rules in the machine.config file. These rules override both application policy and publisher policy. These machine.config settings use the same format as the binding policy settings in the application configuration file.
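The three stages can be pictured as a chain of version lookups, each working on the result of the previous one. The sketch below is a deliberately simplified model of that chain and not the CLR's implementation: it handles only exact old-to-new mappings (no version ranges), and every name in it (PolicySketch, Resolve, the dictionary parameters) is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class PolicySketch
{
    // Looks up a redirect for the requested version; returns it unchanged if none exists.
    private static Version Redirect(Version requested, IDictionary<Version, Version> policy)
    {
        Version redirected;
        if (policy != null && policy.TryGetValue(requested, out redirected))
        {
            return redirected;
        }
        return requested;
    }

    // Applies application policy, then publisher policy (unless the application
    // opted out with <publisherPolicy apply="no"/>), then administrator policy.
    public static Version Resolve(
        Version compiledAgainst,
        IDictionary<Version, Version> applicationPolicy,
        IDictionary<Version, Version> publisherPolicy,
        IDictionary<Version, Version> administratorPolicy,
        bool applyPublisherPolicy)
    {
        Version version = Redirect(compiledAgainst, applicationPolicy);
        if (applyPublisherPolicy)
        {
            version = Redirect(version, publisherPolicy);
        }
        return Redirect(version, administratorPolicy);
    }
}
```

Because the administrator stage runs last, a redirect in machine.config has the final word, which is one way to read the statement that administrator rules override the other two levels.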

GUI to the Rescue
You can configure both application binding policy and administrator binding policy without editing XML files by hand. These are among the tasks that the .NET Framework Configuration tool (which you can launch from Start > Programs > Administrative Tools) can perform. If you launch this tool and expand the tree, as shown in Figure 1, you'll find two places where you can work with configured assemblies.

Figure 1: Working with configured assemblies in the Microsoft .NET Framework Configuration Tool

A configured assembly is simply one that has an explicit binding policy. The Configured Assemblies node directly beneath My Computer lets you set administrator policy for configured assemblies. The other nodes, beneath specific applications, let you set application policy for configured assemblies. To work with a specific application in this tool, click on the Applications node, select "Add an Application to Configure," and follow the instructions to add your application to the tree. Whether you're working at the application or the administrator level, the procedures here are the same. Click on one of the Configured Assemblies nodes, and you'll have two choices. The first choice allows you to add a new configured assembly to the list for the application or computer. The second allows you to view a list of all configured assemblies. From the list, you can double-click an assembly to set its properties. Figure 2 shows the properties dialog box.

Figure 2: Setting policy properties for a configured assembly

The properties dialog box has three tabs:

• The General tab shows the basic identifying information for the assembly and (for an application policy) allows you to override any publisher policy.
• The Binding Policy tab lets you specify the mapping between the version that an application requests and the version that the CLR actually delivers.
• The Codebases tab lets you tell the CLR where to find particular versions of the assembly.

Rules of Thumb
As with some other areas of the .NET Framework, binding resolution offers an overwhelming number of options. I'll leave you with some thoughts on ways in which you might choose to use these options.

• If you don't have any compelling reason to get into the process, don't. Just messing around for the sake of messing around won't help you out.
• Publisher policy files are for publishers. The best time to use them is when you're issuing a service pack for a component, and want to make sure that people get the benefits of your fixes. But remember, the application developer can still override your policy file. On the other hand, as an application developer, don't override publisher policy unless you've got a specific reason.
• The prime reason for an application-level binding policy is to provide upgrades for an application that's already deployed. When a new version of a shared component comes out that your application can benefit from, an application policy file will allow you to provide these benefits without recompiling or replacing existing installations.
• Finally, as an administrator, you may make rare use of an administrator policy file to help keep a machine in a known good state. If there's a shared network component with a bug, for example, you can use an administrator policy file to make sure that the bug-fixed version is uniformly used by all of the .NET applications on the box.

Mike Gunderloy is the lead developer for Larkware and author of numerous books and articles on programming topics.

Serialization is the process of converting the state of an object into a form that can be persisted or transported. The complement of serialization is deserialization, which converts a stream into an object. Together, these processes allow data to be easily stored and transferred. The .NET Framework features two serializing technologies:

• Binary serialization preserves type fidelity, which is useful for preserving the state of an object between different invocations of an application. For example, you can share an object between different applications by serializing it to the Clipboard. You can serialize an object to a stream, to a disk, to memory, over the network, and so forth. Remoting uses serialization to pass objects "by value" from one computer or application domain to another.

• XML serialization serializes only public properties and fields and does not preserve type fidelity. This is useful when you want to provide or consume data without restricting the application that uses the data. Because XML is an open standard, it is an attractive choice for sharing data across the Web. SOAP is likewise an open standard, which makes it an attractive choice.

Binary Serialization
The easiest way to make a class serializable is to mark it with the Serializable attribute as follows.

[Serializable]
public class MyObject {
  public int n1 = 0;
  public int n2 = 0;
  public String str = null;
}

The code example below shows how an instance of this class can be serialized to a file.

MyObject obj = new MyObject();
obj.n1 = 1;
obj.n2 = 24;
obj.str = "Some String";
IFormatter formatter = new BinaryFormatter();
Stream stream = new FileStream("MyFile.bin", FileMode.Create, FileAccess.Write, FileShare.None);
formatter.Serialize(stream, obj);
// Close the stream so the file is flushed and released.
stream.Close();
This example uses a binary formatter to do the serialization. All you need to do is create an instance of the stream and the formatter you intend to use, and then call the Serialize method on the formatter. The stream and the object to serialize are provided as parameters to this call. Although it is not explicitly demonstrated in this example, all member variables of a class will be serialized, even variables marked as private. In this aspect, binary serialization differs from the XmlSerializer class, which only serializes public fields and properties. For information on excluding member variables from binary serialization, see Selective Serialization.

Restoring the object back to its former state is just as easy. First, create a stream for reading and a formatter, and then instruct the formatter to deserialize the object. The code example below shows how this is done.


IFormatter formatter = new BinaryFormatter();
Stream stream = new FileStream("MyFile.bin", FileMode.Open, FileAccess.Read, FileShare.Read);
MyObject obj = (MyObject) formatter.Deserialize(stream);
stream.Close();

// Here's the proof.
Console.WriteLine("n1: {0}", obj.n1);
Console.WriteLine("n2: {0}", obj.n2);
Console.WriteLine("str: {0}", obj.str);

The BinaryFormatter used above is very efficient and produces a compact byte stream. All objects serialized with this formatter can also be deserialized with it, which makes it an ideal tool for serializing objects that will be deserialized on the .NET Framework. It is important to note that constructors are not called when an object is deserialized. This constraint is placed on deserialization for performance reasons. However, this violates some of the usual contracts the runtime makes with the object writer, and developers should ensure that they understand the ramifications when marking an object as serializable.

If portability is a requirement, use the SoapFormatter instead. Simply replace the BinaryFormatter in the code above with SoapFormatter, and call Serialize and Deserialize as before. This formatter produces the following output for the example used above.

<SOAP-ENV:Envelope xmlns:xsi="" xmlns:xsd="" xmlns:SOAP-ENC="" xmlns:SOAP-ENV="" SOAP-ENV:encodingStyle="" xmlns:a1="">
  <SOAP-ENV:Body>
    <a1:MyObject id="ref-1">
      <n1>1</n1>
      <n2>24</n2>
      <str id="ref-3">Some String</str>
    </a1:MyObject>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
It is important to note that the Serializable attribute cannot be inherited. If you derive a new class from MyObject, the new class must be marked with the attribute as well, or it cannot be serialized. For example, when you attempt to serialize an instance of the class below, you will get a SerializationException informing you that the MyStuff type is not marked as serializable.

public class MyStuff : MyObject {
  public int n3;
}
Using the Serializable attribute is convenient, but it has limitations as demonstrated above. Refer to the Serialization Guidelines for information about when you should mark a class for serialization; serialization cannot be added to a class after it has been compiled.


Selective Serialization
A class often contains fields that should not be serialized. For example, assume a class stores a thread ID in a member variable. When the class is deserialized, the thread whose ID was stored when the class was serialized might no longer be running, so serializing this value does not make sense.

You can prevent member variables from being serialized by marking them with the NonSerialized attribute as follows.

[Serializable]
public class MyObject {
  public int n1;
  [NonSerialized] public int n2;
  public String str;
}
If possible, make an object that could contain security-sensitive data nonserializable. If the object must be serialized, apply the NonSerialized attribute to specific fields that store sensitive data. If you do not exclude these fields from serialization, be aware that the data they store will be exposed to any code that has permission to serialize. For more information about writing secure serialization code, see Security and Serialization.


XML and SOAP Serialization
XML serialization serializes only the public fields and property values of an object into an XML stream. XML serialization does not include type information. For example, if you have a Book object that exists in the Library namespace, there is no guarantee that it will be deserialized into an object of the same type.

To serialize an object
1. Create the object and set its public fields and properties.
2. Construct an XmlSerializer using the type of the object. For more information, see the XmlSerializer class constructors.
3. Call the Serialize method to generate either an XML stream or a file representation of the object's public properties and fields. The following example creates a file.

MySerializableClass myObject = new MySerializableClass();
// Insert code to set properties and fields of the object.
XmlSerializer mySerializer = new XmlSerializer(typeof(MySerializableClass));
// To write to a file, create a StreamWriter object.
StreamWriter myWriter = new StreamWriter("myFileName.xml");
mySerializer.Serialize(myWriter, myObject);
myWriter.Close();
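The point that XML serialization covers only public fields and properties can be checked with an in-memory round trip. The following is a minimal sketch that writes to a StringWriter instead of a file. The Book class and the XmlRoundTrip helper are made up for this example, borrowing the Book name from the discussion above.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// XmlSerializer requires a public type with a public parameterless constructor.
public class Book
{
    public string Title;
    public int Pages;

    // Private state: not written to the XML stream.
    private string note = "unset";
    public void SetNote(string value) { note = value; }
    public string GetNote() { return note; }
}

public static class XmlRoundTrip
{
    // Serializes the public fields of a Book into an XML string.
    public static string ToXml(Book book)
    {
        var serializer = new XmlSerializer(typeof(Book));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, book);
            return writer.ToString();
        }
    }

    // Rebuilds a Book from XML produced by ToXml.
    public static Book FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Book));
        using (var reader = new StringReader(xml))
        {
            return (Book)serializer.Deserialize(reader);
        }
    }
}
```

After a round trip, Title and Pages are restored, while the private note field falls back to the value assigned by the class's own initializer, because it never reached the XML.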
