 An Overview of MSIL
The .NET architecture addresses an important need: language interoperability. Instead of generating native code that is specific to one platform, programming languages can generate code in MSIL (Microsoft Intermediate Language) targeting the Common Language Runtime (CLR) to reap the rich benefits provided by .NET.
Advanced programmers occasionally peek into MSIL code (using the Ildasm tool) when they are in doubt about what is happening under the hood. Therefore, it is essential that the C# programmer understands the basics of MSIL. This beginner-level article gives an overview of MSIL and of debugging with the Ildasm tool.
System Requirements
The programming examples in this article use C# as the source language for generating MSIL code, so the reader is expected to have a basic understanding of C#. No prior exposure to MSIL is necessary. In addition, the reader is assumed to know what a stack data structure is. It is preferable that the reader has access to the Ildasm tool and the C# compiler.
 Article Structure
The article has three main sections:
An Overview of MSIL: Explains the basics of MSIL, the data types, the instruction types, and the way the instructions are executed.
Examining MSIL: Covers MSIL using simple example programs.
Debugging Using the Ildasm tool: Explains the use of the intermediate language disassembler (Ildasm) and the way it can be used for debugging.
.NET supports several high-level languages such as C#, VB.NET and Managed C++. MSIL is designed to accommodate a wide range of languages. In .NET, the unit of deployment is the PE (Portable Executable) file, a predefined binary standard (similar to the class files of Java). MSIL, along with metadata, is stored inside the PE files generated by the compiler. Metadata describes the types (their definitions, signatures, and so on) in a form that is useful at runtime. MSIL itself is such a simple language that it doesn't require much effort to understand.
An Overview of MSIL
MSIL is a CPU-independent, stack-based instruction set that can be efficiently converted to the native code of a specific platform. In this stack-based approach, the representation assumes the presence of a run-time stack, and the code is generated with that stack in mind. The runtime environment may use the stack for the evaluation of expressions and store the intermediate values on the stack itself. Such an evaluation using a runtime stack is a form of interpretation. In practice, MSIL is not interpreted: a Just-In-Time (JIT) compiler translates the intermediate code to native code for the particular platform at runtime. Stack-based code facilitates maximum portability across platforms and is easy to verify.
MSIL:
Supports object-oriented programming.
Works in terms of the data types available in the .NET Framework, for example, System.String and System.Int32.
Has instructions that can be classified into various types such as: loading (ld*), storing (st*), method invocation, arithmetic operations, logical operations, control flow, memory allocation, and exception handling.
The following section covers basic instructions using examples.
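The push-and-pop discipline described above can be illustrated with a toy model. The sketch below is only an analogy written in C#: the class StackSketch and its Evaluate method are hypothetical names, and the real CLR JIT-compiles MSIL to native code rather than running a Stack<int> like this.

```csharp
using System;
using System.Collections.Generic;

// Toy model of stack-based evaluation; NOT how the CLR is implemented.
public static class StackSketch
{
    // Evaluates (a + b) * c the way stack-based code would:
    // operands are pushed, and each operation pops its inputs
    // and pushes its result back.
    public static int Evaluate(int a, int b, int c)
    {
        var stack = new Stack<int>();
        stack.Push(a);                          // load a
        stack.Push(b);                          // load b
        stack.Push(stack.Pop() + stack.Pop());  // add: pops two, pushes the sum
        stack.Push(c);                          // load c
        stack.Push(stack.Pop() * stack.Pop());  // mul: pops two, pushes the product
        return stack.Pop();                     // the result is the only item left
    }

    public static void Main()
    {
        Console.WriteLine(Evaluate(2, 3, 4));   // (2 + 3) * 4
    }
}
```

Intermediate values never need names: they live on the stack until the next operation consumes them, which is the property the MSIL examples rely on.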
Examining MSIL
Let us start with the following simple C# code, and see how it is compiled to intermediate code.
Console.WriteLine("hello world");
The MSIL code looks like this (using the Ildasm tool that is discussed later).
// disassembled code using ildasm tool
ldstr "hello world"
call void [mscorlib]System.Console::WriteLine(string)
Now let us examine how it works:
The ldstr (standing for 'load string') instruction indicates that the string constant "hello world" is to be pushed onto the evaluation stack.
The call instruction is for calling a method. Here, the call is made to the static WriteLine method of the Console class, which is in the System namespace and is available in mscorlib.dll. The WriteLine method takes a string as its argument and its return type is void.
It executes as follows:
The ldstr instruction pushes a reference to the constant "hello world" onto the stack.
The call instruction calls the WriteLine method, which expects a string argument and pops it from the stack. Now the stack contains nothing. The WriteLine method then executes, prints the message "hello world" on the screen, and returns.
As you can see, understanding MSIL code is far from difficult! If you have prior exposure to any assembly language, it will be very easy for you to learn MSIL.
From this simple program, let us move on to a program illustrating branching and arithmetic instructions.
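As a side note before the next program: readers who want to experiment with the ldstr/call pair can emit the same two instructions at runtime through System.Reflection.Emit. This is a minimal sketch rather than anything the article's toolchain requires; the class name EmitHello and the method name "Hello" are made up for illustration, and a trailing ret is added because every method body must return.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class EmitHello
{
    // Builds a method whose body is: ldstr "hello world" / call WriteLine(string) / ret
    public static Action Build()
    {
        var method = new DynamicMethod("Hello", typeof(void), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        // Push a reference to the string constant onto the evaluation stack.
        il.Emit(OpCodes.Ldstr, "hello world");

        // Call the static Console.WriteLine(string); it pops its argument.
        MethodInfo writeLine =
            typeof(Console).GetMethod("WriteLine", new[] { typeof(string) });
        il.Emit(OpCodes.Call, writeLine);

        // The stack is empty again, so the method can return.
        il.Emit(OpCodes.Ret);

        return (Action)method.CreateDelegate(typeof(Action));
    }

    public static void Main()
    {
        Build()();
    }
}
```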
// C# source code
int i = 10;
if (i != 20)
    i = i * 20;
Console.WriteLine(i);

// disassembled MSIL code using ildasm tool
IL_0000: ldc.i4.s 10
IL_0002: stloc.0
IL_0003: ldloc.0
IL_0004: ldc.i4.s 20
IL_0006: beq.s IL_000d
IL_0008: ldloc.0
IL_0009: ldc.i4.s 20
IL_000b: mul
IL_000c: stloc.0
IL_000d: ldloc.0
IL_000e: call void [mscorlib]System.Console::WriteLine(int32)
Quite a lot of MSIL code has been generated for this small piece of C#, but it is easy to follow once you understand what the instructions do. Each instruction is preceded by a label of the form IL_xxxx: so that it is possible to 'jump' from one part of the code to another.
The ldc.i4.s (stands for 'load constant'.'four byte integer'.'single byte argument') instruction pushes the integer constant 10 onto the stack.
The stloc.0 (stands for 'store in location'.'zeroth variable') instruction pops the integer constant 10 from the stack and stores it in variable number 0 (local variables are identified by counting them from 0).
The ldloc.0 (stands for 'load from location'.'zeroth variable') instruction loads the value of the variable at location zero (i.e. variable i in the source code) and pushes it onto the stack.
The ldc.i4.s instruction pushes the integer constant 20 onto the stack.
The beq.s (stands for 'branch if equal to'.'single byte argument') instruction pops two items from the stack and checks whether they are equal; if so, it transfers control to the instruction at the location identified by the label IL_000d. Note that the compiler has inverted the source condition: the multiplication is skipped when i equals 20.
The ldloc.0 instruction pushes the value of variable i onto the stack.
The ldc.i4.s instruction pushes the integer constant 20 onto the stack.
The mul (stands for 'multiply') instruction pops two items from the stack, multiplies the values, and pushes the result back onto the stack. Now the result of the multiplication is at the top of the stack.
The stloc.0 instruction pops the top value from the stack (the result of the multiplication in this case) and stores it in variable i.
The ldloc.0 instruction pushes the value of i onto the stack.
The call (stands for 'call the method') instruction calls the WriteLine method that takes an integer as an argument. The WriteLine method pops the value from the stack and displays it on the screen.
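The walkthrough above can be reproduced instruction for instruction with System.Reflection.Emit. The following is a sketch under a few assumptions: the names EmitBranch and "Branch" are hypothetical, the label variable skip plays the role of IL_000d, and a final ret is added because the disassembly shown in the article stops at the call.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class EmitBranch
{
    public static Action Build()
    {
        var method = new DynamicMethod("Branch", typeof(void), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.DeclareLocal(typeof(int));          // local variable 0, i.e. i
        Label skip = il.DefineLabel();         // stands in for IL_000d

        il.Emit(OpCodes.Ldc_I4_S, (sbyte)10);  // push 10
        il.Emit(OpCodes.Stloc_0);              // i = 10
        il.Emit(OpCodes.Ldloc_0);              // push i
        il.Emit(OpCodes.Ldc_I4_S, (sbyte)20);  // push 20
        il.Emit(OpCodes.Beq_S, skip);          // if i == 20, skip the multiplication
        il.Emit(OpCodes.Ldloc_0);              // push i
        il.Emit(OpCodes.Ldc_I4_S, (sbyte)20);  // push 20
        il.Emit(OpCodes.Mul);                  // pop two values, push i * 20
        il.Emit(OpCodes.Stloc_0);              // i = i * 20
        il.MarkLabel(skip);
        il.Emit(OpCodes.Ldloc_0);              // push i as the argument

        MethodInfo writeLine =
            typeof(Console).GetMethod("WriteLine", new[] { typeof(int) });
        il.Emit(OpCodes.Call, writeLine);      // Console.WriteLine(int)
        il.Emit(OpCodes.Ret);

        return (Action)method.CreateDelegate(typeof(Action));
    }

    public static void Main()
    {
        Build()();  // i starts at 10, which is not 20, so 10 * 20 is printed
    }
}
```

Changing the first constant from 10 to 20 makes beq.s take the branch, so the multiplication is skipped and 20 is printed instead.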
Debugging Using ILDASM Tool
Microsoft's .NET SDK ships with an IL disassembler, Ildasm.exe (usually located in the directory \Program Files\Microsoft.Net\FrameworkSDK\Bin). A disassembler loads your assemblies and shows the MSIL code along with other details in the assembly.
This tool can be handy in debugging code once you become proficient at understanding MSIL code.
How can MSIL help in debugging?
Bugs happen in code when there is a mismatch between what we expect the code to do and what the code actually does. If we can dig down to a lower level and see what the machine is actually doing with our code, it is easier to spot the mismatch. That is the idea behind using Ildasm for debugging. Let us look at an example and see how we can debug the code. The following innocent-looking code doesn't work as you'd expect: it doesn't print "yes, o1 == o2", even though the code looks straightforward.
int i = 10;
object o1 = i, o2 = i;
if (o1 == o2)
    Console.WriteLine("yes, o1 == o2");
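As background, and not as the article's own analysis: assigning an int to an object variable boxes the value, so o1 and o2 refer to two separate boxed copies, and == on operands of type object compares references rather than values. In the disassembly this appears as two box instructions, one per assignment. The sketch below makes the difference observable; the class BoxingSketch and its helper methods are hypothetical names.

```csharp
using System;

public static class BoxingSketch
{
    // == on two object operands is a reference comparison.
    public static bool SameReference(object a, object b)
    {
        return a == b;
    }

    // Equals is virtual, so the boxed Int32 compares by value.
    public static bool SameValue(object a, object b)
    {
        return a.Equals(b);
    }

    public static void Main()
    {
        int i = 10;
        object o1 = i, o2 = i;  // each assignment boxes i into its own object

        Console.WriteLine(SameReference(o1, o2));  // prints False
        Console.WriteLine(SameValue(o1, o2));      // prints True
    }
}
```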