
Q1. What is the difference between weakly typed and strongly typed languages?

How does a compiler handle type checking and type conversion?

Ans :- A strongly typed language does not allow you to use one type as another. In C, by contrast, you can often pass a data element of the wrong type and the compiler will not complain. Strong typing generally means that variables have a well-defined type and that there are strict rules about combining variables of different types in expressions. For example, if A is an integer and B is a float, then the strict rule about A+B might be that A is cast to a float and the result returned as a float. If A is an integer and B is a string, then the strict rule might be that A+B is not valid. Strongly typed means that a value will not be automatically converted from one type to another.

Weak typing implies that the compiler does not enforce a typing discipline, or perhaps that enforcement can easily be subverted. Weakly typed is the opposite of strongly typed: Perl can use a string like "123" in a numeric context by automatically converting it into the integer 123.

Differences between Strongly Typed and Weakly Typed Languages:
The two terms are used in several different, and sometimes mutually inconsistent, senses:
1. A language is strongly typed if type annotations are associated with variable names, rather than with values. If types are attached to values, it is weakly typed.
2. A language is strongly typed if it contains compile-time checks for type constraint violations. If checking is deferred to run time, it is weakly typed.
3. A language is strongly typed if there are compile-time or run-time checks for type constraint violations. If no checking is done, it is weakly typed.
4. A language is strongly typed if conversions between different types are forbidden. If such conversions are allowed, it is weakly typed.
5. A language is strongly typed if conversions between different types must be indicated explicitly. If implicit conversions are performed, it is weakly typed.
6. A language is strongly typed if there is no language-level way to disable or evade the type system. If there are casts or other type-evasive mechanisms, it is weakly typed.
7. A language is strongly typed if it has a complex, fine-grained type system with compound types. If it has only a few types, or only scalar types, it is weakly typed.
8. A language is strongly typed if the type of its data objects is fixed and does not vary over the lifetime of the object. If the type of a datum can change, the language is weakly typed.
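As a small illustration of the A+B rules described above, the sketch below shows how one language, Python, treats mixed-type expressions: an integer and a float are combined by widening the integer, while an integer and a string are rejected unless the conversion is written explicitly. Python is used here only as an example language, and the helper name add_checked is made up for this sketch.

```python
def add_checked(a, b):
    """Add two values, reporting a type mismatch instead of converting silently."""
    try:
        return a + b
    except TypeError:
        # Python does not coerce between unrelated types such as int and str.
        return "type error"


int_plus_float = add_checked(1, 2.5)     # the int is widened to a float
int_plus_str = add_checked(1, "123")     # rejected: unrelated types
explicit = add_checked(1, int("123"))    # an explicit conversion is allowed

print(int_plus_float, int_plus_str, explicit)
```

A weakly typed language in the sense of definition 5 would instead convert the string "123" to a number on its own.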

Weak versus strong
The main difference, roughly speaking, between a strongly typed language and a weakly typed one is that a weakly typed one makes conversions between unrelated types implicitly, while a strongly typed one typically disallows implicit conversions between unrelated types.

Furthermore, a strongly typed language requires an explicit conversion (by using the cast operator) between related types when there is a possibility of data loss, while a weakly typed one would carry out the conversion regardless.

Strongly typed means that the programmer provides the data type while declaring the variables, e.g. in Java the variables are declared as int, short, long etc. Weakly typed means that the programmer doesn't need to take care of data types while declaring variables, e.g. in Visual Basic any variable can be declared as "Dim a As Variant".

Handling of Type checking and type conversion in a compiler

Type checking
Type checking can occur at either compile time or run time. Statically typed languages, such as C++ and Java, do type checking at compile time. Dynamically typed languages, such as Smalltalk and Python, handle type checking at run time. As a dynamically typed language, ActionScript 3.0 has run-time type checking, but also supports compile-time type checking with a special compiler mode called strict mode. In strict mode, type checking occurs at both compile time and run time, but in standard mode, type checking occurs only at run time.

Dynamically typed languages offer tremendous flexibility when you structure your code, but at the cost of allowing type errors to manifest at run time. Statically typed languages report type errors at compile time, but at the cost of requiring that type information be known at compile time.

Compile-time type checking
Compile-time type checking is often favored in larger projects because as the size of a project grows, data type flexibility usually becomes less important than catching type errors as early as possible.

Run-time type checking
Run-time type checking occurs in ActionScript 3.0 whether you compile in strict mode or standard mode. Consider a situation in which the value 3 is passed as an argument to a function that expects an array. In strict mode, the compiler will generate an error, because the value 3 is not compatible with the data type Array. If you disable strict mode, and run in standard mode, the compiler does not complain about the type mismatch, but run-time type checking results in a run-time error.
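The situation described above, where the value 3 reaches a function that expects an array, can be imitated in Python, which checks types only at run time. This is a minimal sketch: the function total_length and its explicit isinstance check are assumptions made for illustration, not something taken from ActionScript.

```python
def total_length(items):
    """Return the number of elements; the argument is expected to be a list."""
    if not isinstance(items, list):
        # Run-time type check: the mismatch is detected only when the call executes.
        raise TypeError(f"expected list, got {type(items).__name__}")
    return len(items)


ok = total_length([1, 2, 3])

try:
    total_length(3)      # nothing rejects this call before the program runs
    caught = None
except TypeError as err:
    caught = str(err)

print(ok, caught)
```

A compile-time checker would have rejected the second call before the program was ever started, which is the trade-off between the two approaches.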

Type conversions
A type conversion is said to occur when a value is transformed into a value of a different data type. Type conversions can be either implicit or explicit. Implicit conversion, which is also called coercion, is sometimes performed at run time. For example, if the value 2 is assigned to a variable of the Boolean data type, the value 2 is converted to the Boolean value true before assigning the value to the variable. Explicit conversion, which is also called casting, occurs when your code instructs the compiler to treat a variable of one data type as if it belongs to a different data type. When primitive values are involved, casting actually converts values from one data type to another. To cast an object to a different type, you wrap the object name in parentheses and precede it with the name of the new type. For example, the following code takes a Boolean value and casts it to an integer:

    var myBoolean:Boolean = true;
    var myINT:int = int(myBoolean);
    trace(myINT); // 1

Implicit conversions
Implicit conversions happen at run time in a number of contexts:
• In assignment statements
• When values are passed as function arguments
• When values are returned from functions
• In expressions using certain operators, such as the addition (+) operator

For user-defined types, implicit conversions succeed when the value to be converted is an instance of the destination class or a class that derives from the destination class. If an implicit conversion is unsuccessful, an error occurs. For example, the following code contains a successful implicit conversion and an unsuccessful implicit conversion:

    class A {}
    class B extends A {}
    var objA:A = new A();
    var objB:B = new B();
    var arr:Array = new Array();
    objA = objB; // Conversion succeeds.
    objB = arr; // Conversion fails.

For primitive types, implicit conversions are handled by calling the same internal conversion algorithms that are called by the explicit conversion functions.
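The same two ideas, casting a primitive value and accepting an instance of a derived class where the base class is expected, can be sketched in Python. Python variables have no declared types, so the helper assign_as below is an invented stand-in for the check that is performed on assignment in the ActionScript example.

```python
class A:
    pass


class B(A):
    pass


def assign_as(target_type, value):
    """Return value if it is an instance of target_type (or of a subclass), else raise."""
    if not isinstance(value, target_type):
        raise TypeError(
            f"cannot convert {type(value).__name__} to {target_type.__name__}"
        )
    return value


my_int = int(True)             # explicit conversion of a Boolean to an integer
obj_a = assign_as(A, B())      # succeeds: B derives from A

try:
    obj_b = assign_as(B, [])   # fails: a list is not a B
    failure = None
except TypeError as err:
    failure = str(err)

print(my_int, type(obj_a).__name__, failure)
```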

Explicit conversions
It's helpful to use explicit conversions, or casting, when you compile in strict mode, because there may be times when you do not want a type mismatch to generate a compile-time error. This may be the case when you know that coercion will convert your values correctly at run time. For example, when working with data received from a form, you may want to rely on coercion to convert certain string values to numeric values. The following code generates a compile-time error even though the code would run correctly in standard mode:

    var quantityField:String = "3";
    var quantity:int = quantityField; // compile time error in strict mode

If you want to continue using strict mode, but would like the string converted to an integer, you can use explicit conversion, as follows:

    var quantityField:String = "3";
    var quantity:int = int(quantityField); // Explicit conversion succeeds.

Q2. Discuss the process of code generation? Also elaborate on the problems faced in this process?

Ans:- Code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form (e.g., machine code) that can be readily executed by a machine (often a computer).

The input to the code generator typically consists of a parse tree or an abstract syntax tree. The tree is converted into a linear sequence of instructions, usually in an intermediate language such as three address code. Further stages of compilation may or may not be referred to as "code generation", depending on whether they involve a significant change in the representation of the program. (For example, a peephole optimization pass would not likely be called "code generation", although a code generator might incorporate a peephole optimization pass.)

Major tasks in code generation
Tasks which are typically part of a sophisticated compiler's "code generation" phase include:
• Instruction selection: which instructions to use.
• Instruction scheduling: in which order to put those instructions. Scheduling is a speed optimization that can have a critical effect on pipelined machines.
• Register allocation: the allocation of variables to processor registers.
• Debug data generation if required so the code can be debugged.
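How a tree is flattened into a linear sequence of three-address instructions can be sketched as follows. The tuple-based tree encoding, the temporary names t1, t2, ... and the function gen are assumptions of this sketch and are not taken from any particular compiler.

```python
import itertools


def gen(node, code, temps):
    """Emit three-address code for an expression tree; return the name holding its value.

    A node is either a string (a variable or constant) or a tuple (op, left, right).
    """
    if isinstance(node, str):
        return node                       # leaves need no instruction
    op, left, right = node
    left_name = gen(left, code, temps)    # post-order: operands are generated first
    right_name = gen(right, code, temps)
    temp = f"t{next(temps)}"
    code.append(f"{temp} := {left_name} {op} {right_name}")
    return temp


tree = ("+", ("*", "a", "b"), ("*", "c", "d"))   # a*b + c*d
code = []
result = gen(tree, code, itertools.count(1))
print("\n".join(code))
```

Each interior node of the tree yields exactly one instruction, so the length of the output grows linearly with the size of the tree.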

In a compiler that uses an intermediate language, there may be two instruction selection stages: one to convert the parse tree into intermediate code, and a second phase much later to convert the intermediate code into instructions from the instruction set of the target machine. This second phase does not require a tree traversal; it can be done linearly, and typically involves a simple replacement of intermediate-language operations with their corresponding opcodes. However, this may not hold if the compiler is actually a language translator.

Runtime code generation
When code generation occurs at runtime, as in just-in-time compilation (JIT), it is important that the entire process be efficient with respect to space and time. For example, when regular expressions are interpreted and used to generate code at runtime, a nondeterministic finite state machine is often generated instead of a deterministic one, because usually the former can be created more quickly and occupies less memory space than the latter. Despite generally producing less efficient code, JIT code generation can take advantage of profiling information that is available only at runtime.

CODE GENERATION
The final phase in our compiler model is the code generator. It takes as input an intermediate representation of the source program and produces as output an equivalent target program.

The requirements traditionally imposed on a code generator are severe. The output code must be correct and of high quality, meaning that it should make effective use of the resources of the target machine. Moreover, the code generator itself should run efficiently.

ISSUES IN THE DESIGN OF A CODE GENERATOR
While the details are dependent on the target language and the operating system, issues such as memory management, instruction selection, register allocation, and evaluation order are inherent in almost all code generation problems.

Input to the Code Generator
The input to the code generator consists of the intermediate representation of the source program produced by the front end, together with information in the symbol table that is used to determine the run time addresses of the data objects denoted by the names in the intermediate representation.

There are several choices for the intermediate language, including: linear representations such as postfix notation, three address representations such as quadruples, virtual machine representations such as stack machine code, and graphical representations such as syntax trees and dags.
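The second, linear instruction selection phase mentioned above can be pictured as a table lookup over quadruples. This is only a sketch: the opcode names (LOAD, ADD, SUB, MUL, STORE), the single-accumulator target machine and the OPCODES table are hypothetical and chosen purely for illustration.

```python
# Hypothetical mapping from intermediate-language operators to target opcodes.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def select_instructions(quads):
    """Translate quadruples (op, arg1, arg2, result) one by one, with no tree traversal."""
    out = []
    for op, arg1, arg2, result in quads:
        out.append(f"LOAD {arg1}")            # bring the first operand into the accumulator
        out.append(f"{OPCODES[op]} {arg2}")   # apply the operation to the second operand
        out.append(f"STORE {result}")         # write the result back to memory
    return out


quads = [("*", "a", "b", "t1"), ("+", "t1", "c", "t2")]
asm = select_instructions(quads)
print("\n".join(asm))
```

Note that the output stores t1 and immediately reloads it; removing such redundant pairs is the kind of work a later peephole optimization pass would do.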

Target Programs
The output of the code generator is the target program. The output may take on a variety of forms: absolute machine language, relocatable machine language, or assembly language. Producing an absolute machine language program as output has the advantage that it can be placed in a location in memory and immediately executed. A small program can be compiled and executed quickly.

Memory Management
Mapping names in the source program to addresses of data objects in run time memory is done cooperatively by the front end and the code generator. We assume that a name in a three-address statement refers to a symbol table entry for the name.

Instruction Selection
The nature of the instruction set of the target machine determines the difficulty of instruction selection. The uniformity and completeness of the instruction set are important factors. If the target machine does not support each data type in a uniform manner, then each exception to the general rule requires special handling.

Register Allocation
Instructions involving register operands are usually shorter and faster than those involving operands in memory. Therefore, efficient utilization of registers is particularly important in generating good code. The use of registers is often subdivided into two subproblems:
1. During register allocation, we select the set of variables that will reside in registers at a point in the program.
2. During a subsequent register assignment phase, we pick the specific register that a variable will reside in.

Q 3. Discuss the techniques that can be used to reduce the size or running time of a program?

The following are techniques for reducing the amount of data space required to represent objects in object-oriented programs. These techniques optimize the representation of both the programmer-defined fields within each object and the header information used by the run-time system:

Field Reduction: Once the bit width analysis has determined the range of values that a field can hold, the compiler transforms the program to reduce the size of the field to the smallest type capable of storing that range of values.
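Field reduction can be pictured as choosing the narrowest integer type that covers a field's value range. In this sketch the list of candidate types and the function smallest_type are assumptions; the range itself would come from the compiler's analysis.

```python
# Candidate signed integer types as (name, size in bytes), narrowest first.
TYPES = [("byte", 1), ("short", 2), ("int", 4), ("long", 8)]


def smallest_type(lo, hi):
    """Return the narrowest signed type able to hold every value in [lo, hi]."""
    for name, size in TYPES:
        bits = 8 * size
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return name
    raise ValueError("range does not fit in any candidate type")


print(smallest_type(0, 100), smallest_type(-1, 40000), smallest_type(0, 2 ** 40))
```

A field declared as a 4-byte int that only ever holds values from 0 to 100 could therefore be stored in a single byte.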

Unread and Constant Field Elimination: If the bit width analysis finds that a field always holds the same constant value, the compiler eliminates the field. It removes each write to the field, and replaces each read with the constant value. Fields without executable reads are also removed.

Static Specialization: The analysis finds classes with fields whose values do not change after initialization, even though different instances of the object may have different values for these fields. It then generates specialized versions of each class which omit these fields, substituting accessor methods which return constant values.

Field Externalization: The analysis uses profiling to find fields that almost always have the same default value. It then removes these fields from their enclosing class, using a hash table to store only values of the field that differ from the default value. It replaces writes to the field with an insertion into the hash table (if the written value is not the default value) or a removal from the hash table (if the written value is the default value). It replaces reads with hash table lookups; if the object is not present in the hash table, the lookup simply returns the default value.

Class Pointer Compression: Each object header holds a reference, commonly called claz, which contains a pointer to the class data for that object, such as inheritance information and method dispatch tables. Rapid type analysis is used to compute an upper bound on the number of classes that the program may instantiate. The compiler uses the results of the analysis to replace the reference with a smaller offset into a table of pointers to the class data.

Byte Packing: All of the above transformations may reduce or eliminate the amount of space required to store each field in the object or object header. The byte packing algorithm arranges the fields in the object to minimize the object size.

Q4. Define a regular expression? How can it be converted into Finite Automata? Explain with the help of example.

Ans:- A regular expression is a set of pattern matching rules encoded in a string according to certain syntax rules. A regular expression provides a concise and flexible means to "match" (specify and recognize) strings of text, such as particular characters, words, or patterns of characters. Abbreviations for "regular expression" include "regex" and "regexp". The concept of regular expressions was first popularized by utilities provided by Unix distributions, in particular the editor ed and the filter grep. A regular expression is written in a formal language that can be interpreted by a regular expression processor, which is a program that either serves as a parser generator or examines text and identifies parts that match the provided specification. Regular expressions are used by many text editors, utilities, and programming languages to search and manipulate text based on patterns.

Regular expressions consist of constants and operators that denote sets of strings and operations over these sets, respectively.

Simple Regular Expressions
Simple Regular Expressions is a syntax that may be used by historical versions of application programs, and may be supported within some applications for the purpose of providing backward compatibility.
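As a brief illustration of matching text against a pattern, the sketch below uses Python's standard re module. The pattern and the sample text are made up for this example.

```python
import re

# One or more digits, a hyphen, then one or more lowercase letters.
pattern = re.compile(r"[0-9]+-[a-z]+")

text = "items: 12-abc, x-1, 7-q"
matches = pattern.findall(text)                    # non-overlapping matches, left to right
whole = pattern.fullmatch("42-xyz") is not None    # does the entire string match?

print(matches, whole)
```

Here the constants are the character classes and the hyphen, and the operators are concatenation and the + repetition.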

Converting a Regular Expression into a Deterministic Finite Automaton
The task of a scanner generator, such as JLex, is to generate the transition tables or to synthesize the scanner program given a scanner specification (in the form of a set of REs). So it needs to convert REs into a single DFA. This is accomplished in two steps: first it converts REs into a non-deterministic finite automaton (NFA) and then it converts the NFA into a DFA.

An NFA is similar to a DFA but it also permits multiple transitions over the same character and transitions over ε. In the case of multiple transitions from a state over the same character, when we are at this state and we read this character, we have more than one choice; the NFA succeeds if at least one of these choices succeeds. The ε transition doesn't consume any input characters, so you may jump to another state for free.

Clearly DFAs are a subset of NFAs. But it turns out that DFAs and NFAs have the same expressive power. The problem is that when converting an NFA to a DFA we may get an exponential blowup in the number of states.

We will first learn how to convert an RE into an NFA. This is the easy part. There are only 5 rules, one for each type of RE:

[Figure: the five RE-to-NFA construction rules]

As can be shown inductively, the above rules construct NFAs with only one final state. For example, the third rule indicates that, to construct the NFA for the RE AB, we construct the NFAs for A and B, which are represented as two boxes with one start state and one final state for each box. Then the NFA for AB is constructed by connecting the final state of A to the start state of B using an empty transition.

For example, the RE (a|b)c is mapped to the following NFA:

[Figure: NFA for (a|b)c]
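The construction for (a|b)c can be sketched in a few lines. Only three rules are shown (single character, concatenation and alternation); the edge-list representation, the state numbering and the function names lit, concat, alt and accepts are assumptions of this sketch, and a real scanner generator would differ in detail.

```python
import itertools

EPS = None                   # label used for empty (epsilon) transitions
_ids = itertools.count()     # source of fresh state numbers


def lit(ch, edges):
    """NFA for a single character: start --ch--> final."""
    start, final = next(_ids), next(_ids)
    edges.append((start, ch, final))
    return start, final


def concat(a, b, edges):
    """NFA for AB: connect the final state of A to the start state of B."""
    edges.append((a[1], EPS, b[0]))
    return a[0], b[1]


def alt(a, b, edges):
    """NFA for A|B: a new start and a new final state around both machines."""
    start, final = next(_ids), next(_ids)
    for sub_start, sub_final in (a, b):
        edges.append((start, EPS, sub_start))
        edges.append((sub_final, EPS, final))
    return start, final


def accepts(nfa, edges, text):
    """Simulate the NFA by tracking the set of states it could be in."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            state = stack.pop()
            for src, label, dst in edges:
                if src == state and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    start, final = nfa
    current = closure({start})
    for ch in text:
        moved = {dst for src, label, dst in edges if src in current and label == ch}
        current = closure(moved)
    return final in current


edges = []
nfa = concat(alt(lit("a", edges), lit("b", edges), edges), lit("c", edges), edges)
print(accepts(nfa, edges, "ac"), accepts(nfa, edges, "bc"), accepts(nfa, edges, "ab"))
```

Every function returns exactly one start state and one final state, which is the property the text says can be shown inductively.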

The next step is to convert an NFA to a DFA (called subset construction). Suppose that you assign a number to each NFA state. The DFA states generated by subset construction have sets of numbers, instead of just one number. For example, a DFA state may have been assigned the set {5, 6, 8}. This indicates that arriving at the state labeled {5, 6, 8} in the DFA is the same as arriving at the state 5, the state 6, or the state 8 in the NFA when parsing the same input. (Recall that a particular input sequence, when parsed by a DFA, leads to a unique state, while when parsed by an NFA it may lead to multiple states.)

First we need to handle transitions that lead to other states for free (without consuming any input). These are the ε transitions. We define the closure of an NFA node as the set of all the nodes reachable by this node using zero, one, or more ε transitions. For example, the closure of node 1 in the left figure below is the set {1, 2}. The start state of the constructed DFA is labeled by the closure of the NFA start state. For every DFA state labeled by some set {s1, ..., sn} and for every character c in the language alphabet, you find all the states reachable by s1, s2, ..., or sn using c arrows and you union together the closures of these nodes. If this set is not the label of any other node in the DFA constructed so far, you create a new DFA node with this label. For example, node {1, 2} in the DFA above has an arrow to {3, 4, 5} for the character a, since the NFA node 3 can be reached by 1 on a and nodes 4 and 5 can be reached by 2. The b arrow for node {1, 2} goes to the error node, which is associated with an empty set of NFA nodes.

The following NFA recognizes (a|b)*(abb | a+b), even though it wasn't constructed with the above RE-to-NFA rules. It has the following DFA:

[Figure: NFA for (a|b)*(abb | a+b) and the corresponding DFA]
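The closure and subset construction steps described above can be sketched as follows. The small NFA used here is hand-written for (a|b)*ab and is not the one from the missing figures; the dictionary encoding and the function names are assumptions of this sketch.

```python
EPS = None  # label of an empty (epsilon) transition

# Hand-written NFA for (a|b)*ab; state 1 is the start state and state 4 is final.
NFA = {
    (1, EPS): {2},
    (2, "a"): {2, 3},
    (2, "b"): {2},
    (3, "b"): {4},
}
ALPHABET = "ab"


def closure(states):
    """All NFA states reachable from `states` using zero or more empty transitions."""
    result, stack = set(states), list(states)
    while stack:
        for nxt in NFA.get((stack.pop(), EPS), ()):
            if nxt not in result:
                result.add(nxt)
                stack.append(nxt)
    return frozenset(result)


def subset_construction(start):
    """Return (DFA start state, DFA transition table); DFA states are sets of NFA states."""
    dfa_start = closure({start})
    table, todo = {}, [dfa_start]
    while todo:
        current = todo.pop()
        if current in table:
            continue                      # this label already has a DFA node
        table[current] = {}
        for c in ALPHABET:
            moved = set()
            for s in current:
                moved |= NFA.get((s, c), set())
            target = closure(moved)       # an empty set would play the role of the error node
            table[current][c] = target
            todo.append(target)
    return dfa_start, table


dfa_start, table = subset_construction(1)
for state in sorted(table, key=sorted):
    print(sorted(state), {c: sorted(t) for c, t in table[state].items()})
```

For this NFA the construction yields four DFA states, and the only accepting one is the set that contains the NFA final state 4.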

Q 5. How is memory managed at the time of execution of a program? Discuss with reference to any language of your choice?

Ans : Processes in a system share the CPU and main memory with other processes. In order to manage memory more efficiently and with fewer errors, modern systems provide an abstraction of main memory known as virtual memory (VM). Virtual memory is an elegant interaction of hardware exceptions, hardware address translation, main memory, disk files, and kernel software that provides each process with a large, uniform, and private address space. Virtual memory provides three important capabilities. (1) It uses main memory efficiently by treating it as a cache for an address space stored on disk, keeping only the active areas in main memory, and transferring data back and forth between disk and memory as needed. (2) It simplifies memory management by providing each process with a uniform address space. (3) It protects the address space of each process from corruption by other processes.

Physical and Virtual Addressing
The main memory of a computer system is organized as an array of M contiguous byte-sized cells. Each byte has a unique physical address (PA). The first byte has an address of 0, the next byte an address of 1, the next byte an address of 2, and so on. Having the CPU access memory directly through these addresses is called physical addressing. When the CPU executes a load instruction, it generates an effective physical address (4 in this example) and passes it to main memory over the memory bus. The main memory fetches the 4-byte word starting at physical address 4 and returns it to the CPU, which stores it in a register.

Address Space
Modern processors use a form of addressing known as virtual addressing. With virtual addressing, the CPU accesses main memory by generating a virtual address (VA), which is converted to the appropriate physical address before being sent to the memory.
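The conversion of a virtual address into a physical one can be pictured with a toy lookup table. The page size, the table contents and the function name translate are all assumptions of this sketch; in a real system the translation is carried out by hardware together with the operating system.

```python
PAGE_SIZE = 4096  # bytes per page (assumed for this sketch)

# Hypothetical lookup table: virtual page number -> physical page number.
PAGE_TABLE = {0: 5, 1: 2}


def translate(virtual_address):
    """Convert a virtual address to a physical address using the lookup table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in PAGE_TABLE:
        # A real system would raise a fault here and let the operating system handle it.
        raise LookupError(f"no mapping for virtual page {page}")
    return PAGE_TABLE[page] * PAGE_SIZE + offset


print(translate(4), translate(4100))
```

Two processes can use the same virtual address 4 and still touch different physical memory, simply by having different tables.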

The task of converting a virtual address to a physical one is known as address translation. Like exception handling, address translation requires close cooperation between the CPU hardware and the operating system. Dedicated hardware on the CPU chip called the memory management unit (MMU) translates virtual addresses on the fly, using a look-up table stored in main memory whose contents are managed by the operating system.

Management of Memory in C++ Language:
C++ is a (weakly) object-oriented language that inherits manual memory management from C. The standard mechanisms for memory management in C++ are the operators new and delete. The language is notorious for fostering large numbers of memory management bugs, including:
• Using stack-allocated structures beyond their lifetimes.
• Using heap-allocated structures after freeing them.
• Neglecting to free heap-allocated objects when they are no longer required.
• Allocating insufficient memory for the intended contents.
• Accessing arrays with indexes that are out of bounds.
• Excessive copying by copy constructors.
• Unexpected sharing due to insufficient copying by copy constructors.

Manual memory management fits poorly with features such as exceptions and operator overloading. The most common solution is copying, since it's dangerous to point to an object which can die before we're done with it. An alternative solution to copying is using "smart" pointer classes, which can emulate automatic memory management by maintaining a reference count.

Q6. Discuss the following Terms:

a) Polymorphic Functions:
Polymorphism is a programming language feature that allows values of different datatypes to be handled using a uniform interface. Polymorphic functions are functions whose operands (actual parameters) can have more than one type.

Consider the type of the function sumlist that takes a list of numbers and adds them up. Using recursion, it must produce an answer for the empty list, for which the value is zero, and then for a list made by ``:'', for which it sums the tail of the list recursively and then adds the value of the head element:

    sumlist [] = 0
    sumlist (h:t) = h + sumlist t

The type of this function is [Int]->Int, because the argument is a list the elements of which are added and the sum is the result.

Now consider a function len that returns the length of a list. In fact len can be applied to any list. It does not perform any operation on the elements of the list; it just takes the list apart and counts its way along. It is said to be polymorphic, which means the same operation is applied to different types of value.
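The contrast between a function tied to one element type and a function that works for any element type can also be sketched in Python with type hints. The names length and sum_list are chosen for this sketch, and T is a type variable standing for "any element type"; Python does not enforce these hints at run time.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")  # stands for an arbitrary element type


def length(items: Sequence[T]) -> int:
    """Count the elements without inspecting them, so any element type works."""
    count = 0
    for _ in items:
        count += 1
    return count


def sum_list(items: Sequence[int]) -> int:
    """Add up the elements; this only makes sense for numbers."""
    total = 0
    for value in items:
        total += value
    return total


print(length([1, 2, 3]), length(["a", "b"]), length([[True], []]), sum_list([1, 2, 3]))
```

Because length never touches the elements themselves, the same code serves lists of integers, strings or nested lists.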

So len has types:

    [Int] -> Int
    [Char] -> Int
    [Bool] -> Int
    [[Char]] -> Int
    [[Int]] -> Int
    ...

and infinitely many more. The way Haskell types a polymorphic function is to use a type variable where the ``any'' type would be, so len is of type:

    len :: [a] -> Int

Type variables are ``a'', ``b'', ``c'' etc., and they can only be used in type descriptions. A type variable in a type means the function has a range of types, a polytype: those which can be obtained by substituting a type for the variable uniformly throughout the signature.

b) DAG Representation of Program
A directed acyclic graph (DAG) is a directed graph that contains no cycles. A rooted tree is a special kind of DAG and a DAG is a special kind of directed graph. For example, a DAG may be used to represent common subexpressions in an optimising compiler.

[Figure: tree and DAG for the expression a*b+f(a*b); in the DAG the common subexpression a*b appears as a single shared node]

The DAG Representation of Basic Blocks
Directed acyclic graphs (DAGs) give a picture of how the value computed by each statement in the basic block is used in the subsequent statements of the block.

Definition: a dag for a basic block is a directed acyclic graph with the following labels on nodes:
• Leaves are labeled with either variable names or constants. They are unique identifiers; from the operators we determine whether the l-value or the r-value is meant. Leaves represent the initial values of names and are written with the subscript 0.
• Interior nodes are labeled by an operator symbol. An interior node represents a computed value.
• Nodes are also (optionally) given a sequence of identifiers for labels. The identifiers in the sequence have that value.

Example of DAG Representation

Three address code:
    (1) t1 := 4*i
    (2) t2 := a[t1]
    (3) t3 := 4*i
    (4) t4 := b[t3]
    (5) t5 := t2 * t4
    (6) t6 := prod + t5
    (7) prod := t6
    (8) t7 := i + 1
    (9) i := t7
    (10) if i <= 20 goto (1)

Corresponding DAG:

[Figure: DAG for the block above. Leaves: prod, a, b, 4, i0, 1, 20. Interior nodes: * labeled t1, t3; [] labeled t2; [] labeled t4; * labeled t5; + labeled t6, prod; + labeled t7, i; <= labeled (1)]
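How the DAG for this block comes about can be sketched with a small node-sharing routine. The tuple encoding of statements, the "=" marker for copies and the function build_dag are assumptions of this sketch; the conditional jump in statement (10) is left out to keep it short.

```python
def build_dag(block):
    """Build a DAG for a basic block given as (target, op, arg1, arg2) tuples.

    Returns (nodes, labels): nodes maps a key, either ("leaf", name) or
    (op, left_id, right_id), to a node id; labels maps each identifier to the
    node that currently holds its value.
    """
    nodes, labels = {}, {}

    def node_for(name):
        if name not in labels:
            # First use of a name or constant: a leaf holding its initial value.
            labels[name] = nodes.setdefault(("leaf", name), len(nodes))
        return labels[name]

    for target, op, arg1, arg2 in block:
        if op == "=":
            labels[target] = node_for(arg1)       # a copy only adds a label
        else:
            key = (op, node_for(arg1), node_for(arg2))
            # An identical (op, left, right) node is reused instead of duplicated.
            labels[target] = nodes.setdefault(key, len(nodes))
    return nodes, labels


block = [
    ("t1", "*", "4", "i"),
    ("t2", "[]", "a", "t1"),
    ("t3", "*", "4", "i"),
    ("t4", "[]", "b", "t3"),
    ("t5", "*", "t2", "t4"),
    ("t6", "+", "prod", "t5"),
    ("prod", "=", "t6", None),
    ("t7", "+", "i", "1"),
    ("i", "=", "t7", None),
]
nodes, labels = build_dag(block)
print(len(nodes), labels["t1"] == labels["t3"], labels["prod"] == labels["t6"])
```

The second 4*i does not create a new node, so t1 and t3 end up as two labels on the same node, which is exactly the common subexpression the DAG is meant to expose.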