Q1. What is the difference between weakly typed and strongly typed languages? How does a compiler handle type checking and type conversion?

Ans:- A strongly typed language does not allow you to use one type as another. Strong typing generally means that variables have a well-defined type and that there are strict rules about combining variables of different types in expressions. For example, if A is an integer and B is a float, then the strict rule about A+B might be that A is cast to a float and the result returned as a float. If A is an integer and B is a string, then the strict rule might be that A+B is not valid. In short, strongly typed means that a value will not be automatically converted from one unrelated type to another.

Weak typing implies that the compiler does not enforce a typing discipline, or that enforcement can easily be subverted. C, for example, will in many cases accept a data element of the wrong type without complaint. Perl can use a string like "123" in a numeric context by automatically converting it into the integer 123.

Differences between Strongly Typed and Weakly Typed Languages:
The two terms have no single agreed definition. Each of the following criteria has been used to draw the line, and the criteria do not always agree with one another:
1. A language is strongly typed if type annotations are associated with variable names, rather than with values. If types are attached to values, it is weakly typed.
2. A language is strongly typed if it contains compile-time checks for type constraint violations. If checking is deferred to run time, it is weakly typed.
3. A language is strongly typed if there are compile-time or run-time checks for type constraint violations. If no checking is done, it is weakly typed.
4. A language is strongly typed if conversions between different types are forbidden. If such conversions are allowed, it is weakly typed.
5. A language is strongly typed if conversions between different types must be indicated explicitly. If implicit conversions are performed, it is weakly typed.
6. A language is strongly typed if there is no language-level way to disable or evade the type system. If there are casts or other type-evasive mechanisms, it is weakly typed.
7. A language is strongly typed if it has a complex, fine-grained type system with compound types. If it has only a few types, or only scalar types, it is weakly typed.
8. A language is strongly typed if the type of its data objects is fixed and does not vary over the lifetime of the object. If the type of a datum can change, the language is weakly typed.

Weak versus strong
The main difference, roughly speaking, between a strongly typed language and a weakly typed one is that a weakly typed one makes conversions between unrelated types implicitly, while a strongly typed one typically disallows implicit conversions between unrelated types.
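The distinction can be tried out in Python, which behaves as a strongly typed language in the sense described above. This is only a small sketch; the helper name add_strict is made up for illustration.

```python
def add_strict(a, b):
    """Return a + b, or the error text if Python refuses to combine the types."""
    try:
        return a + b
    except TypeError as exc:
        return f"TypeError: {exc}"


# int and float are related numeric types, so the int is widened to a float.
mixed_numeric = add_strict(1, 2.5)

# int and str are unrelated, so no implicit conversion is attempted.
unrelated = add_strict(1, "123")

# The programmer has to ask for the conversion explicitly.
explicit = 1 + int("123")

print(mixed_numeric)
print(unrelated)
print(explicit)
```

A weakly typed language such as Perl would accept the second call and silently treat "123" as a number.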

Furthermore, a strongly typed language requires an explicit conversion (by using the cast operator) between related types when there is a possibility of data loss, while a weakly typed one would carry out the conversion regardless.

Strongly typed is sometimes also taken to mean that the programmer provides the data type while declaring the variables, e.g. in Java, the variables are declared as int, short, long etc. Weakly typed then means that the programmer doesn't need to take care of data types while declaring variables, e.g. in Visual Basic any variable can be declared as "Dim a As Variant".

Handling of Type checking and type conversion in a compiler

Type checking
Type checking can occur at either compile time or run time. Statically typed languages, such as C++ and Java, do type checking at compile time. Dynamically typed languages, such as Smalltalk and Python, handle type checking at run time. As a dynamically typed language, ActionScript 3.0 has run-time type checking, but also supports compile-time type checking with a special compiler mode called strict mode. In strict mode, type checking occurs at both compile time and run time, but in standard mode, type checking occurs only at run time.

Dynamically typed languages offer tremendous flexibility when you structure your code, but at the cost of allowing type errors to manifest at run time. Statically typed languages report type errors at compile time, but at the cost of requiring that type information be known at compile time.

Compile-time type checking
Compile-time type checking is often favored in larger projects because as the size of a project grows, data type flexibility usually becomes less important than catching type errors as early as possible.

Run-time type checking
Run-time type checking occurs in ActionScript 3.0 whether you compile in strict mode or standard mode. Consider a situation in which the value 3 is passed as an argument to a function that expects an array. In strict mode, the compiler will generate an error, because the value 3 is not compatible with the data type Array. If you disable strict mode, and run in standard mode, the compiler does not complain about the type mismatch, but run-time type checking results in a run-time error.
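The situation of passing the value 3 to a function that expects an array can be reproduced in Python, which checks types only at run time. The sketch below is an illustration, not ActionScript behaviour; the function names are hypothetical and the isinstance test stands in for the check the run-time system performs.

```python
def total_length(items: list) -> int:
    # The annotation is not enforced by the interpreter; this explicit test
    # plays the role of the run-time type check.
    if not isinstance(items, list):
        raise TypeError(f"expected list, got {type(items).__name__}")
    return len(items)


def call_safely(value):
    """Call total_length and report a run-time type error as text."""
    try:
        return total_length(value)
    except TypeError as exc:
        return str(exc)


print(call_safely([1, 2, 3]))
print(call_safely(3))
```

Nothing complains when the module is loaded; the mismatch only shows up when the offending call actually executes, which is the cost of run-time checking described above.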

Type conversions
A type conversion is said to occur when a value is transformed into a value of a different data type. Type conversions can be either implicit or explicit. Implicit conversion, which is also called coercion, is sometimes performed at run time. For example, if the value 2 is assigned to a variable of the Boolean data type, the value 2 is converted to the Boolean value true before assigning the value to the variable. Explicit conversion, which is also called casting, occurs when your code instructs the compiler to treat a variable of one data type as if it belongs to a different data type. When primitive values are involved, casting actually converts values from one data type to another. To cast an object to a different type, you wrap the object name in parentheses and precede it with the name of the new type. For example, the following code takes a Boolean value and casts it to an integer:

    var myBoolean:Boolean = true;
    var myINT:int = int(myBoolean);
    trace(myINT); // 1

Implicit conversions
Implicit conversions happen at run time in a number of contexts:
• In assignment statements
• When values are passed as function arguments
• When values are returned from functions
• In expressions using certain operators, such as the addition (+) operator

For user-defined types, implicit conversions succeed when the value to be converted is an instance of the destination class or a class that derives from the destination class. If an implicit conversion is unsuccessful, an error occurs. For example, the following code contains a successful implicit conversion and an unsuccessful implicit conversion:

    class A {}
    class B extends A {}
    var objA:A = new A();
    var objB:B = new B();
    var arr:Array = new Array();
    objA = objB; // Conversion succeeds.
    objB = arr; // Conversion fails.

For primitive types, implicit conversions are handled by calling the same internal conversion algorithms that are called by the explicit conversion functions.
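The same ideas can be sketched in Python. The classes A and B mirror the ActionScript example; the helper assign_as is hypothetical and only imitates the rule that a conversion to a user-defined type succeeds when the value is an instance of the destination class or of a class derived from it.

```python
class A:
    pass


class B(A):
    pass


def assign_as(destination_class, value):
    """Accept value only if it is an instance of destination_class or a subclass."""
    if not isinstance(value, destination_class):
        raise TypeError(
            f"cannot convert {type(value).__name__} to {destination_class.__name__}"
        )
    return value


my_int = int(True)          # explicit conversion of a Boolean to an integer
my_bool = bool(2)           # any non-zero number converts to True
obj_a = assign_as(A, B())   # succeeds: B derives from A

try:
    assign_as(B, [])        # fails: a list is not a B
    conversion_failed = False
except TypeError:
    conversion_failed = True

print(my_int, my_bool, type(obj_a).__name__, conversion_failed)
```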

Explicit conversions
It's helpful to use explicit conversions, or casting, when you compile in strict mode, because there may be times when you do not want a type mismatch to generate a compile-time error. This may be the case when you know that coercion will convert your values correctly at run time. For example, when working with data received from a form, you may want to rely on coercion to convert certain string values to numeric values. The following code generates a compile-time error even though the code would run correctly in standard mode:

    var quantityField:String = "3";
    var quantity:int = quantityField; // compile time error in strict mode

If you want to continue using strict mode, but would like the string converted to an integer, you can use explicit conversion, as follows:

    var quantityField:String = "3";
    var quantity:int = int(quantityField); // Explicit conversion succeeds.

Q2. Discuss the process of code generation? Also elaborate on the problems faced in this process?

Ans:- Code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form (e.g., machine code) that can be readily executed by a machine (often a computer).

The input to the code generator typically consists of a parse tree or an abstract syntax tree. The tree is converted into a linear sequence of instructions, usually in an intermediate language such as three address code. Further stages of compilation may or may not be referred to as "code generation", depending on whether they involve a significant change in the representation of the program. (For example, a peephole optimization pass would not likely be called "code generation", although a code generator might incorporate a peephole optimization pass.)

Major tasks in code generation
Tasks which are typically part of a sophisticated compiler's "code generation" phase include:
• Instruction selection: which instructions to use.
• Instruction scheduling: in which order to put those instructions. Scheduling is a speed optimization that can have a critical effect on pipelined machines.
• Register allocation: the allocation of variables to processor registers.
• Debug data generation if required so the code can be debugged.
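The step from a tree to a linear sequence of three address code can be sketched as a post-order walk. This is a minimal illustration under assumed conventions: expressions are nested tuples (operator, left, right), leaves are strings, and temporaries are named t1, t2, and so on. None of these names come from a particular compiler.

```python
from itertools import count


def gen_tac(node, code, temps):
    """Emit code for the operands first, then one instruction for the operator."""
    if isinstance(node, str):      # leaf: a variable name or a constant
        return node
    op, left, right = node
    left_name = gen_tac(left, code, temps)
    right_name = gen_tac(right, code, temps)
    temp = f"t{next(temps)}"
    code.append(f"{temp} := {left_name} {op} {right_name}")
    return temp


def compile_assignment(target, expr):
    """Return the three address code for: target := expr."""
    code = []
    result = gen_tac(expr, code, count(1))
    code.append(f"{target} := {result}")
    return code


# x := a * b + c
tac = compile_assignment("x", ("+", ("*", "a", "b"), "c"))
for line in tac:
    print(line)
```

Each interior node of the tree yields exactly one instruction, so the output is linear in the size of the tree.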

In a compiler that uses an intermediate language, there may be two instruction selection stages: one to convert the parse tree into intermediate code, and a second phase much later to convert the intermediate code into instructions from the instruction set of the target machine. This second phase does not require a tree traversal; it can be done linearly, and typically involves a simple replacement of intermediate-language operations with their corresponding opcodes. However, this may not hold if the compiler is actually a language translator.

Runtime code generation
When code generation occurs at runtime, as in just-in-time compilation (JIT), it is important that the entire process be efficient with respect to space and time. For example, when regular expressions are interpreted and used to generate code at runtime, a non-deterministic finite state machine is often generated instead of a deterministic one, because usually the former can be created more quickly and occupies less memory space than the latter. Despite its generally generating less efficient code, JIT code generation can take advantage of profiling information that is available only at runtime.

CODE GENERATION
The final phase in our compiler model is the code generator. It takes as input an intermediate representation of the source program and produces as output an equivalent target program. The requirements traditionally imposed on a code generator are severe. The output code must be correct and of high quality, meaning that it should make effective use of the resources of the target machine. Moreover, the code generator itself should run efficiently.

ISSUES IN THE DESIGN OF A CODE GENERATOR
While the details are dependent on the target language and the operating system, issues such as memory management, instruction selection, register allocation, and evaluation order are inherent in almost all code generation problems.

Input to the Code Generator
The input to the code generator consists of the intermediate representation of the source program produced by the front end, together with information in the symbol table that is used to determine the run time addresses of the data objects denoted by the names in the intermediate representation. There are several choices for the intermediate language, including: linear representations such as postfix notation, three address representations such as quadruples, virtual machine representations such as stack machine code, and graphical representations such as syntax trees and dags.

Target Programs
The output of the code generator is the target program. The output may take on a variety of forms: absolute machine language, relocatable machine language, or assembly language. Producing an absolute machine language program as output has the advantage that it can be placed in a location in memory and immediately executed. A small program can be compiled and executed quickly.

Memory Management
Mapping names in the source program to addresses of data objects in run time memory is done cooperatively by the front end and the code generator. We assume that a name in a three-address statement refers to a symbol table entry for the name.

Instruction Selection
The nature of the instruction set of the target machine determines the difficulty of instruction selection. The uniformity and completeness of the instruction set are important factors. If the target machine does not support each data type in a uniform manner, then each exception to the general rule requires special handling.

Register Allocation
Instructions involving register operands are usually shorter and faster than those involving operands in memory. Therefore, efficient utilization of registers is particularly important in generating good code. The use of registers is often subdivided into two subproblems:
1. During register allocation, we select the set of variables that will reside in registers at a point in the program.
2. During a subsequent register assignment phase, we pick the specific register that a variable will reside in.

Q3. Discuss the techniques that can be used to reduce the size or running time of a program?

Ans:- The following techniques reduce the amount of data space required to represent objects in object-oriented programs. They optimize the representation of both the programmer-defined fields within each object and the header information used by the run-time system:

Field Reduction: A bit width analysis determines the range of values that each field may hold. The compiler then transforms the program to reduce the size of the field to the smallest type capable of storing that range of values.
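Field reduction comes down to picking the narrowest type that still covers the range of values found for a field. The sketch below assumes four signed integer widths as the candidate types; the type names and the function smallest_type are illustrative and are not taken from any particular compiler.

```python
def smallest_type(lo, hi):
    """Return the narrowest candidate type whose range covers [lo, hi]."""
    candidates = [
        ("int8", -2**7, 2**7 - 1),
        ("int16", -2**15, 2**15 - 1),
        ("int32", -2**31, 2**31 - 1),
        ("int64", -2**63, 2**63 - 1),
    ]
    for name, type_min, type_max in candidates:
        if type_min <= lo and hi <= type_max:
            return name
    raise ValueError("range does not fit in 64 bits")


print(smallest_type(0, 100))      # a percentage field
print(smallest_type(0, 255))      # too wide for a signed 8-bit type
print(smallest_type(-1, 40000))
```

A field declared as a 32-bit integer whose values were found to lie between 0 and 100 could therefore be stored in a single byte.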

Unread and Constant Field Elimination: If the bit width analysis finds that a field always holds the same constant value, the compiler eliminates the field. It removes each write to the field, and replaces each read with the constant value. Fields without executable reads are also removed.

Static Specialization: The analysis finds classes with fields whose values do not change after initialization, even though different instances of the object may have different values for these fields. It then generates specialized versions of each class which omit these fields, substituting accessor methods which return constant values.

Field Externalization: The analysis uses profiling to find fields that almost always have the same default value. It then removes these fields from their enclosing class, using a hash table to store only values of the field that differ from the default value. It replaces writes to the field with an insertion into the hash table (if the written value is not the default value) or a removal from the hash table (if the written value is the default value). It replaces reads with hash table lookups; if the object is not present in the hash table, the lookup simply returns the default value.

Class Pointer Compression: Each object header has a field, commonly called claz, which contains a pointer to the class data for that object, such as inheritance information and method dispatch tables. Rapid type analysis is used to compute an upper bound on the number of classes that the program may instantiate. The compiler uses the results of the analysis to replace the reference with a smaller offset into a table of pointers to the class data.

Byte Packing: All of the above transformations may reduce or eliminate the amount of space required to store each field in the object or object header. The byte packing algorithm arranges the fields in the object to minimize the object size.

Q4. Define a regular expression? How can it be converted into Finite Automata? Explain with the help of example.

Ans:- A regular expression is a set of pattern matching rules encoded in a string according to certain syntax rules. A regular expression provides a concise and flexible means to "match" (specify and recognize) strings of text, such as particular characters, words, or patterns of characters. Abbreviations for "regular expression" include "regex" and "regexp". The concept of regular expressions was first popularized by utilities provided by Unix distributions, in particular the editor ed and the filter grep. A regular expression is written in a formal language that can be interpreted by a regular expression processor, which is a program that either serves as a parser generator or examines text and identifies parts that match the provided specification. Regular expressions are used by many text editors, utilities, and programming languages to search and manipulate text based on patterns. Regular expressions consist of constants and operators that denote sets of strings and operations over these sets, respectively.

Simple Regular Expressions
Simple Regular Expressions is a syntax that may be used by historical versions of application programs, and may be supported within some applications for the purpose of providing backward compatibility.
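As a small illustration of searching and manipulating text with regular expressions, the sketch below uses Python's standard re module. The sample strings are made up for the example.

```python
import re

# Alternation and concatenation: an "a" or a "b", followed by a "c".
pattern = re.compile(r"(a|b)c")
candidates = ["ac", "bc", "cc", "abc"]
matches = [s for s in candidates if pattern.fullmatch(s)]

# Search: pull every run of digits out of a line of text.
numbers = re.findall(r"\d+", "order 66, aisle 7")

# Manipulate: collapse each run of whitespace into a single space.
tidy = re.sub(r"\s+", " ", "a   b\tc")

print(matches)
print(numbers)
print(tidy)
```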

Converting a Regular Expression into a Deterministic Finite Automaton
The task of a scanner generator, such as JLex, is to generate the transition tables or to synthesize the scanner program given a scanner specification (in the form of a set of REs). So it needs to convert REs into a single DFA. This is accomplished in two steps: first it converts REs into a non-deterministic finite automaton (NFA) and then it converts the NFA into a DFA.

An NFA is similar to a DFA but it also permits multiple transitions over the same character and transitions over ε. In the case of multiple transitions from a state over the same character, when we are at this state and we read this character, we have more than one choice; the NFA succeeds if at least one of these choices succeeds. The ε transition doesn't consume any input characters, so you may jump to another state for free.

Clearly DFAs are a subset of NFAs. But it turns out that DFAs and NFAs have the same expressive power. The problem is that when converting an NFA to a DFA we may get an exponential blowup in the number of states.

We will first learn how to convert a RE into an NFA. This is the easy part. There are only 5 rules, one for each type of RE:

[Figure: the five RE-to-NFA construction rules]

As can be shown inductively, the above rules construct NFAs with only one final state. For example, the third rule indicates that, to construct the NFA for the RE AB, we construct the NFAs for A and B, which are represented as two boxes with one start state and one final state for each box. Then the NFA for AB is constructed by connecting the final state of A to the start state of B using an empty transition.

For example, the RE (a|b)c is mapped to the following NFA:

[Figure: NFA for (a|b)c]
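Since the rule diagrams are not available here, the following sketch assumes the usual textbook construction for three of the rules (single character, concatenation AB, alternation A|B) and builds the NFA for (a|b)c from them. The class and method names are invented for illustration. Each fragment is a pair (start state, final state), so every fragment has exactly one final state.

```python
from itertools import count

EPS = None  # label used for empty (epsilon) transitions


class NFA:
    def __init__(self):
        self.edges = []        # list of (source, label, target)
        self.ids = count()

    def new_state(self):
        return next(self.ids)

    def literal(self, ch):
        start, final = self.new_state(), self.new_state()
        self.edges.append((start, ch, final))
        return start, final

    def concat(self, a, b):
        # Rule for AB: join the final state of A to the start state of B.
        self.edges.append((a[1], EPS, b[0]))
        return a[0], b[1]

    def union(self, a, b):
        # Rule for A|B: a new start and a new final state around both boxes.
        start, final = self.new_state(), self.new_state()
        for frag in (a, b):
            self.edges.append((start, EPS, frag[0]))
            self.edges.append((frag[1], EPS, final))
        return start, final

    def closure(self, states):
        """All states reachable using zero or more empty transitions."""
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in self.edges:
                if src == q and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    def accepts(self, frag, text):
        """Follow every possible choice at once; succeed if any reaches the final state."""
        current = self.closure({frag[0]})
        for ch in text:
            moved = {dst for src, label, dst in self.edges
                     if src in current and label == ch}
            current = self.closure(moved)
        return frag[1] in current


nfa = NFA()
a_or_b = nfa.union(nfa.literal("a"), nfa.literal("b"))
regex = nfa.concat(a_or_b, nfa.literal("c"))     # (a|b)c
results = {s: nfa.accepts(regex, s) for s in ["ac", "bc", "c", "abc", "a"]}
print(results)
```

The accepts method keeps track of the whole set of states the NFA could be in, which is the idea that the conversion to a DFA turns into a table.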

The next step is to convert an NFA to a DFA (called subset construction). Suppose that you assign a number to each NFA state. The DFA states generated by subset construction have sets of numbers, instead of just one number. For example, a DFA state may have been assigned the set {5, 6, 8}. This indicates that arriving to the state labeled {5, 6, 8} in the DFA is the same as arriving to the state 5, the state 6, or the state 8 in the NFA when parsing the same input. (Recall that a particular input sequence, when parsed by a DFA, leads to a unique state, while when parsed by an NFA it may lead to multiple states.)

First we need to handle transitions that lead to other states for free (without consuming any input). These are the ε transitions. We define the closure of an NFA node as the set of all the nodes reachable by this node using zero, one, or more ε transitions. For example, the closure of node 1 in the left figure below is the set {1, 2}. The start state of the constructed DFA is labeled by the closure of the NFA start state. For every DFA state labeled by some set {s1, s2, ..., sn} and for every character c in the language alphabet, you find all the states reachable by s1, s2, ..., or sn using c arrows and you union together the closures of these nodes. If this set is not the label of any other node in the DFA constructed so far, you create a new DFA node with this label.

[Figure: an NFA (left) and the DFA obtained from it by subset construction]

For example, node {1, 2} in the DFA above has an arrow to {3, 4, 5} for the character a, since the NFA node 3 can be reached by 1 on a and nodes 4 and 5 can be reached by 2. The b arrow for node {1, 2} goes to the error node, which is associated with an empty set of NFA nodes.

The following NFA recognizes (a|b)*(abb|a+b), even though it wasn't constructed with the above RE-to-NFA rules. It has the following DFA:

[Figure: NFA and DFA for (a|b)*(abb|a+b)]
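Subset construction can be sketched in a few lines. The NFA below is a hypothetical fragment chosen only to be consistent with the worked example in the text (the closure of node 1 is {1, 2}; node 3 is reached from 1 on a; nodes 4 and 5 are reached from 2 on a). It is not the full NFA from the missing figure, and the dictionary encoding of transitions is an assumption of this sketch.

```python
from collections import deque

EPS = None  # label used for empty (epsilon) transitions


def closure(nfa, states):
    """All NFA states reachable from `states` using zero or more empty transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for dst in nfa.get((q, EPS), ()):
            if dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)


def subset_construction(nfa, start, alphabet):
    """Return (DFA start state, DFA transition table); DFA states are sets of NFA states."""
    start_set = closure(nfa, {start})
    dfa = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue
        dfa[current] = {}
        for ch in alphabet:
            moved = set()
            for q in current:
                moved |= set(nfa.get((q, ch), ()))
            target = closure(nfa, moved)   # the empty set acts as the error node
            dfa[current][ch] = target
            if target not in dfa:
                queue.append(target)
    return start_set, dfa


# (state, label) -> set of target states
nfa = {(1, EPS): {2}, (1, "a"): {3}, (2, "a"): {4, 5}}
start, dfa = subset_construction(nfa, 1, "ab")
for state, arrows in dfa.items():
    print(sorted(state), {ch: sorted(t) for ch, t in arrows.items()})
```

Because DFA states are subsets of the NFA states, an NFA with n states can in the worst case produce 2^n DFA states, which is the exponential blowup mentioned earlier.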

Q5. How is memory managed at the time of execution of a program? Discuss with reference to any language of your choice?

Ans:- Processes in a system share the CPU and main memory with other processes. In order to manage memory more efficiently and with fewer errors, modern systems provide an abstraction of main memory known as virtual memory (VM). Virtual memory is an elegant interaction of hardware exceptions, hardware address translation, main memory, disk files, and kernel software that provides each process with a large, uniform, and private address space. Virtual memory provides three important capabilities. (1) It uses main memory efficiently by treating it as a cache for an address space stored on disk, keeping only the active areas in main memory, and transferring data back and forth between disk and memory as needed. (2) It simplifies memory management by providing each process with a uniform address space. (3) It protects the address space of each process from corruption by other processes.

Physical and Virtual Addressing
The main memory of a computer system is organized as an array of M contiguous byte-sized cells. Each byte has a unique physical address (PA). The first byte has an address of 0, the next byte an address of 1, the next byte an address of 2, and so on. Accessing memory directly by these addresses is called physical addressing. For example, when the CPU executes a load instruction that reads the word starting at physical address 4, it generates an effective physical address and passes it to main memory over the memory bus. The main memory fetches the 4-byte word starting at physical address 4 and returns it to the CPU, which stores it in a register.

Address Space
Modern processors use a form of addressing known as virtual addressing. With virtual addressing, the CPU accesses main memory by generating a virtual address (VA).
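The conversion from a virtual address to a physical address can be pictured as a table lookup. The sketch below is a deliberately simplified model: the 4096-byte page size, the contents of page_table, and the function name translate are all assumptions made for illustration, and real hardware performs this lookup itself rather than in software.

```python
PAGE_SIZE = 4096  # bytes per page (assumed for this sketch)

# Hypothetical look-up table: virtual page number -> physical page number.
page_table = {0: 5, 1: 2}


def translate(virtual_address, table=page_table):
    """Split the address into page number and offset, then swap the page number."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    if vpn not in table:
        # The page is not in main memory; a real system would raise a
        # hardware exception and let the kernel bring the page in from disk.
        raise LookupError(f"page fault: virtual page {vpn} is not in memory")
    return table[vpn] * PAGE_SIZE + offset


print(translate(4))       # virtual page 0, offset 4
print(translate(4100))    # virtual page 1, offset 4
```

Two processes can use the same virtual address and still reach different physical memory, because each process has its own table.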

The virtual address is converted to the appropriate physical address before being sent to the memory. The task of converting a virtual address to a physical one is known as address translation. Like exception handling, address translation requires close cooperation between the CPU hardware and the operating system. Dedicated hardware on the CPU chip called the memory management unit (MMU) translates virtual addresses on the fly, using a look-up table stored in main memory whose contents are managed by the operating system.

Management of Memory in C++ Language:
C++ is a (weakly) object-oriented language. It inherits manual memory management from C. The standard operators for memory management in C++ are new and delete; C's library functions malloc and free are also available. The language is notorious for fostering large numbers of memory management bugs, including:
• Using stack-allocated structures beyond their lifetimes.
• Using heap-allocated structures after freeing them.
• Neglecting to free heap-allocated objects when they are no longer required.
• Allocating insufficient memory for the intended contents.
• Accessing arrays with indexes that are out of bounds.
• Excessive copying by copy constructors.
• Unexpected sharing due to insufficient copying by copy constructors.

Manual memory management interacts poorly with features such as exceptions and operator overloading, since it's dangerous to point to an object which can die before we're done with it. The most common solution is copying. An alternative solution to copying is using "smart" pointer classes, which can emulate automatic memory management by maintaining a reference count.

Q6. Discuss the following terms:

a) Polymorphic Functions:
Polymorphism is a programming language feature that allows values of different datatypes to be handled using a uniform interface. Polymorphic functions are functions whose operands (actual parameters) can have more than one type.

Consider the type of the function sumlist that takes a list of numbers and adds them up. Using recursion, it must produce an answer for the empty list, where the value is zero, and then for a list made by ``:'', where it sums the tail of the list recursively and then adds the value of the head element:

    sumlist [] = 0
    sumlist (h:t) = h + sumlist t

The type of this function is [Int] -> Int, because the argument is a list the elements of which are added and the sum is the result.

A function len that counts the elements of a list is different. In fact len can be applied to any list. It does not perform any operation on the elements of the list; it just takes the list apart and counts its way along. It is therefore said to be polymorphic, which means the same operation applies to values of different types.
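The contrast between sumlist and len can be mirrored in Python with type hints. This is an illustrative sketch: the name length is used here instead of len to avoid shadowing Python's built-in function, and TypeVar plays the role of the "any" element type.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")  # stands for any element type


def sumlist(xs: Sequence[int]) -> int:
    """Monomorphic: adds the elements, so they have to be numbers."""
    if not xs:
        return 0
    head, tail = xs[0], xs[1:]
    return head + sumlist(tail)


def length(xs: Sequence[T]) -> int:
    """Polymorphic: never touches the elements, so any element type works."""
    if not xs:
        return 0
    return 1 + length(xs[1:])


print(sumlist([1, 2, 3]))
print(length([1, 2, 3]), length("abc"), length([[1], [2, 3]]), length([True, False]))
```

The single definition of length serves lists of integers, characters, lists, and Booleans alike.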

The way Haskell types a polymorphic function is to use a type variable where the ``any'' type would be, so len is of type:

    len :: [a] -> Int

Type variables are ``a'', ``b'', ``c'' etc., and they can only be used in type descriptions. A type variable in a type means the function has a range of types, a polytype: those which can be obtained by substituting a type for the variable uniformly throughout the signature. So len has types:

    [Int] -> Int
    [Char] -> Int
    [Bool] -> Int
    [[Char]] -> Int
    [[Int]] -> Int

and infinitely many more.

b) DAG Representation of Program
A directed acyclic graph (DAG) is a directed graph that contains no cycles. A rooted tree is a special kind of DAG and a DAG is a special kind of directed graph. For example, a DAG may be used to represent common subexpressions in an optimising compiler.

[Figure: the expression a*b+f(a*b) drawn as a tree, in which a*b appears twice, and as a DAG, in which both uses share a single a*b node]

The DAG Representation of Basic Blocks
Directed acyclic graphs (DAGs) give a picture of how the value computed by each statement in the basic block is used in the subsequent statements of the block.

Definition: a dag for a basic block is a directed acyclic graph with the following labels on nodes:
- Leaves are labeled with either variable names or constants. They are unique identifiers; from the operators we determine whether the l-value or the r-value is meant. Leaves represent initial values of names and are written with subscript 0.
- Interior nodes are labeled by an operator symbol. An interior node represents a computed value.
- Nodes are also (optionally) given a sequence of identifiers for labels.

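The identifiers in the sequence attached to a node have the value computed by that node.

Example of DAG Representation

Three address code:
    (1)  t1 := 4*i
    (2)  t2 := a[t1]
    (3)  t3 := 4*i
    (4)  t4 := b[t3]
    (5)  t5 := t2 * t4
    (6)  t6 := prod + t5
    (7)  prod := t6
    (8)  t7 := i + 1
    (9)  i := t7
    (10) if i <= 20 goto (1)

Corresponding DAG:

[Figure: DAG for the block above. The leaves are a, b, 4, i0, prod, 1 and 20. A single * node over 4 and i0 is labeled t1, t3; the [] nodes are labeled t2 and t4; the * node over them is labeled t5; the + node labeled prod adds prod and t5; the + node over i0 and 1 is labeled t7, i; the <= node compares it with 20 and targets (1).]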
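The way a DAG shares common subexpressions can be sketched by giving each distinct (operator, operand, operand) combination a single node. The encoding below is an assumption made for illustration: statements are tuples (target, operator, argument 1, argument 2), a copy such as prod := t6 uses the operator ":=", and the conditional jump in statement (10) is left out.

```python
def build_dag(block):
    """Return (nodes, current): node ids per key, and the node each name labels."""
    nodes = {}     # ("leaf", name) or (op, left id, right id) -> node id
    current = {}   # name -> id of the node that holds its latest value

    def node_for(name):
        if name in current:            # the name was assigned earlier in the block
            return current[name]
        key = ("leaf", name)           # otherwise it denotes an initial value
        if key not in nodes:
            nodes[key] = len(nodes)
        return nodes[key]

    for target, op, arg1, arg2 in block:
        if op == ":=":                 # copy: the target labels the same node
            current[target] = node_for(arg1)
            continue
        key = (op, node_for(arg1), node_for(arg2))
        if key not in nodes:           # reuse the node if the value already exists
            nodes[key] = len(nodes)
        current[target] = nodes[key]
    return nodes, current


block = [
    ("t1", "*", "4", "i"),
    ("t2", "[]", "a", "t1"),
    ("t3", "*", "4", "i"),
    ("t4", "[]", "b", "t3"),
    ("t5", "*", "t2", "t4"),
    ("t6", "+", "prod", "t5"),
    ("prod", ":=", "t6", None),
    ("t7", "+", "i", "1"),
    ("i", ":=", "t7", None),
]
nodes, current = build_dag(block)
print(len(nodes), current["t1"] == current["t3"])
```

Because t1 and t3 end up on the same node, 4*i only needs to be computed once.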
