
LANGUAGE PROCESSORS LAB

CASE STUDY

NAME- R JITENDRA

ROLL NO -1210314845

SECTION:3/4B8
1. Explain the steps to design a two pass macro processor.
A) A Macro represents a commonly used group of statements in the source programming language.

A macro instruction (macro) is a notational convenience for the programmer.
o It allows the programmer to write a shorthand version of a program (modular programming).

The macro processor replaces each macro instruction with the corresponding group of source
language statements (expanding).
o Normally, it performs no analysis of the text it handles.
o It is not concerned with the meaning of the involved statements during macro expansion.

The design of a macro processor generally is machine independent!

Two new assembler directives are used in macro definition:
o MACRO: identifies the beginning of a macro definition
o MEND: identifies the end of a macro definition

Prototype for the macro:
o Each parameter begins with &


E.g. On SIC/XE, saving the contents of all registers requires a sequence of seven instructions.
With a macro, the programmer can instead write one statement like SAVEREGS.
A macro processor is not directly related to the architecture of the computer on which it is to run.
Macro processors can also be used with high-level programming languages, OS command
languages, etc.

Example of macro definitions:-


A program that is to be run on SIC system could invoke MACROS whereas a program to be run on
SIC/XE can invoke MACROX.
However, defining MACROS or MACROX does not define RDBUFF and WRBUFF. These definitions are
processed only when an invocation of MACROS or MACROX is expanded.
Macro Processor Algorithm and Data Structures:-
It is easy to design a two-pass macro processor.
Two-pass macro processor
o Pass 1: process all macro definitions
o Pass 2: expand all macro invocation statements
Problem
o It does not allow nested macro definitions (the body of a macro containing definitions of
other macros), because all macros would have to be defined during the first pass before any
macro invocations were expanded.
Solution
o One-pass macro processor

Because Pass 1 processes all macro definitions before Pass 2 expands any macro invocation
statement, a two-pass macro processor would not allow the body of one macro instruction to contain
the definitions of other macros: all macros would have to be defined during the first pass before
any macro invocations were expanded.
However, one pass may be enough. A one-pass macro processor alternates between macro definition
and macro expansion, so the definition of a macro must appear before any statements that invoke
that macro. With such a processor, the body of one macro can contain definitions of other macros.

MACROS (for SIC) contains the definitions of RDBUFF and WRBUFF written in SIC instructions.
MACROX (for SIC/XE) contains the definitions of RDBUFF and WRBUFF written in SIC/XE instructions.
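
The two passes described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a full SIC macro processor: the function names (pass1_collect_definitions, pass2_expand), the SWAP macro, and the rule that an invocation line carries no label are all hypothetical choices made for the example. It makes no attempt to handle nested macro definitions.

```python
def pass1_collect_definitions(lines):
    """Pass 1: move every MACRO ... MEND block into a definition table."""
    deftab = {}       # macro name -> (formal parameters, model statements)
    remaining = []    # source lines that are not part of a definition
    i = 0
    while i < len(lines):
        fields = lines[i].split()
        if len(fields) >= 2 and fields[1] == "MACRO":
            params = fields[2].split(",") if len(fields) > 2 else []
            body = []
            i += 1
            while lines[i].strip() != "MEND":
                body.append(lines[i].strip())
                i += 1
            deftab[fields[0]] = (params, body)
        else:
            remaining.append(lines[i])
        i += 1                      # also steps over the MEND line
    return deftab, remaining


def pass2_expand(deftab, lines):
    """Pass 2: replace each macro invocation by its substituted body."""
    output = []
    for line in lines:
        fields = line.split()
        if fields and fields[0] in deftab:
            params, body = deftab[fields[0]]
            args = fields[1].split(",") if len(fields) > 1 else []
            binding = dict(zip(params, args))
            for model in body:
                # Longest names first, so &AB is never clobbered by &A.
                for formal in sorted(binding, key=len, reverse=True):
                    model = model.replace(formal, binding[formal])
                output.append(model)
        else:
            output.append(line)
    return output


source = [
    "SWAP MACRO &X,&Y",
    "  LDA &X",
    "  LDX &Y",
    "  STA &Y",
    "  STX &X",
    "MEND",
    "START",
    "SWAP ALPHA,BETA",
    "END",
]
deftab, rest = pass1_collect_definitions(source)
expanded = pass2_expand(deftab, rest)
```

Because every definition is collected before any invocation is expanded, a macro may be invoked before its definition appears in the source, which a one-pass design does not allow.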

2. Describe the design of two pass assembler. State the difference between single pass and
two pass translation of assembler
A) Design of 2 Pass Assemblers:

Assembly language is a programming language that is one step away from machine language. Each
assembly language statement is typically translated into one machine instruction by the assembler.
Programmers must be well versed in the computer's architecture, and undocumented assembly language
programs are difficult to maintain. Assembly language is hardware dependent; there is a different
assembly language for each CPU series.

Pass 1 (analysis):
1. Separate the symbol, mnemonic opcode, and operand fields
2. Build the symbol table, saving the values assigned to all labels for use in Pass 2
3. Perform LC (location counter) processing, assigning addresses to all statements in the program
4. Perform some processing of assembler directives
5. Construct the intermediate representation

Pass 2 (synthesis of the target program):
1. Assemble instructions
2. Generate data values defined by BYTE, WORD
3. Perform processing of assembler directives not done in Pass 1
4. Write the object program and the assembly listing


Advanced Assembler Directives
o ORIGIN
o EQU

ORIGIN
Syntax: ORIGIN <Address Specification>

EQU
Syntax: <Symbol> EQU <Address Specification>
E.g. MAXLEN EQU 4096
Pass I of Assembler
Pass I uses the following data structures:
o OPTAB (table of mnemonic opcodes)
o SYMTAB (symbol table)
o LITTAB (literal table)
o POOLTAB (literal pool table)
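
As a rough illustration of LC processing in Pass I, the Python sketch below builds a SYMTAB and handles ORIGIN and EQU. Everything about the toy format is an assumption made for the example: statements are (label, opcode, operand) tuples, every instruction and DC occupies one word, DS n reserves n words, and an address specification is a constant, a symbol, or symbol+constant / symbol-constant. LITTAB and POOLTAB handling is left out.

```python
def evaluate(spec, symtab):
    """Evaluate an address specification: constant, symbol, or symbol+/-constant."""
    for op in "+-":
        if op in spec:
            symbol, constant = spec.split(op)
            base = symtab[symbol]
            return base + int(constant) if op == "+" else base - int(constant)
    return int(spec) if spec.isdigit() else symtab[spec]


def pass1(program):
    """Return (SYMTAB, intermediate code) for a list of (label, opcode, operand)."""
    symtab = {}
    intermediate = []
    lc = 0                                   # location counter
    for label, opcode, operand in program:
        if opcode == "START":
            lc = int(operand)
        elif opcode == "ORIGIN":
            lc = evaluate(operand, symtab)   # reset the location counter
        elif opcode == "EQU":
            symtab[label] = evaluate(operand, symtab)   # LC is not advanced
        else:
            if label:
                symtab[label] = lc
            intermediate.append((lc, opcode, operand))
            lc += int(operand) if opcode == "DS" else 1
    return symtab, intermediate


program = [
    ("",       "START",  "200"),
    ("LOOP",   "MOVER",  "AREG,A"),
    ("",       "ADD",    "AREG,B"),
    ("",       "ORIGIN", "LOOP+10"),
    ("NEXT",   "MOVEM",  "AREG,A"),
    ("MAXLEN", "EQU",    "4096"),
    ("A",      "DS",     "2"),
    ("B",      "DC",     "1"),
]
symtab, intermediate = pass1(program)
```

In this sketch a symbol used in an ORIGIN or EQU operand must already be in SYMTAB; a forward reference there raises KeyError rather than being resolved.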

The main reason why most assemblers use a 2-pass system is to address the problem of forward
references: references to variables or subroutines that have not yet been encountered when
parsing the source code. A strict 1-pass scanner cannot assemble source code which contains
forward references. Pass 1 of the assembler scans the source, determining the size and address of
all data and instructions; then pass 2 scans the source again, outputting the binary object code.
Some assemblers have been written to use a 1.5-pass scheme, whereby the source is scanned only
once, but any forward references are simply assumed to be of the largest size necessary to hold any
native machine data type. The unknown quantity is temporarily filled in as zero during pass 1 of the
assembler, and the forward reference is added to a fix-up list. After pass 1, the half pass goes
through the fix-up list and patches the output machine code with the values of all resolved forward
references. This can result in sub-optimal opcode construction but allows for a very fast assembly
phase.
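
The fix-up list idea described above can be sketched as follows. This is an illustrative Python model, not the code of any particular assembler: the (label, opcode, target) tuple format, the one-cell-per-instruction layout, and the name assemble_with_fixups are assumptions made for the example.

```python
def assemble_with_fixups(program):
    """Scan the source once, then patch forward references from a fix-up list."""
    symtab = {}     # label -> address (index of the instruction)
    code = []       # one [opcode, address] cell per instruction
    fixups = []     # (index into code, symbol) pairs still to be resolved
    for label, opcode, target in program:
        if label:
            symtab[label] = len(code)
        if target is None:
            code.append([opcode, None])          # instruction without an operand
        elif target in symtab:
            code.append([opcode, symtab[target]])
        else:
            code.append([opcode, 0])             # unknown value filled in as zero
            fixups.append((len(code) - 1, target))
    # The "half" pass: walk the fix-up list and patch the placeholders.
    for index, symbol in fixups:
        code[index][1] = symtab[symbol]
    return code


program = [
    ("START", "JMP", "END"),    # forward reference to END
    (None,    "NOP", None),
    ("LOOP",  "DEC", None),
    (None,    "JNZ", "LOOP"),   # backward reference, resolved immediately
    ("END",   "HLT", None),
]
code = assemble_with_fixups(program)
```

A symbol that is never defined raises KeyError during the patching loop, which is where a real assembler would report an undefined-symbol error.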

Differences between single-pass and two-pass translation of assembler:

Two-pass assembler                             | Single-pass assembler
-----------------------------------------------+----------------------------------------------
Performs two passes.                           | Performs a single pass.
In the first pass it collects labels and       | In the first pass itself it collects the
symbols, and in the second pass it assembles   | symbols and labels and assembles the
the instructions.                              | instructions.
It stores mnemonics and pseudo-ops separately, | It stores all mnemonics and pseudo-ops in a
i.e., in the MOT and POT respectively.         | single table, the MOT itself.
Symbols and literals are stored in the symbol  | All entries for symbols and literals are
table and the literal table respectively.      | entered into the symbol table only.

3. Define macro expansion counter. Mention its functions.


Macros are used to provide a program generation facility through macro expansion.
Many languages, like PL/I, C, Ada and C++, provide built-in facilities for writing macros.
Assembly languages also provide such facilities.
When a language does not support built-in facilities for writing macros, what is to be done?
A programmer may achieve an equivalent effect by using generalized preprocessors or
software tools like Awk of Unix.
A MACRO
Def: A macro is a unit of specification for program generation through expansion.
A macro consists of
a name,
a set of formal parameters and
a body of code.
The use of a macro name with a set of actual parameters is replaced by some code
generated from its body.
This is called macro expansion.
Two kinds of expansion can be identified.
Lexical expansion:
Lexical expansion implies replacement of a character string by another character string
during program generation.
Lexical expansion is to replace occurrences of formal parameters by corresponding
actual parameters.
Semantic expansion:
Semantic expansion implies generation of instructions tailored to the requirements of a
specific usage.
Semantic expansion is characterized by the fact that different uses of a macro can lead to
codes which differ in the number, sequence and opcodes of instructions.
Eg: Generation of type specific instructions for manipulation of byte and word operands.

EXAMPLE
The following sequence of instructions is used to increment the value in a memory word
by a constant.
1. Move the value from the memory word into a machine register.
2. Increment the value in the machine register.
3. Move the new value into the memory word.
Since the instruction sequence MOVE-ADD-MOVE may be used a number of times in a
program, it is convenient to define a macro named INCR.
Using lexical expansion, the macro call INCR A, B, AREG can lead to the generation of a
MOVE-ADD-MOVE instruction sequence to increment A by the value of B using AREG to
perform the arithmetic.
Use of semantic expansion can enable the instruction sequence to be adapted to the types
of A and B. For example, an INC instruction could be generated if A is a byte operand and B
has the value 1.
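
The INCR example can be sketched in Python to contrast the two kinds of expansion. The formal parameter names (MEM_VAL, INCR_VAL, REG), the MOVER/ADD/MOVEM mnemonics, and the mem_is_byte flag are assumptions chosen for the illustration; the text above only fixes the MOVE-ADD-MOVE shape and the INC special case.

```python
import re

INCR_FORMALS = ["MEM_VAL", "INCR_VAL", "REG"]
INCR_BODY = [
    "MOVER &REG, &MEM_VAL",
    "ADD &REG, &INCR_VAL",
    "MOVEM &REG, &MEM_VAL",
]


def lexical_expand(body, formals, actuals):
    """Replace each &FORMAL in the body by the corresponding actual parameter."""
    binding = dict(zip(formals, actuals))
    return [re.sub(r"&(\w+)", lambda m: binding[m.group(1)], stmt) for stmt in body]


def semantic_expand_incr(mem, inc, reg, mem_is_byte=False):
    """Pick the instruction sequence according to the operands' properties."""
    if mem_is_byte and inc == "1":
        return [f"INC {mem}"]          # tailored code for this specific usage
    return lexical_expand(INCR_BODY, INCR_FORMALS, [mem, inc, reg])


lexical = lexical_expand(INCR_BODY, INCR_FORMALS, ["A", "B", "AREG"])
tailored = semantic_expand_incr("A", "1", "AREG", mem_is_byte=True)
```

Lexical expansion always yields the same three statements with different operand strings, whereas the semantic version can change the number and opcodes of the generated instructions.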
MACRO EXPANSION
Macro call leads to macro expansion.
During macro expansion, the macro call statement is replaced by a sequence of assembly
statements.
How to differentiate between the original statements of a program and the statements
resulting from macro expansion?
Ans: Each expanded statement is marked with a + preceding its label field.
Two key notions concerning macro expansion are
A. Expansion time control flow: This determines the order in which model statements are
visited during macro expansion.
B. Lexical substitution: Lexical substitution is used to generate an assembly statement
from a model statement.
A. EXPANSION TIME CONTROL FLOW
The default flow of control during macro expansion is sequential.
In the absence of preprocessor statements, the model statements of a macro are visited
sequentially starting with the statement following the macro prototype statement and
ending with the statement preceding the MEND statement.
What can alter the flow of control during expansion?
A preprocessor statement can alter the flow of control during expansion such that
Conditional Expansion: some model statements are never visited during expansion, or
Expansion Time Loops: some model statements are repeatedly visited during expansion.
The flow of control during macro expansion is implemented using a macro expansion counter
(MEC)
ALGORITHM (MACRO EXPANSION)
1. MEC := statement number of the first statement following the prototype statement.
2. While the statement pointed to by MEC is not a MEND statement:
a. If it is a model statement then
i. Expand the statement.
ii. MEC := MEC + 1;
b. Else (i.e. a preprocessor statement)
i. MEC := new value specified in the statement.
3. Exit from macro expansion.
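
The algorithm above translates almost directly into code. The Python sketch below is one possible rendering under stated assumptions: statements are tuples tagged MODEL, SET, AIF, AGO or MEND, jump targets are given as statement numbers rather than sequencing symbols, and model statements use {NAME} placeholders instead of &NAME so that Python's str.format can do the substitution. The CLEAR macro is a hypothetical example of an expansion time loop.

```python
def expand_macro(body, parameters):
    """Expand one macro call, using MEC to control the order of visits."""
    env = dict(parameters)      # formal parameters and expansion time variables
    output = []
    mec = 1                     # body[0] is the prototype statement
    while body[mec][0] != "MEND":
        stmt = body[mec]
        if stmt[0] == "MODEL":
            output.append("+ " + stmt[1].format(**env))   # "+" marks expanded code
            mec += 1
        elif stmt[0] == "SET":                  # update an expansion time variable
            _, name, compute = stmt
            env[name] = compute(env)
            mec += 1
        elif stmt[0] == "AGO":                  # unconditional transfer
            mec = stmt[1]
        elif stmt[0] == "AIF":                  # conditional transfer
            _, condition, target = stmt
            mec = target if condition(env) else mec + 1
    return output


CLEAR = [
    ("PROTO", "CLEAR &X, &N"),
    ("SET", "M", lambda env: 0),
    ("MODEL", "MOVER AREG, ='0'"),
    ("MODEL", "MOVEM AREG, {X}+{M}"),            # statement 3: loop body
    ("SET", "M", lambda env: env["M"] + 1),
    ("AIF", lambda env: env["M"] != env["N"], 3),
    ("MEND",),
]
expanded = expand_macro(CLEAR, {"X": "B", "N": 3})
```

Here the AIF statement sends MEC back to statement 3 until the expansion time variable M reaches N, so the MOVEM model statement is visited three times.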
B. LEXICAL SUBSTITUTION
A model statement consists of 3 types of strings.
1. An ordinary string, which stands for itself.
2. The name of a formal parameter, which is preceded by the character &.
3. The name of a preprocessor variable, which is also preceded by the character &.
During lexical expansion, strings of type 1 are retained without substitution.
Strings of types 2 and 3 are replaced by the values of the formal parameters or
preprocessor variables.
Rules for determining the value of a formal parameter depend on the kind of parameter:
Positional Parameter
Keyword Parameter
Default specification of parameters
Macros with mixed parameter lists
Other uses of parameter
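
A small Python sketch may help show how positional, keyword and default parameters could be combined when a call is bound to a prototype. The function name bind_parameters, the NAME=value syntax for keyword actuals, and the use of None to mean "keyword parameter without a default" are assumptions for the illustration. It also assumes no actual parameter contains "=" for any other reason.

```python
def bind_parameters(positional, keyword_defaults, actuals):
    """Return {formal name: value} for one macro call.

    positional       -- formal positional parameter names, in order
    keyword_defaults -- {keyword parameter name: default value or None}
    actuals          -- actual parameters of the call, e.g. ["A", "B", "REG=BREG"]
    """
    values = dict(keyword_defaults)                 # start from the defaults
    positional_actuals = [a for a in actuals if "=" not in a]
    if len(positional_actuals) != len(positional):
        raise ValueError("wrong number of positional parameters")
    values.update(zip(positional, positional_actuals))   # matched by position
    for actual in actuals:
        if "=" in actual:                           # keyword parameter: matched by name
            name, value = actual.split("=", 1)
            if name not in keyword_defaults:
                raise ValueError(f"unknown keyword parameter {name}")
            values[name] = value
    missing = [name for name, value in values.items() if value is None]
    if missing:
        raise ValueError(f"no value for {missing}")
    return values


with_default = bind_parameters(["MEM_VAL", "INCR_VAL"], {"REG": "AREG"}, ["A", "B"])
overridden = bind_parameters(["MEM_VAL", "INCR_VAL"], {"REG": "AREG"},
                             ["A", "REG=BREG", "B"])
```

Positional actuals are matched purely by their order, so the keyword actual may appear anywhere in a mixed parameter list without disturbing them.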
Functions of Macro Call & Expansion:
The operation defined by a macro can be used by writing the macro name in the mnemonic field and
its operands in the operand field. Appearance of the macro name in the mnemonic field leads to a
macro call. The macro call statement is replaced by the sequence of statements comprising the
macro. This is known as macro expansion.
Macro Facilities
Use of AIF and AGO allows us to alter the flow of control during expansion.
Loops can be implemented using expansion time variables.

4) Explain the general format of macro prototype statement and macro-call.


Introduction

A) The format of a macro is a standard text file of thinBasic source statements stored in the SPFLite
MACROS folder.
For most users, the full path of this folder will be:

C:\Users\username\My Documents\SPFLite\MACROS
or
C:\Documents and Settings\username\My Documents\SPFLite\MACROS

The macro should be named macname.MACRO, where macname is the name with which you will
be invoking the macro.

The string "macname" is also what is returned by the Get_MacName$ function. As with everything
else in SPFLite, macro names are case insensitive.

Further, for line-command macros, the name must be even more restricted; the name cannot
contain any digits, because they will be confused with numeric line-command operands. For line
commands, the macro name should normally only contain letters to avoid problems. A few special
characters were found to be acceptable, such as $ and @. We have not exhaustively tested every
possible character, but if you can create a legal Windows file name with it, it will usually be valid as
a macro name, except for the characters + - * ? : . / and \.

Line Macro names and Block Mode


For single-line macros (e.g. do THIS to a single line) there is no problem: the line command is the
macro name.

When you want to create a macro that can act as a block mode command (like CC/CC) the following
methods are used:

1. Repeating the last character of the line command. Using the same example as above, the
block form of AX would be requested by entering AXX/AXX, and the macro still stored
as AX.MACRO. This applies also to longer macro names. For example the block mode
version of a macro called BOX would be BOXX, or for a macro called PRINT it would
be PRINTT.

2. Specifically tell SPFLite in what mode a macro name is to be treated. This is done by issuing
a SET command of the format:
SET MACROMODE.macname = BLOCK | LINE
If a SET MACROMODE has been issued for a macro name, the convention about repeating the last
letter does not apply. Any length or format of macro name can be set unconditionally to a specific
processing mode.
The only requirement SPFLite imposes on the format of a macro file is the macro prototype (or
header), which must be the first line in the macro. Like everything else in SPFLite, the prototype is
generally case-insensitive, unless it has default operands used as string values in FIND or CHANGE
commands, for instance.

Macro Prototype

The first line of a macro must always be the macro prototype. This is a simple thinBasic comment
statement of the format:
' macname.MACRO [ def-operand-1 def-operand-2 ... ]
The .MACRO part of the prototype is required. macname itself is optional, but should normally be
the name of the macro. The brackets shown mean the list of operands is optional; don't actually
code brackets here.

Currently, SPFLite does not demand that macname, if coded, match the actual file name of the
macro. However, for documentation purposes, it is best that you do make the name agree, just to
keep things straight as you develop and use your macros. It is possible that a later release of
SPFLite will require that the prototype name agree with the file name, so it's best to make them
agree now.

You may optionally enter default values for macro operands (if your macro uses command-line
operands). These are simple space-delimited strings which provide defaults if the relative operand
number is not overridden when the macro is called. For example, given the following prototype:

' sample.MACRO aaa bbb ccc

If the macro were invoked with the primary command line as Sample with no supplied arguments,
then if the executing macro requested the number of operands via Get_Arg_Count it would receive
3. Get_Arg$(1) would be aaa, Get_Arg$(2) would be bbb, and Get_Arg$(3) would be ccc.
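
The defaulting rule just described can be modelled in a few lines of Python. This is only a model of the documented behaviour, not SPFLite code: the function resolve_operands is hypothetical, and the treatment of calls that supply more operands than the prototype lists is an assumption.

```python
def resolve_operands(prototype, supplied):
    """Merge operands supplied on the command line with prototype defaults."""
    words = prototype.lstrip("' ").split()
    defaults = words[1:]                  # words[0] is "macname.MACRO"
    # A supplied operand overrides the default at the same relative position.
    return list(supplied) + defaults[len(supplied):]


no_args = resolve_operands("' sample.MACRO aaa bbb ccc", [])
one_arg = resolve_operands("' sample.MACRO aaa bbb ccc", ["xyz"])
```

With no supplied arguments the result has three entries (aaa, bbb, ccc), matching the count of 3 in the example above; supplying one argument replaces only the first default.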

More details on retrieving macro operands and working with them will be found in Accessing
Command Line Operands.

The macro instruction prototype statement (hereafter called the prototype statement) specifies the
mnemonic operation code and the format of all macro instructions that you use to call the macro
definition.
The prototype statement must be the second non-comment statement in every macro definition.
Both ordinary comment statements and internal comment statements are allowed between the
macro definition header and the macro prototype. Such comment statements are listed only with
the macro definition.
>>-+------------+--operation_field------------------------------>
   '-name_entry-'

>--+------------------------+----------------------------------><
   | .-,------------------. |
   | V                    | |
   '---symbolic_parameter-+-'

name_entry
is a variable symbol.

You can write this parameter, similar to the symbolic parameter, as the name entry of a macro
prototype statement. You can then assign a value to this parameter from the name entry in the
calling macro instruction.
If this parameter also appears in the body of a macro, it is given the value assigned to the parameter
in the name field of the corresponding macro instruction.

operation_field
is an ordinary symbol.

The symbol in the operation field of the prototype statement establishes the name by which a
macro definition must be called. This name becomes the operation code required in any macro
instruction that calls the macro.

Any operation code can be specified in the prototype operation field. If the entry is the same as an
assembler or a machine operation code, the new definition overrides the previous use of the
symbol. The same is true if the specified operation code has been defined earlier in the program as
a macro, in the operation code of a library macro, or defined in an OPSYN instruction as equivalent
to another operation code.

Macros that are defined inline may use any ordinary symbol, up to 63 characters in length, for the
operation field. However, operating system rules may prevent some of these macros from being
stored as member names in a library.

The assembler requires that the library member name and macro name be the same; otherwise
error diagnostic message ASMA126S Library macro name incorrect is issued.

symbolic_parameter
The symbolic parameters are used in the macro definition to represent the operands of the
corresponding macro instruction. A description of symbolic parameters appears under Symbolic
parameters.

The operand field in a prototype statement lets you specify positional or keyword parameters.
These parameters represent the values you can pass from the calling macro instruction to the
statements within the body of a macro definition.

The operand field of the macro prototype statement must contain 0 to 32000 symbolic parameters
separated by commas. They can be positional parameters or keyword parameters, or both.

If no parameters are specified in the operand field and if the absence of the operand entry is
indicated by a comma preceded and followed by one or more spaces, remarks are allowed.

The following is an example of a prototype statement:

&NAME MOVE &TO,&FROM

The prototype statement can be specified in one of the following three ways:

1. The normal way, with all the symbolic parameters preceding any remarks
2. An alternative way, allowing remarks for each parameter
3. A combination of the first two ways

The continuation rules for macro instructions are different from those for machine or assembler
instruction statements. This difference is important for those who write macros that override a
machine/assembler mnemonic.

The following examples show the normal statement format (&NAME1), the alternative statement
format (&NAME2), and a combination of both statement formats (&NAME3):

Name    Operation  Operand                        Comment                  Cont.

&NAME1  OP1        &OPERAND1,&OPERAND2,&OPERAND3  This is the normal       X
                                                  statement format
&NAME2  OP2        &OPERAND1,                     This is the alternative  X
                   &OPERAND2                      statement format
&NAME3  OP3        &OPERAND1,                     This is a combination    X
                   &OPERAND2,&OPERAND3,           of both                  X
                   &OPERAND4

Notes:

1. Any number of continuation lines is allowed. However, each continuation line must be
indicated by a non-space character in the column after the end column on the preceding
line.
2. For each continuation line, the operand field entries (symbolic parameters) must begin in
the continue column; otherwise, the whole line and any lines that follow are considered to
contain remarks.
No error diagnostic message is issued to indicate that operands are treated as
remarks in this situation. However, the FLAG(CONT) assembler option can be
specified so that the assembler issues warning messages if it suspects an error in a
continuation line.

3. The standard value for the continue column is 16 and the standard value for the end column
is 71.
4. A comma is required after each parameter except the last. If you code excess commas
between parameters, they are considered null positional parameters. No error diagnostic
message is issued.
5. One or more spaces is required between the operand and the remarks.
6. If the DBCS assembler option is specified, the continuation features outlined in Continuation
of double-byte data apply to continuation in the macro language. Extended continuation
may be useful if a macro keyword parameter contains double-byte data.
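
To make the parts of a prototype statement concrete, the Python sketch below splits a single-line prototype into its name entry, operation field, and symbolic parameters. It is a simplified illustration, not the assembler's parser: the function parse_prototype is hypothetical, continuation lines are ignored, the "lone comma" convention for an absent operand is not handled, and keyword parameters are assumed to be written as &KEY=default.

```python
def parse_prototype(statement):
    """Split a one-line prototype into (name_entry, operation, positional, keyword)."""
    tokens = statement.split()
    name_entry = None
    if tokens[0].startswith("&"):        # optional name entry is a variable symbol
        name_entry = tokens.pop(0)
    operation = tokens[0]
    operand = tokens[1] if len(tokens) > 1 else ""   # later tokens are remarks
    parameters = operand.split(",") if operand else []
    positional, keyword = [], {}
    for parameter in parameters:
        if "=" in parameter:
            key, default = parameter.split("=", 1)
            keyword[key] = default
        else:
            positional.append(parameter)   # "" is a null positional parameter
    return name_entry, operation, positional, keyword


move = parse_prototype("&NAME MOVE &TO,&FROM")
mixed = parse_prototype("COPY &A,,&MODE=FAST remark text")
```

The second call shows note 4 in action: the excess comma produces an empty (null) positional parameter rather than an error.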

MACRO CALLS:
A macro call consists of a name optionally followed by an actual parameter list. The number of
parameters in the actual parameter list must be the same as the number of formal parameters
specified in the definition of the macro. If the macro has no formal parameter list, its call must have
no actual parameter list.

macro_call            = name [actual_parameter_list]
actual_parameter_list = "@(" actpar { "@," actpar } "@)"
actpar                = expression |
                        ( whitespace "@""" expression "@""" whitespace )
whitespace            = { " " | eol }

FunnelWeb allows parameters to be passed directly, or delimited by special double quotes. Each
form is useful under different circumstances. Direct specification is useful where the parameters
are short and can be all placed on one line. Double quoted parameters allow whitespace on either
side (that is not considered part of the parameter) and are useful for laying out rather messy
parameters. Here are examples of the two forms.

@<Generic Loop@>@(
@"x:=1;@" @,
@"x<=10;@" @,
@"print "x=%u, x^2=%u",x,x*x;
x:=x+1;@+@"
@)

@<Colours@>@(red@,green@,blue@,yellow@)

The two forms may be mixed within the same parameter list.

Experience has shown that, in most FunnelWeb files, the vast majority of macros have no
parameters.

5. Write about Storage organization, stack allocation of space in detail?

Storage Organization

We are discussing storage organization from the point of view of the compiler, which must allocate
space for programs to be run. In particular, we are concerned with only virtual addresses and treat
them uniformly.

This should be compared with an operating systems treatment, where we worry about how to
effectively map this configuration to real memory. One OS difficulty with our allocation method
is that it uses a very large virtual address range. Perhaps the most straightforward solution
uses multilevel page tables.

Some systems impose various alignment constraints. For example, 4-byte integers might need to
begin at a byte address that is a multiple of four. Unaligned data might be illegal or might lower
performance. To achieve proper alignment, padding is often used.
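
The padding computation can be sketched as follows. This is an illustrative Python model of how a compiler might lay out fields, assuming each field is described by a hypothetical (name, size, alignment) triple and that the total size is rounded up to the strictest alignment; real ABIs may add further rules.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields):
    """Assign an offset to each (name, size, alignment) field, inserting padding."""
    offsets = {}
    offset = 0
    strictest = 1
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)   # skip padding bytes if needed
        offsets[name] = offset
        offset += size
        strictest = max(strictest, alignment)
    return offsets, align_up(offset, strictest)


offsets, total = layout([("c", 1, 1), ("i", 4, 4), ("s", 2, 2)])
```

In this example the 4-byte integer i cannot start at offset 1, so three padding bytes are inserted and it is placed at offset 4.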

Areas (Segments) of Memory

As mentioned above, there are various OS issues we are ignoring, for example the mapping from
virtual to physical addresses, and consequences of demand paging. In this class we simply allocate
memory segments in virtual memory and let the operating system worry about managing real memory.
In particular, we consider the following four areas of virtual memory.

1. The code (often called text in OS-speak) is fixed size and unchanging (self-modifying code is
long out of fashion). If there is OS support, the text could be marked execute only (or
perhaps read and execute, but not write). All other areas would be marked non-executable
(except for systems like lisp that execute their data).

2. There is likely data of fixed size whose need can be determined by the compiler by
examining the program's structure (and not by determining the program's execution
pattern). One example is global data. Storage for this data would be allocated in the next
area right after the code. A key point is that since the code and this area are of fixed size that
does not change during execution, they, unlike the next two areas, have no need for an
expansion region.

3. The stack is used for memory whose lifetime is stack-like. It is organized into activation
records that are created as a procedure is called and destroyed when the procedure exits. It
abuts the area of unused memory so can grow easily. Typically the stack is stored at the
highest virtual addresses and grows downward (toward small addresses). However, it is
sometimes easier in describing the activation records and their uses to pretend that the
addresses are increasing (so that increments are positive).

4. The heap is used for data whose lifetime is not as easily described. This data is allocated by
the program itself, typically either with a language construct, such as new, or via a library
function call, such as malloc(). It is deallocated either by another executable statement,
such as a call to free(), or automatically by the system.

Static versus Dynamic Storage Allocation

Much (often most) data cannot be statically allocated. Either its size is not known at compile time or
its lifetime is only a subset of the program's execution.

Modern languages, including newer versions of Fortran, support both static and dynamic allocation
of memory.

The advantage of supporting dynamic storage allocation is the increased flexibility and storage
efficiency possible (instead of declaring an array to have a size adequate for the largest data set,
just allocate what is needed). The advantage of static storage allocation is that it avoids the runtime
costs for allocation/deallocation and may permit faster code sequences for referencing the data.

An (unfortunately, all too common) error is a so-called memory leak, where a long-running program
repeatedly allocates memory that it fails to delete, even after it can no longer be referenced. To avoid
memory leaks and ease programming, several programming language systems employ automatic
garbage collection. That means the runtime system itself determines when data can no longer be
referenced and automatically deallocates it.

Stack Allocation of Space

The scheme to be presented achieves the following objectives.

1. Memory is shared by procedure calls that have disjoint durations. Note that we are not able
to determine disjointness by just examining the program itself (due to data dependent
branches among other issues).
2. The relative address of each (visible) nonlocal variable is constant throughout each
execution of a procedure. Note that during this execution the procedure can call other
procedures.

Activation Trees

Recall the Fibonacci sequence 1,1,2,3,5,8, ... defined by f(1)=f(2)=1 and, for n>2, f(n)=f(n-1)+f(n-2).
Consider the function calls that result from a main program calling f(5). First we give the more
general pseudocode, which calculates (very inefficiently) the first 10 Fibonacci numbers; then we
show the calls and returns that result from main calling f(5). They are listed in a linear fashion,
with indentation showing the nesting. Drawn as a tree, they form what is sometimes called the
activation tree or call tree.

    int a[10];
    int f(int n);                /* declared before its use in main */

    int main() {
        int i;
        for (i = 0; i < 10; i++) {
            a[i] = f(i + 1);     /* f(1) .. f(10): the first 10 Fibonacci numbers */
        }
        return 0;
    }

    int f(int n) {
        if (n < 3) return 1;
        return f(n-1) + f(n-2);
    }

System starts main
    enter f(5)
        enter f(4)
            enter f(3)
                enter f(2)
                exit f(2)
                enter f(1)
                exit f(1)
            exit f(3)
            enter f(2)
            exit f(2)
        exit f(4)
        enter f(3)
            enter f(2)
            exit f(2)
            enter f(1)
            exit f(1)
        exit f(3)
    exit f(5)
main ends

We can make the following observations about these procedure calls.

1. If an activation of p calls q, then that activation of p terminates no earlier than the activation
of q.

2. The order of activations (procedure calls) corresponds to a preorder traversal of the call
tree.
3. The order of de-activations (procedure returns) corresponds to a postorder traversal of the
call tree.

4. If execution is currently in an activation corresponding to a node N of the activation tree,
then the activations that are currently live are those corresponding to N and its ancestors in
the tree.

5. These live activations were called in the order given by the root-to-N path in the tree, and
the returns will occur in the reverse order.
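
These observations can be checked with a short Python sketch that records every call and return of f(5). The events list and the explicit stack argument are additions made for the illustration; they are not part of the original program. Each event also captures the stack of live activations at that moment.

```python
def f(n, events, stack):
    """Fibonacci as in the text, instrumented to log activations."""
    stack.append(n)                              # this activation becomes live
    events.append(("enter", n, tuple(stack)))
    result = 1 if n < 3 else f(n - 1, events, stack) + f(n - 2, events, stack)
    events.append(("exit", n, tuple(stack)))
    stack.pop()                                  # activation record is destroyed
    return result


events = []
value = f(5, events, [])
preorder = [n for kind, n, _ in events if kind == "enter"]    # order of calls
postorder = [n for kind, n, _ in events if kind == "exit"]    # order of returns
deepest = max(events, key=lambda event: len(event[2]))
```

The list of enter events is a preorder traversal of the call tree, the list of exit events is a postorder traversal, and the stack captured with each event is exactly the root-to-N path of live activations.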
