
Basic Assembler Functions
Basic assembler functions 2
Function of an assembler
[Figure: an assembly language program is input to the assembler, which is attached to a database and produces machine language and other information for the loader.]
Fundamental assembler functions
Translate mnemonic operation codes to their machine language equivalents.
Assign machine addresses to symbolic labels used by the programmer.
The features and design of an assembler depend heavily on
the source language it translates and
the machine language it produces.
e.g., the instruction format and addressing modes
Assembler directives
The assembler can also process assembler
directives.
Assembler directives (or pseudo-instructions)
provide instructions to the assembler itself. They
are not translated into machine instructions.
Assembler directives (contd)
The SIC assembler language has the following assembler
directives.
START  Specify the name and starting address of the program.
END    Indicate the end of the source program and (optionally) specify the first executable instruction in the program.
BYTE   Generate a character or hexadecimal constant, occupying as many bytes as needed to represent the constant.
WORD   Generate a one-word integer constant.
RESB   Reserve the indicated number of bytes for a data area.
RESW   Reserve the indicated number of words for a data area.
Example of a SIC assembler language program
A forward reference is a reference to a label that is defined later in the program.
Example of a SIC assembler language program (contd)
Lines beginning with . contain comments only.
Example of a SIC assembler language program (contd)
Translation of source program to object code
The assembler must accomplish the following functions:
Convert mnemonic operation codes to their machine language equivalents.
  e.g., translate STL to 14
Process assembler directives.
Convert symbolic operands to their equivalent machine addresses.
  e.g., translate RETADR to 1033
  This requires handling forward references, which is why the assembler makes two passes:
    the first pass scans the source program for label definitions and assigns addresses;
    the second pass performs most of the actual translation.
Build the machine instructions in the proper format.
Convert the data constants specified in the source program into their internal machine representation.
  e.g., translate EOF to 454F46
Write the object program and the assembly listing.
  Object program format
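The two conversions quoted above (STL RETADR to 141033, and EOF to 454F46) can be sketched in a few lines of C++. This is only an illustration, assuming the standard SIC instruction layout (8-bit opcode, 1-bit index flag, 15-bit address) and ASCII character codes. The function names assemble_instruction and char_constant_to_hex are made up for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build one 24-bit SIC instruction as 6 hex digits.
// Assumed layout: 8-bit opcode, 1-bit index flag, 15-bit address.
std::string assemble_instruction(unsigned opcode, unsigned address, bool indexed = false) {
    assert(opcode <= 0xFFu && address <= 0x7FFFu);
    unsigned word = (opcode << 16) | (indexed ? 0x8000u : 0u) | address;
    char buf[7];
    std::snprintf(buf, sizeof buf, "%06X", word);
    return buf;
}

// Convert the characters of a constant such as C'EOF' (passed here
// without the C'...' wrapper) to two hex digits per character.
std::string char_constant_to_hex(const std::string& chars) {
    std::string out;
    char buf[3];
    for (unsigned char c : chars) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        out += buf;
    }
    return out;
}
```

With the opcode 14 for STL and the address 1033 for RETADR, assemble_instruction(0x14, 0x1033) yields "141033", and char_constant_to_hex("EOF") yields "454F46".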
Program with object codes
[Figure: program listing with object code. Annotations on the figure: RETADR, STL, a large memory space.]
Program with object codes (contd)
Program with object codes (contd)
Format of SIC object program
Header record:
Col. 1      H
Col. 2-7    Program name
Col. 8-13   Starting address of object program (hex)
Col. 14-19  Length of object program in bytes (hex)
Text record:
Col. 1      T
Col. 2-7    Starting address for object code in this record (hex)
Col. 8-9    Length of object code in this record in bytes (hex)
Col. 10-69  Object code, represented in hexadecimal (2 columns per byte of object code)
End record:
Col. 1      E
Col. 2-7    Address of first executable instruction in object program (hex)
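The record layouts above map directly onto fixed-width formatting. The following C++ sketch shows one possible way to emit the three record types. The function names are hypothetical, and the program name and addresses in the usage note are illustrative values, not taken from the slides.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Header record: H, program name (cols 2-7), start address (8-13), length (14-19).
std::string header_record(const std::string& name, unsigned start, unsigned length) {
    assert(name.size() <= 6);                  // the name field is 6 columns wide
    char buf[32];
    std::snprintf(buf, sizeof buf, "H%-6s%06X%06X", name.c_str(), start, length);
    return buf;
}

// Text record: T, start address (cols 2-7), byte count (8-9), object code (10-69).
std::string text_record(unsigned start, const std::string& object_hex) {
    // Two hex columns per byte, and at most 60 columns of object code.
    assert(object_hex.size() % 2 == 0 && object_hex.size() <= 60);
    char buf[16];
    std::snprintf(buf, sizeof buf, "T%06X%02X", start,
                  static_cast<unsigned>(object_hex.size() / 2));
    return buf + object_hex;
}

// End record: E, address of the first executable instruction (cols 2-7).
std::string end_record(unsigned first_exec) {
    char buf[8];
    std::snprintf(buf, sizeof buf, "E%06X", first_exec);
    return buf;
}
```

For example, header_record("COPY", 0x1000, 0x107A) gives "HCOPY  00100000107A", with the name padded with blanks to six columns.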
Object program
No object code corresponds to addresses 1033-2038.
This storage is reserved by the loader for use by the
program during execution.
A simple two-pass assembler
Pass 1 (define symbols)
  Assign addresses to all statements in the program.
  Save the values (addresses) assigned to all labels for use in Pass 2.
  Perform some processing of assembler directives.
    This includes processing that affects address assignment, such as determining the length of data areas defined by BYTE, RESW, etc.
Pass 2 (assemble instructions and generate object program)
  Assemble instructions.
    Translate operation codes.
    Look up addresses.
  Generate data values defined by BYTE, WORD, etc.
  Perform processing of assembler directives not done during Pass 1.
  Write the object program and the assembly listing.
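The division of labour between the two passes can be illustrated with a deliberately small C++ sketch. It assumes the source has already been parsed into a hypothetical Statement struct, that every statement is either a 3-byte SIC instruction or a WORD, and that indexed addressing and error handling are ignored. It is not the full algorithm, only enough to show why a forward reference is harmless once Pass 1 has run.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct Statement {           // hypothetical pre-parsed source line
    std::string label;       // empty if the line has no label
    std::string opcode;      // mnemonic or "WORD"
    std::string operand;     // symbol name, or a decimal value for WORD
};

// Pass 1: assign an address to each statement and save label addresses.
// Simplification: every statement occupies 3 bytes.
std::map<std::string, unsigned> pass1(const std::vector<Statement>& prog, unsigned start) {
    std::map<std::string, unsigned> symtab;
    unsigned locctr = start;
    for (const Statement& s : prog) {
        if (!s.label.empty()) symtab[s.label] = locctr;
        locctr += 3;
    }
    return symtab;
}

// Pass 2: translate operation codes and look up operand addresses.
std::vector<std::string> pass2(const std::vector<Statement>& prog,
                               const std::map<std::string, unsigned>& optab,
                               const std::map<std::string, unsigned>& symtab) {
    std::vector<std::string> object_code;
    for (const Statement& s : prog) {
        unsigned word;
        if (s.opcode == "WORD") {
            word = static_cast<unsigned>(std::stoul(s.operand)) & 0xFFFFFFu;
        } else {
            // at() throws if the mnemonic or symbol is unknown.
            word = (optab.at(s.opcode) << 16) | symtab.at(s.operand);
        }
        assert(word <= 0xFFFFFFu);             // must fit in one 24-bit word
        char buf[7];
        std::snprintf(buf, sizeof buf, "%06X", word);
        object_code.push_back(buf);
    }
    return object_code;
}
```

If the first statement is STL RETADR and RETADR is defined two statements later, Pass 1 still records its address, so Pass 2 can fill it in without backtracking.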
Assembler Algorithm and Data Structures
Internal data structures
The Operation Code Table (OPTAB)
  OPTAB is used to look up mnemonic operation codes and translate them to their machine language equivalents.
The Symbol Table (SYMTAB)
  SYMTAB is used to store the values (addresses) assigned to labels.
The Location Counter (LOCCTR)
  LOCCTR is a variable used to help in the assignment of addresses.
Internal data structures (contd)
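[Figure: the source program is read by Pass 1 of the assembler, which writes an intermediate file; Pass 2 of the assembler reads the intermediate file and produces the object program. OPTAB, SYMTAB, and LOCCTR are the data structures used by the passes.]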
OPTAB
In most cases, OPTAB is a static table.
OPTAB must contain the mnemonic operation code and its machine language equivalent.
In more complex assemblers, OPTAB also contains information about instruction format and length.
OPTAB is usually organized as a hash table, with the mnemonic operation code as the key.
OPTAB (contd)
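class optab_entry {
public:
    char mnemonic[6];      // mnemonic operation code, e.g. "STL" (the lookup key)
    unsigned char opcode;  // one-byte machine language equivalent, e.g. 0x14
};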
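As a rough illustration of the hash-table organization, the sketch below uses std::unordered_map keyed by mnemonic in place of a hand-written table of optab_entry objects. Only a handful of standard SIC opcodes are listed, and the names OPTAB, lookup_opcode, and opcode_of are chosen for this example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// A static table with a few standard SIC opcodes; a full OPTAB
// would list every mnemonic of the machine.
const std::unordered_map<std::string, unsigned char> OPTAB = {
    {"LDA", 0x00}, {"STA", 0x0C}, {"STL", 0x14}, {"JSUB", 0x48}, {"RSUB", 0x4C},
};

// Returns the opcode for a mnemonic, or -1 if it is not in OPTAB
// (which Pass 1 would report as an invalid operation code).
int lookup_opcode(const std::string& mnemonic) {
    auto it = OPTAB.find(mnemonic);
    return it == OPTAB.end() ? -1 : it->second;
}

// Variant for callers that have already validated the mnemonic.
unsigned char opcode_of(const std::string& mnemonic) {
    int op = lookup_opcode(mnemonic);
    assert(op >= 0);
    return static_cast<unsigned char>(op);
}
```

Because OPTAB is static, the table can be built once and never modified, so lookup cost is what matters and a hash table suits it well.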
SYMTAB
SYMTAB includes the name and value (address) for each label in the source program, together with flags to indicate error conditions.
  e.g., a symbol defined in two different places
This table may also contain information, such as type or length, about the data area or instruction labeled.
SYMTAB is usually organized as a hash table for efficiency of insertion and retrieval.
  The label is the key of SYMTAB.
  Labels are non-random keys (programmers often choose similar names such as LOOP1 and LOOP2), so the hash function should perform well on such keys.
SYMTAB (contd)
class symtab_entry {
public:
    char symbol[8];  // label name (the lookup key)
    int address;     // value (address) assigned to the label in Pass 1
};
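The error flag mentioned for SYMTAB (a symbol defined in two different places) can be sketched as follows. This is one possible design, not the deck's: std::unordered_map stands in for the hash table, and the SymbolInfo and SymbolTable names and their member functions are invented for the example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct SymbolInfo {
    int address;      // value assigned in Pass 1
    bool duplicate;   // error flag: label defined more than once
};

class SymbolTable {
public:
    // Record a label definition. On redefinition, keep the first
    // address, set the error flag, and return false.
    bool define(const std::string& label, int address) {
        assert(!label.empty());
        auto it = table_.find(label);
        if (it != table_.end()) {
            it->second.duplicate = true;
            return false;
        }
        table_[label] = SymbolInfo{address, false};
        return true;
    }

    // Returns the address of a label, or -1 if it is undefined.
    int lookup(const std::string& label) const {
        auto it = table_.find(label);
        return it == table_.end() ? -1 : it->second.address;
    }

    bool is_duplicate(const std::string& label) const {
        auto it = table_.find(label);
        return it != table_.end() && it->second.duplicate;
    }

private:
    std::unordered_map<std::string, SymbolInfo> table_;
};
```

Pass 1 would call define for every label it meets, and Pass 2 would call lookup for every symbolic operand.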
Intermediate file
Pass 1 usually generates an intermediate file that contains each source statement together with its assigned address, error indicators, etc.
This file is used as the input to Pass 2.
This file may also retain some results of operations performed during Pass 1, such as:
  the operand field, already scanned for symbols and addressing flags;
  pointers into OPTAB and SYMTAB for each operation code and symbol used.
LOCCTR
LOCCTR is a variable.
LOCCTR is initialized to the beginning address specified in the START statement.
After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR.
When a label is reached, the current value of LOCCTR gives the address to be associated with that label.
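How much is added to LOCCTR depends on the statement. The C++ sketch below computes that length for the SIC directives listed earlier, assuming 3-byte words and 3-byte machine instructions as on standard SIC. The function name statement_length and the raw-string operand format are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

// Number of bytes one statement adds to LOCCTR.
// `operand` is the raw operand field, e.g. "4096", "C'EOF'" or "X'F1'".
unsigned statement_length(const std::string& opcode, const std::string& operand) {
    if (opcode == "START" || opcode == "END") return 0;   // generate no storage
    if (opcode == "WORD") return 3;                       // one 3-byte word
    if (opcode == "RESW") return 3 * static_cast<unsigned>(std::stoul(operand));
    if (opcode == "RESB") return static_cast<unsigned>(std::stoul(operand));
    if (opcode == "BYTE") {
        // Expect C'...' or X'...'.
        assert(operand.size() >= 3 && operand[1] == '\'' && operand.back() == '\'');
        unsigned body = static_cast<unsigned>(operand.size()) - 3;  // text between the quotes
        // X constants pack two hex digits per byte; C constants use one byte per character.
        return operand[0] == 'X' ? (body + 1) / 2 : body;
    }
    return 3;   // any machine instruction on standard SIC
}
```

Pass 1 would record the current LOCCTR for the statement's label (if any) and then add statement_length(opcode, operand) to LOCCTR.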
Algorithm for Pass 1
Algorithm for Pass 1 (contd)
Algorithm for Pass 2
Algorithm for Pass 2 (contd)
[Figure annotation: trace an object program]
Table processing
Searching in a table
Key: symbol name
Methods:
  Linear search: compare the key against every entry in the table.
  Binary search: requires an ordered table, so the entries must be kept sorted.
  Hashing
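The three methods can be compared side by side in C++. In this sketch the table holds (symbol name, address) pairs. The function names and the Entry alias are assumptions for the example, and std::unordered_map is used as a ready-made hash table rather than a hand-written one.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, int>;   // (symbol name, address)

static bool name_less(const Entry& a, const Entry& b) { return a.first < b.first; }

// Linear search: compare the key with every entry until a match is found.
int linear_search(const std::vector<Entry>& table, const std::string& key) {
    for (const Entry& e : table)
        if (e.first == key) return e.second;
    return -1;
}

// Binary search: the table must already be sorted by symbol name.
int binary_search_table(const std::vector<Entry>& sorted, const std::string& key) {
    assert(std::is_sorted(sorted.begin(), sorted.end(), name_less));
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key,
        [](const Entry& e, const std::string& k) { return e.first < k; });
    return (it != sorted.end() && it->first == key) ? it->second : -1;
}

// Hashing: expected constant-time lookup, with no ordering required.
int hash_search(const std::unordered_map<std::string, int>& table, const std::string& key) {
    auto it = table.find(key);
    return it == table.end() ? -1 : it->second;
}
```

Linear search needs no preparation but costs time proportional to the table size. Binary search is logarithmic but pays for sorting, which is awkward while Pass 1 is still inserting labels. Hashing supports both insertion and retrieval efficiently, which is the reason given earlier for organizing SYMTAB as a hash table.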
