Lecture plan
Period :1
Introduction
An assembler is system software that accepts an assembly language program as its input
and produces its machine language equivalent along with information for the loader as its output.
It is a translator that converts the assembly language program into machine language program.
The structure of the assembler is given as
[Figure: Assembly language program -> Assembler -> Machine language program and extra information for loading. The assembler uses data structures, e.g. symbol table and opcode table.]
www.csetube.in
Machine language program
Each line of assembly language instruction is translated to machine language. The
machine language takes two forms depending on the architecture. They are
1. hexadecimal form
2. binary form
SIC takes hexadecimal form of machine code.
Data structures
An assembler builds or uses one or more data structures to perform the assembling process.
Two of these data structures are SYMTAB and OPTAB.
Basic assembler functions
Fundamental functions of an assembler
– A simple SIC assembler
Lecture plan
Period :2
Assembler Directives
• Basic assembler directives (pseudo instructions)
– START
Specify name and starting address for the program
– END
Indicate the end of the source program, and (optionally) the first executable
instruction in the program.
– BYTE
Generate character or hexadecimal constant, occupying as many bytes as
needed to represent the constant.
– WORD
Generate one-word integer constant
– RESB
Reserve the indicated number of bytes for a data area
– RESW
Reserve the indicated number of words for a data area
SIC Assembler
• Assembler's task
– Convert mnemonic operation codes to their machine language equivalents
Forward Reference
• Definition
– A reference to a label that is defined later in the program
• Solution
– Two passes
• First pass: scan the source program for label definition, address
accumulation, and address assignment
• Second pass: perform most of the actual instruction translation
Lecture plan
Period :3
• Functions of Pass 2:
– Assemble instructions (generate opcode and look up addresses)
– Generate data values defined by BYTE, WORD
– Perform processing of assembler directives not done during Pass 1
– Write the object program and the assembly listing
Assembler algorithm and data structures
• Operation Code Table (OPTAB)
• Symbol Table (SYMTAB)
• Location Counter (LOCCTR)
[Figure: Pass 1 and Pass 2 of the assembler, using OPTAB, SYMTAB and LOCCTR to produce the object program]
OPTAB
• Contents:
– Mnemonic operation codes
– Machine language equivalents
– Instruction format and length
• During pass 1:
– Validate operation codes
– Find the instruction length to increase LOCCTR
• During pass 2:
– Determine the instruction format
– Translate the operation codes to their machine language equivalents
• Implementation: a static hash table
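As a rough illustration of the OPTAB points above, the table can be sketched as a Python dictionary (itself a hash table) keyed by mnemonic. Only a few entries are shown. The opcode values are the standard SIC/XE ones, while the names `OPTAB` and `lookup` are assumptions made for this sketch and do not come from the notes.

```python
# A minimal OPTAB sketch: mnemonic -> (opcode, instruction format).
# Only a handful of SIC/XE instructions are listed; a real table is complete.
OPTAB = {
    "LDA": (0x00, 3),
    "STL": (0x14, 3),
    "J": (0x3C, 3),
    "JSUB": (0x48, 3),
    "STCH": (0x54, 3),
    "COMPR": (0xA0, 2),
    "CLEAR": (0xB4, 2),
}


def lookup(mnemonic):
    """Return (opcode, length in bytes), or None for an invalid mnemonic.

    A leading '+' selects the 4-byte extended format (format 4).
    """
    extended = mnemonic.startswith("+")
    entry = OPTAB.get(mnemonic.lstrip("+"))
    if entry is None:
        return None  # Pass 1 would flag an invalid operation code here
    opcode, fmt = entry
    if extended and fmt != 3:
        return None  # only format 3 instructions have an extended form
    length = 4 if extended else fmt
    return opcode, length
```

In this sketch Pass 1 would use only the returned length to advance LOCCTR, while Pass 2 would use the opcode and format to build the object code.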
LOCCTR
• A variable used to accumulate addresses for address assignment, i.e., LOCCTR gives the address of the associated label.
• LOCCTR is initialized to the beginning address specified in the START statement.
• After each source statement is processed during pass 1, the length of the instruction or data area is added to LOCCTR.
SYMTAB
• Contents:
– Label name
– Label address
– Flags (to indicate error conditions)
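The interplay of LOCCTR and SYMTAB in Pass 1 can be sketched as follows. This is a simplified illustration, not the full algorithm. It assumes standard SIC (every instruction is 3 bytes), skips opcode validation, and records errors in a list instead of a flags field. The function names and the tuple layout `(label, opcode, operand)` are assumptions made for the sketch.

```python
def statement_length(opcode, operand):
    """Number of bytes a statement adds to LOCCTR (standard SIC assumed)."""
    if opcode == "WORD":
        return 3
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "BYTE":
        body = operand[2:-1]  # text between the quotes of C'...' or X'...'
        return len(body) if operand[0] == "C" else (len(body) + 1) // 2
    return 3  # every standard SIC instruction occupies 3 bytes


def pass1(lines):
    """Build SYMTAB from (label, opcode, operand) tuples."""
    symtab, errors = {}, []
    locctr = 0
    for label, opcode, operand in lines:
        if opcode == "START":
            locctr = int(operand, 16)  # starting address is given in hex
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append("duplicate symbol: " + label)
            else:
                symtab[label] = locctr  # label gets the current LOCCTR
        locctr += statement_length(opcode, operand)
    return symtab, errors


PROGRAM = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),  # forward reference to RETADR
    (None, "J", "FIRST"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    (None, "END", "FIRST"),
]
```

Because Pass 1 only records where each label lives, the forward reference to RETADR causes no difficulty here; the address is looked up later, in Pass 2.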
Lecture plan
Period :4
Instruction Format and Addressing Mode
» PC-relative or Base-relative addressing: op m
» Indirect addressing: op @m
» Immediate addressing: op #c
» Extended format: +op m
» Index addressing: op m,x
» register-to-register instructions
» larger memory -> multi-programming (program allocation)
Translation
Register translation
» register name (A, X, L, B, S, T, F, PC, SW) and their values (0, 1, 2, 3, 4, 5, 6, 8, 9)
» preloaded in SYMTAB
Address translation
» Most register-memory instructions use program counter relative or base relative
addressing
» Format 3: 12-bit address field
– base-relative: 0~4095
– pc-relative: -2048~2047
» Format 4: 20-bit address field
» For Format 3, the assembler attempts PC-relative addressing first, then base-relative
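The choice between the two relative modes can be sketched as below. This is a minimal illustration that assumes the assembler tries PC-relative first and falls back to base-relative only when a base value is available. The function name `choose_displacement` and its return convention are hypothetical.

```python
def choose_displacement(target, pc, base=None):
    """Return (mode, disp) for a format 3 instruction, or None if neither fits.

    pc is the address of the next instruction; base is the value announced
    with BASE, or None if no base register value is available.
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF  # 12-bit two's complement field
    if base is not None and 0 <= target - base <= 4095:
        return "base", target - base
    return None  # caller must fall back to format 4 or report an error
```

Returning None, rather than raising, leaves the decision between extended format and an error message to the caller.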
Lecture plan
Period :5
Relative Addressing Modes
PC-relative
» e.g. 10 0000 FIRST STL RETADR 17202D
– displacement= RETADR - PC = 30-3 = 2D
» e.g. 40 0017 J CLOOP 3F2FEC
– displacement= CLOOP - PC = 6 - 1A = -14 = FEC
Base-relative
» base register is under the control of the programmer
» e.g. 12 LDB #LENGTH
» e.g. 13 BASE LENGTH
» e.g. 160 104E STCH BUFFER, X 57C003
– displacement= BUFFER - B = 0036 - 0033 = 3
» NOBASE is used to inform the assembler that the contents of the base register can no
longer be relied upon for addressing
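The object codes in the examples above can be reproduced with a short sketch that packs a format 3 instruction. It assumes the standard SIC/XE layout (6-bit opcode, flag bits n, i, x, b, p, e, then a 12-bit displacement) and the standard opcodes for STL (14), J (3C) and STCH (54). The function name `format3` is an assumption made for this illustration.

```python
def format3(opcode, disp, n=1, i=1, x=0, b=0, p=0, e=0):
    """Pack a format 3 instruction into a 6-digit hex string.

    Layout: 6-bit opcode, flag bits n i x b p e, 12-bit displacement.
    Negative displacements are stored in 12-bit two's complement.
    """
    word = (opcode & 0xFC) << 16          # upper 6 bits of the opcode byte
    word |= (n << 17) | (i << 16)         # n and i share the opcode byte
    word |= (x << 15) | (b << 14) | (p << 13) | (e << 12)
    word |= disp & 0xFFF
    return f"{word:06X}"


# The three examples from the notes:
stl = format3(0x14, 0x0030 - 0x0003, p=1)         # STL RETADR, PC-relative
jmp = format3(0x3C, 0x0006 - 0x001A, p=1)         # J CLOOP, PC-relative
stch = format3(0x54, 0x0036 - 0x0033, x=1, b=1)   # STCH BUFFER,X base-relative
```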
Address Translation
Immediate addressing
Program Relocation
Example Fig. 2.1
» Absolute program, starting address 1000
e.g. 55 101B LDA THREE 00102D
» Relocate the program to 2000
e.g. 55 201B LDA THREE 00202D
» Each absolute address should be modified
Example Fig. 2.5:
» Except for absolute addresses, the rest of the instructions need not be modified
– not a memory address (immediate addressing)
– PC-relative, Base-relative
» The only parts of the program that require modification at load time are those that
specify direct addresses
Lecture plan
Period :6
Relocatable Program
Modification record
» Col 1 M
» Col 2-7 Starting location of the address field to be modified, relative
to the beginning of the program
» Col 8-9 Length of the address field to be modified, in half-bytes
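How a loader might apply such a record can be sketched as follows. The sketch assumes the object program was assembled for address 0 and is held in a `bytearray`, and that an odd length means the field begins in the low-order half of its first byte. The function names and the sample instruction (an extended-format `4B101036` placed at offset 6) are illustrative assumptions.

```python
def apply_modification(code, start, half_bytes, load_address):
    """Add load_address to one address field of an object program.

    start and half_bytes correspond to columns 2-7 and 8-9 of the record.
    """
    nbytes = (half_bytes + 1) // 2
    field = int.from_bytes(code[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1   # covers only the address half-bytes
    value = (field & mask) + load_address
    field = (field & ~mask) | (value & mask)  # keep bits outside the field
    code[start:start + nbytes] = field.to_bytes(nbytes, "big")


def relocate(code, records, load_address):
    """Apply every Modification record, e.g. 'M00000705', to code."""
    for record in records:
        start = int(record[1:7], 16)
        half_bytes = int(record[7:9], 16)
        apply_modification(code, start, half_bytes, load_address)


# Six filler bytes, then a format 4 instruction whose 20-bit address is 01036.
code = bytearray(6) + bytearray.fromhex("4B101036")
relocate(code, ["M00000705"], 0x5000)
```

Only the 20-bit address field changes; the opcode and flag bits in the surrounding half-bytes are preserved by the mask.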
Object Code
Lecture plan
Period :7
Machine independent assembler features
Assembler features not closely related to machine architecture
• Literals
• Symbol-defining statements
• Expressions
• Program blocks
• Control sections and program linking
Literals
It is convenient for the programmer to be able to write the value of a constant operand as a part of
the instruction that uses it. Such an operand is called a literal.
...
1076 * =X'05' 05
• In this assembler language notation, a literal is identified with the prefix =, which is followed by a specification of the literal value.
With immediate addressing, the operand value is assembled as a part of the machine
instruction.
With a literal, the assembler generates the specified value as a constant at
some other memory location. The address of this generated constant is used
as the target address for the machine instruction.
Literal pool
All of the literal operands used in a program are gathered together into one or more literal
pools.
Where should the literal pool be placed?
93 LTORG
002D * =C'EOF' 454F46
• The assembler directive LTORG tells the assembler to generate a literal pool here.
Literal for current value of location counter
• The value of the location counter can be denoted by a literal operand *.
• BASE *
• LDB =*
Handling duplicate literal operands
The assembler should avoid storing duplicate literals.
Lecture plan
Period :8
Processing literal operands
Pass 1
• For each recognized literal operand, search LITTAB. If the literal is already present in the table, no action is needed; if it is not present, the literal is added to LITTAB without assigning its address.
• When an LTORG statement or the end of the program is encountered, the assembler scans LITTAB and assigns an address to each literal.
• Update the location counter to reflect the number of bytes occupied by each literal.
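The Pass 1 steps above can be sketched with a small literal table. This is a simplified illustration that handles only `=C'...'` and `=X'...'` literals. The class and method names are assumptions, and a real LITTAB would also store the operand value and length.

```python
def literal_length(literal):
    """Length in bytes of a literal written as =C'...' or =X'...'."""
    body = literal[3:-1]  # text between the quotes
    return len(body) if literal[1] == "C" else (len(body) + 1) // 2


class LiteralTable:
    def __init__(self):
        self.littab = {}  # literal text -> address, or None if unassigned

    def reference(self, literal):
        """Record a literal operand; duplicates are stored only once."""
        self.littab.setdefault(literal, None)

    def place_pool(self, locctr):
        """Called at LTORG or END: assign addresses, return the new LOCCTR."""
        for literal in list(self.littab):
            if self.littab[literal] is None:
                self.littab[literal] = locctr
                locctr += literal_length(literal)
        return locctr


table = LiteralTable()
table.reference("=C'EOF'")
table.reference("=X'05'")
table.reference("=C'EOF'")           # duplicate, no new entry
next_locctr = table.place_pool(0x002D)
```

Keying the table on the literal text is the simplest way to avoid duplicates, but it would treat `=C'EOF'` and `=X'454F46'` as different literals even though they have the same value.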
Pass 2
• Search LITTAB for each literal operand encountered.
• The data values specified by the literals in each literal pool are inserted at the appropriate places in the object program, in the same way as values generated by BYTE or WORD statements.
• If a literal value represents an address in the program, the assembler must generate the appropriate Modification record.
Symbol-defining statements
Assembler directives
EQU
ORG
Assembler directive: EQU
Most assemblers provide an assembler directive that allows the programmer to define symbols and specify their values.
Symbol EQU value
When the assembler encounters the EQU statement, it enters “symbol” into SYMTAB with the specified “value”.
Use of EQU
Establish symbolic names that can be used for improved readability in place of numeric
values.
+LDT #4096
X EQU 1
L EQU 2
Establish and use names that reflect the logical function of the registers in the program.
BASE EQU R1
COUNT EQU R2
INDEX EQU R3
Lecture plan
Period :9
Assembler directive: ORG
The assembler directive ORG is usually used to indirectly assign values to symbols.
ORG value
“Value” is a constant or an expression involving constants and previously defined
symbols.
When this statement is encountered, the assembler resets its location counter (LOCCTR) to
the specified value.
Use ORG for label definition
Suppose that we want to define a table with the following structure.
STAB (100 entries)
SYMBOL    VALUE     FLAGS
6 bytes   3 bytes   2 bytes
In some assemblers, the previous value of LOCCTR is automatically remembered, so we can
write
ORG
to return to the normal use of LOCCTR.
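The effect of ORG on the STAB table can be sketched with a small location counter that remembers its previous value. The class `LocationCounter`, the helper `define_stab`, and the starting address used in the test are assumptions made for illustration. The field sizes (6, 3 and 2 bytes, 100 entries) are taken from the table structure above.

```python
class LocationCounter:
    def __init__(self, start=0):
        self.value = start
        self.saved = None

    def reserve(self, nbytes):
        """Return the current address, then advance by nbytes."""
        address = self.value
        self.value += nbytes
        return address

    def org(self, value=None):
        """ORG value resets LOCCTR; ORG with no operand restores it."""
        if value is None:
            self.value = self.saved
        else:
            self.saved = self.value
            self.value = value


def define_stab(start):
    loc = LocationCounter(start)
    symtab = {}
    symtab["STAB"] = loc.reserve(100 * (6 + 3 + 2))  # whole table, 1100 bytes
    loc.org(symtab["STAB"])                          # ORG STAB
    symtab["SYMBOL"] = loc.reserve(6)
    symtab["VALUE"] = loc.reserve(3)
    symtab["FLAGS"] = loc.reserve(2)
    loc.org()                                        # ORG (restore LOCCTR)
    return symtab, loc.value
```

The labels SYMBOL, VALUE and FLAGS end up as offsets into the first entry of STAB, so other entries can be reached with indexed addressing.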
For an ordinary two-pass assembler, all symbols must be defined during Pass 1. Hence, the
following sequences could not be processed by an ordinary two-pass assembler.
All terms used to specify the value of the new symbol must have been defined previously in
the program.
DELTA RESW 1        disallowed

      ORG ALPHA
BYTE1 RESB 1
BYTE2 RESB 1
BYTE3 RESB 1
      ORG
ALPHA RESB 1        disallowed

ALPHA RESW 1
Lecture plan
Period :10
Expressions
Most assemblers allow the use of expressions whenever a single operand such as a label or
literal is permitted.
• Each such expression must be evaluated by the assembler to produce a single
operand address or value.
Assemblers generally allow arithmetic expressions formed according to the normal rules using
the operators +, -, *, and /.
• Individual terms in the expression may be
• constants,
• user-defined symbols, or
• special terms.
• The most common special term is the current value of the location
counter (often designated by *)
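Types of terms
Terms are either absolute (independent of program location, e.g. constants) or relative (e.g. labels, whose values are relative to the beginning of the program).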
Types of expressions
By the type of value produced, expressions can be classified as
1. Absolute expressions
• The value of an absolute expression is independent of the program location.
• An absolute expression may contain relative terms provided the relative terms occur in pairs and the terms in each such pair have opposite signs. No relative term can enter a multiplication or division operation.
• e.g. MAXLEN EQU BUFEND-BUFFER
2. Relative expressions
• The value of a relative expression is relative to the beginning address of the object
program.
• A relative expression is one in which all of the relative terms except one can be paired as described above. The remaining unpaired term must have a positive sign. No relative term can enter a multiplication or division operation.
Expressions that are neither relative nor absolute should be flagged by the assembler as errors.
Determining types of expressions
Add a Type field to SYMTAB:
Symbol   Type   Value
MAXLEN   A      1000
BUFEND   R      1036
BUFFER   R      0036
RETADR   R      0030
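With a type recorded for each symbol, the pairing rule can be checked by counting signed relative terms. The sketch below handles only sums and differences, since relative terms may not enter multiplication or division anyway. The function name and the `(sign, term)` representation are assumptions made for illustration.

```python
# SYMTAB entries with the added Type field: symbol -> (type, value).
SYMTAB = {
    "MAXLEN": ("A", 0x1000),
    "BUFEND": ("R", 0x1036),
    "BUFFER": ("R", 0x0036),
    "RETADR": ("R", 0x0030),
}


def expression_type(terms, symtab):
    """Classify a sum of signed terms as 'A' (absolute) or 'R' (relative).

    terms is a list of (sign, term) pairs with sign +1 or -1; a term is
    either an integer constant or the name of a symbol in symtab.
    Raises ValueError for an expression that is neither.
    """
    net = 0  # relative terms with '+' minus relative terms with '-'
    for sign, term in terms:
        if isinstance(term, int):
            continue  # constants are absolute
        if symtab[term][0] == "R":
            net += sign
    if net == 0:
        return "A"  # all relative terms pair off with opposite signs
    if net == 1:
        return "R"  # exactly one unpaired term, with a positive sign
    raise ValueError("expression is neither absolute nor relative")
```

For example, BUFEND-BUFFER has a net count of zero and is therefore absolute, which is why MAXLEN carries type A in the table.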
Program blocks
Assembler directive: USE
USE indicates which portions of the source program belong to the various blocks.
Lecture plan
Period :11
Control section and program linking
Control section
i. A control section is a part of the program that maintains its identity after assembly.
ii. Each control section can be loaded and relocated independently of the others.
iii. Different control sections are most often used for subroutines or other logical
subdivisions of a program.
Assembler directive: CSECT
CSECT: signals the start of a new control section.
The assembler establishes a separate location counter (initialized as 0) for each control section.
Symbols used in a control section may be defined elsewhere, in another control section.
Lecture plan
Period :12
One pass assemblers and Multi pass assemblers
One-Pass Assemblers
Scenario for one-pass assemblers
Generate their object code in memory for immediate execution – load-and-go
assembler.
External storage for the intermediate file between two passes is slow or is
inconvenient to use
Main problem - Forward references
i. Data items
ii. Labels on instructions
Solution
i. Require that all areas be defined before they are referenced.
ii. It is possible, although inconvenient, to do so for data items.
iii. Forward jump to instruction items cannot be easily eliminated.
Omits the operand address if the symbol has not yet been defined.
Enters this undefined symbol into SYMTAB and indicates that it is undefined.
Adds the address of this operand field to a list of forward references associated with the
SYMTAB entry.
When the definition for the symbol is encountered, scans the reference list and inserts the
address.
At the end of the program, reports an error if there are still SYMTAB entries marked as
undefined symbols.
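The forward-reference handling just described can be sketched as follows. The sketch models memory as a dictionary from operand-field location to the address stored there, which is a simplification. The class name, method names, and the sample addresses in the usage lines are assumptions made for illustration.

```python
class OnePassSymtab:
    def __init__(self):
        self.defined = {}  # symbol -> address
        self.pending = {}  # undefined symbol -> list of operand locations

    def reference(self, symbol, operand_location, memory):
        """Assemble an operand that names symbol."""
        if symbol in self.defined:
            memory[operand_location] = self.defined[symbol]
        else:
            memory[operand_location] = None  # operand address omitted for now
            self.pending.setdefault(symbol, []).append(operand_location)

    def define(self, symbol, address, memory):
        """Record a definition and patch every earlier forward reference."""
        self.defined[symbol] = address
        for location in self.pending.pop(symbol, []):
            memory[location] = address

    def undefined_symbols(self):
        """Symbols still undefined at the end of the program (errors)."""
        return sorted(self.pending)


memory = {}
symtab = OnePassSymtab()
symtab.reference("RDREC", 0x2013, memory)   # forward reference
symtab.reference("RDREC", 0x201F, memory)   # another one
symtab.define("RDREC", 0x203D, memory)      # both fields are patched here
symtab.reference("WRREC", 0x2060, memory)   # never defined in this example
```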
Lecture plan
Period :13
Multi-Pass Assemblers
For a two pass assembler, forward references in symbol definition are not allowed:
ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1
Symbol definition must be completed in pass 1.
Prohibiting forward references in symbol definition is not a serious inconvenience.
Forward references tend to create difficulty for a person reading the program.
Implementation
For a forward reference in symbol definition, we store in the SYMTAB:
i. The symbol name
iv. The undefined symbol (marked with a flag *) associated with a list of symbols
IMPLEMENTATION EXAMPLE
MASM assembler
SPARC assembler
MASM assembler
Since the x86 system views memory as a collection of segments, a MASM assembler language program is written as a collection of segments.
During program execution, segments are addressed via the x86 segment registers.
1. near jump
2. far jump