Assemblers
In the first half of the chapter we use a simple assembly language to illustrate features
of assembly languages and techniques used in assemblers. In this language, each
statement has two operands. The first operand is always a register, which can be any
one of AREG, BREG, CREG and DREG. The second operand refers to a memory word
using a symbolic name and an optional displacement. (Note that indexing is not
permitted.)
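Opcode  Mnemonic  Remarks
00      STOP      Stop execution
01      ADD       First operand is modified; condition code is set
02      SUB       First operand is modified; condition code is set
03      MULT      First operand is modified; condition code is set
04      MOVER     Register ← memory move
05      MOVEM     Memory ← register move
06      COMP      Sets condition code
07      BC        Branch on condition
08      DIV       Analogous to SUB
09      READ      First operand is not used
10      PRINT     First operand is not used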
Figure 4.1 lists the mnemonic opcodes for machine instructions. The MOVE instructions
move a value between a memory word and a register. In the MOVER instruction the
second operand is the source operand and the first operand is the target operand.
The converse is true for the MOVEM instruction. All arithmetic is performed in a
register (i.e. the result replaces the contents of a register) and sets a condition
code. A comparison instruction sets a condition code analogous to a subtract
instruction without affecting the values of its operands. The condition code can be
tested by a Branch on Condition (BC) instruction. The assembly statement
corresponding to it has the format
BC <condition code spec>, <memory address>
81 Systems Programming & Operating Systems
It transfers control to the memory word with the address <memory address> if the
current value of condition code matches <condition code spec>. For simplicity, we
assume <condition code spec> to be a character string with obvious meaning, e.g.
GT, EQ, etc. A BC statement with the condition code spec ANY implies unconditional
transfer of control. In a machine language program, we show all addresses and
constants in decimal rather than in octal or hexadecimal.
Figure 4.2 shows the machine instruction format. The opcode, register operand
and memory operand occupy 2, 1 and 3 digits, respectively. The sign is not a part
of the instruction. The condition code specified in a BC statement is encoded into
the first operand position using the codes 1-6 for the specifications LT, LE, EQ, GT,
GE and ANY, respectively. Figure 4.3 shows an assembly language program and an
equivalent machine language program.
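The instruction format described above can be illustrated with a minimal Python sketch. The function and dictionary names are hypothetical, and the register codes 1-4 for AREG-DREG are taken from the intermediate code discussion later in the chapter; only the 2, 1 and 3 digit field widths and the condition codes 1-6 come from the paragraph above.

```python
# Hypothetical lookup tables for the first-operand field.
REGISTER_CODES = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}
CONDITION_CODES = {"LT": 1, "LE": 2, "EQ": 3, "GT": 4, "GE": 5, "ANY": 6}


def encode_instruction(opcode: int, first_operand: int, address: int) -> str:
    """Format an instruction as sign, 2-digit opcode, 1-digit operand, 3-digit address."""
    if not (0 <= opcode <= 99 and 0 <= first_operand <= 9 and 0 <= address <= 999):
        raise ValueError("field does not fit the 2/1/3 digit format")
    return f"+ {opcode:02d} {first_operand} {address:03d}"


def encode_bc(condition: str, address: int) -> str:
    """Encode a BC statement: the condition code occupies the first operand position."""
    return encode_instruction(7, CONDITION_CODES[condition], address)


print(encode_bc("LE", 104))                                   # BC LE, AGAIN
print(encode_instruction(4, REGISTER_CODES["BREG"], 115))     # MOVER BREG, <word 115>
```

The sign is printed separately from the six digits because, as stated above, it is not a part of the instruction.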
        START  101
        READ   N             101) + 09 0 113
        MOVER  BREG, ONE     102) + 04 2 115
        MOVEM  BREG, TERM    103) + 05 2 116
AGAIN   MULT   BREG, TERM    104) + 03 2 116
        MOVER  CREG, TERM    105) + 04 3 116
        ADD    CREG, ONE     106) + 01 3 115
        MOVEM  CREG, TERM    107) + 05 3 116
        COMP   CREG, N       108) + 06 3 113
        BC     LE, AGAIN     109) + 07 2 104
        MOVEM  BREG, RESULT  110) + 05 2 114
        PRINT  RESULT        111) + 10 0 114
        STOP                 112) + 00 0 000
N       DS     1             113)
RESULT  DS     1             114)
ONE     DC     '1'           115) + 00 0 001
TERM    DS     1             116)
        END
2. Declaration statements
3. Assembler directives.
[Label] DS <constant>
[Label] DC '<value>'
The DS (short for declare storage) statement reserves areas of memory and
associates names with them. Consider the following DS statements:
A DS 1
G DS 200
The first statement reserves a memory area of 1 word and associates the name A with
it. The second statement reserves a block of 200 memory words. The name G is
associated with the first word of the block. Other words in the block can be accessed
through offsets from G, e.g. G+5 is the sixth word of the memory block, etc.
The DC (short for declare constant) statement constructs memory words containing
constants. The statement
ONE DC '1'
associates the name ONE with a memory word containing the value '1'. The programmer
can declare constants in different forms: decimal, binary, hexadecimal, etc. The
assembler converts them to the appropriate internal form.
Use of constants
Contrary to the name 'declare constant', the DC statement does not really implement
constants; it merely initializes memory words to given values. These values are not
protected by the assembler; they may be changed by moving a new value into the
memory word. For example, in Fig. 4.3 the value of ONE can be changed by executing
an instruction MOVEM BREG, ONE.
An assembly program can use constants in the sense implemented in an HLL in
two ways: as immediate operands, and as literals. Immediate operands can be used
in an assembly statement only if the architecture of the target machine includes the
necessary features. In such a machine, the assembly statement
ADD AREG, 5
is translated into an instruction with two operands: AREG and the value '5' as an
immediate operand. Note that our simple assembly language does not support this
feature, whereas the assembly language of Intel 8086 supports it (see Section 4.5).
ADD AREG, ='5'     =>     ADD AREG, FIVE
                          FIVE DC '5'
      (a)                       (b)
Assembler directives
Assembler directives instruct the assembler to perform certain actions during the
assembly of a program. Some assembler directives are described in the following.
START <constant>
This directive indicates that the first word of the target program generated by the
assembler should be placed in the memory word with address <constant>.
END [<operand spec>]
This directive indicates the end of the source program. The optional <operand
spec> indicates the address of the instruction where the execution of the program
should begin. (By default, execution begins with the first instruction of the assembled
program.)
the machine and assembly language statements of Fig. 4.3 once again. The programs
presently compute N!. Figure 4.5 shows a changed program to compute ½ × N!, where
rectangular boxes are used to highlight changes in the program. One statement has
been inserted before the PRINT statement to implement division by 2. In the machine
language program, this leads to changes in addresses of constants and reserved
memory areas. Because of this, addresses used in most instructions of the program
had to change. Such changes are not needed in the assembly program since operand
specifications are symbolic in nature.
        START  101
        READ   N             101) + 09 0 [114]
        MOVER  BREG, ONE     102) + 04 2 [116]
        MOVEM  BREG, TERM    103) + 05 2 [117]
AGAIN   MULT   BREG, TERM    104) + 03 2 [117]
        MOVER  CREG, TERM    105) + 04 3 [117]
        ADD    CREG, ONE     106) + 01 3 [116]
        MOVEM  CREG, TERM    107) + 05 3 [117]
        COMP   CREG, N       108) + 06 3 [114]
        BC     LE, AGAIN     109) + 07 2 104
       [DIV    BREG, TWO]    110) + 08 2 118
        MOVEM  BREG, RESULT  111) + 05 2 115
        PRINT  RESULT        112) + 10 0 115
        STOP                 113) + 00 0 000
N       DS     1             114)
RESULT  DS     1             115)
ONE     DC     '1'           116) + 00 0 001
TERM    DS     1             117)
TWO     DC     '2'           118) + 00 0 002
        END
Synthesis phase
Consider the assembly statement
MOVER BREG, ONE
in Fig. 4.3. We must have the following information to synthesize the machine
instruction corresponding to this statement:
1. Address of the memory word with which name ONE is associated.
2. Machine operation code corresponding to the mnemonic MOVER.
The first item of information depends on the source program. Hence it must
be made available by the analysis phase. The second item of information does not
depend on the source program; it merely depends on the assembly language. Hence
the synthesis phase can determine this information for itself.
Based on the above discussion, we consider the use of two data structures during
the synthesis phase:
1. Symbol table
2. Mnemonics table.
Each entry of the symbol table has two primary fields: name and address. The table
is built by the analysis phase. An entry in the mnemonics table has two primary
fields: mnemonic and opcode. The synthesis phase uses these tables to obtain the
machine address with which a name is associated, and the machine opcode
corresponding to a mnemonic, respectively. Hence the tables have to be searched with
the symbol name and the mnemonic as keys.
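The two lookups performed by the synthesis phase can be sketched as follows. This is only an illustration in Python: the dictionaries stand in for the symbol table and mnemonics table, the symbol addresses are assumed to be those assigned to the program of Fig. 4.3 (with ONE in the word after RESULT), and the function name `synthesize` is hypothetical.

```python
# Symbol table: name -> address (assumed contents for the program of Fig. 4.3).
SYMBOL_TABLE = {"AGAIN": 104, "N": 113, "RESULT": 114, "ONE": 115, "TERM": 116}
# Mnemonics table: mnemonic -> opcode (a fixed property of the assembly language).
MNEMONICS_TABLE = {"ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
                   "COMP": 6, "DIV": 8}
# Register codes 1-4 for AREG-DREG.
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}


def synthesize(statement: str) -> tuple:
    """Return (opcode, register code, operand address) for '<mnemonic> <reg>, <symbol>'."""
    mnemonic, operands = statement.split(None, 1)
    register, symbol = (part.strip() for part in operands.split(","))
    # Both tables are searched with the mnemonic and the symbol name as keys.
    return (MNEMONICS_TABLE[mnemonic], REGISTERS[register], SYMBOL_TABLE[symbol])


print(synthesize("MOVER BREG, ONE"))
```

A missing key raises `KeyError`, which corresponds to an unknown mnemonic or an undefined symbol.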
Analysis phase
The primary function performed by the analysis phase is the building of the symbol
table. For this purpose it must determine the addresses with which the symbolic
names used in a program are associated. It is possible to determine some addresses
directly, e.g. the address of the first instruction in the program; however others must
be inferred. Consider the assembly program of Fig. 4.3. To determine the address of
N, we must fix the addresses of all program elements preceding it. This function is
called memory allocation.
To implement memory allocation a data structure called location counter (LC) is
introduced. The location counter is always made to contain the address of the next
memory word in the target program. It is initialized to the constant specified in the
START statement. Whenever the analysis phase sees a label in an assembly statement,
it enters the label and the contents of LC in a new entry of the symbol table. It then
finds the number of memory words required by the assembly statement and updates
the LC contents. (Hence the word 'counter' in 'location counter'.) This ensures
that LC points to the next memory word in the target program even when machine
instructions have different lengths and DS/DC statements reserve different amounts
of memory. To update the contents of LC, the analysis phase needs to know lengths of
different instructions. This information simply depends on the assembly language,
hence the mnemonics table can be extended to include this information in a new field
called length. We refer to the processing involved in maintaining the location counter
as LC processing.
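LC processing as described above can be sketched in a few lines of Python. The representation of statements as (label, mnemonic, operand) tuples and the names `LENGTH` and `build_symbol_table` are assumptions made for this illustration; every instruction of the simple language is taken to occupy one word.

```python
# Length field of the (extended) mnemonics table: one word per instruction.
LENGTH = {m: 1 for m in ("STOP", "ADD", "SUB", "MULT", "MOVER", "MOVEM",
                         "COMP", "BC", "DIV", "READ", "PRINT")}


def build_symbol_table(statements):
    """Return {label: address} by maintaining a location counter (LC)."""
    symtab = {}
    lc = 0
    for label, mnemonic, operand in statements:
        if mnemonic == "START":
            lc = int(operand)          # LC is initialized from the START statement
            continue
        if mnemonic == "END":
            break
        if label:
            symtab[label] = lc         # enter the label with the current LC value
        if mnemonic == "DS":
            lc += int(operand)         # DS reserves <operand> words
        elif mnemonic == "DC":
            lc += 1                    # DC constructs one word
        else:
            lc += LENGTH[mnemonic]     # imperative statement
    return symtab


PROGRAM = [
    (None, "START", "101"), (None, "READ", "N"),
    (None, "MOVER", "BREG, ONE"), (None, "MOVEM", "BREG, TERM"),
    ("AGAIN", "MULT", "BREG, TERM"), (None, "MOVER", "CREG, TERM"),
    (None, "ADD", "CREG, ONE"), (None, "MOVEM", "CREG, TERM"),
    (None, "COMP", "CREG, N"), (None, "BC", "LE, AGAIN"),
    (None, "MOVEM", "BREG, RESULT"), (None, "PRINT", "RESULT"),
    (None, "STOP", ""), ("N", "DS", "1"), ("RESULT", "DS", "1"),
    ("ONE", "DC", "'1'"), ("TERM", "DS", "1"), (None, "END", ""),
]
print(build_symbol_table(PROGRAM))
```

Running the sketch on the program of Fig. 4.3 assigns AGAIN the address 104 and N the address 113, in agreement with the symbol table entries shown for that program.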
Mnemonics table
mnemonic  opcode  length
ADD       01      1
SUB       02      1

Symbol table
symbol  address
AGAIN   104
N       113

Legend: solid arrows denote data access; dashed arrows denote control transfer.
Figure 4.6 illustrates the use of the data structures by the analysis and synthesis
phases. Note that the Mnemonics table is a fixed table which is merely accessed by
the analysis and synthesis phases, while the Symbol table is constructed during
analysis and used during synthesis. The tasks performed by the analysis and synthesis
phases are as follows:
Analysis phase
1. Isolate the label, mnemonic opcode and operand fields of a statement.
[Figure: overview of two pass assembly. Labels recovered: Source Program, Pass I,
Intermediate code, Pass II, Target Program, Data structures. Solid arrows show data
access; dashed arrows show control transfer.]
can be only partially synthesized since ONE is a forward reference. Hence the
instruction opcode and address of BREG will be assembled to reside in location 101.
The need for inserting the second operand's address at a later stage can be indicated
by adding an entry to the Table of Incomplete Instructions (TII). This entry is a pair
(<instruction address>, <symbol>), e.g. (101, ONE) in this case.
By the time the END statement is processed, the symbol table would contain the
addresses of all symbols defined in the source program and TII would contain
information describing all forward references. The assembler can now process each
entry in TII to complete the concerned instruction. For example, the entry (101, ONE)
would be processed by obtaining the address of ONE from the symbol table and inserting
it in the operand address field of the instruction with assembled address 101.
Alternatively, entries in TII can be processed in an incremental manner. Thus, when
the definition of some symbol symb is encountered, all forward references to symb can
be processed.
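The processing of the Table of Incomplete Instructions at the END statement can be sketched as below. The data layout (a dictionary of partially assembled instructions keyed by address) and the function name are hypothetical, and the address 115 assumed for ONE is for illustration only.

```python
def resolve_forward_references(code, tii, symtab):
    """Fill in operand addresses recorded in TII; return the entries left unresolved.

    code:   {instruction address: [opcode, register, operand address or None]}
    tii:    list of (instruction address, symbol) pairs
    symtab: {symbol: address}
    """
    unresolved = []
    for instruction_address, symbol in tii:
        if symbol in symtab:
            code[instruction_address][2] = symtab[symbol]   # insert operand address
        else:
            unresolved.append((instruction_address, symbol))
    return unresolved


# MOVER BREG, ONE assembled at 101 with its operand address still missing.
code = {101: [4, 2, None]}
tii = [(101, "ONE")]
symtab = {"ONE": 115}          # assumed address, known once ONE has been defined
leftover = resolve_forward_references(code, tii, symtab)
print(code[101], leftover)
```

Entries returned as unresolved correspond to symbols that were never defined in the source program. The incremental alternative mentioned above would perform the same insertion each time a symbol definition is encountered rather than once at the END statement.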
Pass I performs analysis of the source program and synthesis of the intermediate
representation while Pass II processes the intermediate representation to synthesize the
target program. The design details of assembler passes are discussed after
introducing advanced assembler directives and their influence on LC processing.
4.4.1 Advanced Assembler Directives
ORIGIN
The syntax of this directive is
ORIGIN <address spec>
where <address spec> is an <operand spec> or <constant>. This directive indicates
that LC should be set to the address given by <address spec>. The ORIGIN
statement is useful when the target program does not consist of consecutive memory
words. The ability to use an <operand spec> in the ORIGIN statement provides the
ability to perform LC processing in a relative rather than absolute manner.
Example 4.1 illustrates the differences between the two.
Example 4.1 Statement number 18 of Fig. 4.8(a), viz. ORIGIN LOOP+2, sets LC to the value
204, since the symbol LOOP is associated with the address 202. The next statement,
viz.
MULT CREG, B
is therefore given the address 204. The statement ORIGIN LAST+1 sets LC to address
217. Note that an equivalent effect could have been achieved by using the statements
ORIGIN 204 and ORIGIN 217 at these two places in the program; however the
absolute addresses used in these statements would need to be changed if the address
specification in the START statement is changed.
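The difference between relative and absolute LC processing can be made concrete with a small evaluator for an <address spec>. This is a sketch under assumptions: the accepted forms (a constant, a symbol, or a symbol plus or minus a constant) and the function name `evaluate_address_spec` are chosen for illustration and are not a definition of <operand spec>.

```python
import re


def evaluate_address_spec(spec, symtab):
    """Evaluate '<constant>', '<symbol>' or '<symbol>+/-<constant>' against symtab."""
    match = re.fullmatch(r"\s*([A-Za-z]\w*|\d+)\s*(?:([+-])\s*(\d+))?\s*", spec)
    if match is None:
        raise ValueError(f"unsupported address spec: {spec!r}")
    base, sign, offset = match.groups()
    value = int(base) if base.isdigit() else symtab[base]
    if sign:
        value += int(offset) if sign == "+" else -int(offset)
    return value


symtab = {"LOOP": 202, "LAST": 216}
print(evaluate_address_spec("LOOP+2", symtab))   # new LC after ORIGIN LOOP+2
print(evaluate_address_spec("LAST+1", symtab))   # new LC after ORIGIN LAST+1

# If the START address moved by 100, the symbolic forms follow automatically.
moved = {name: address + 100 for name, address in symtab.items()}
print(evaluate_address_spec("LOOP+2", moved))
```

With the symbolic form, a change in the START address changes the symbol table and the ORIGIN statements need no editing; a statement written with an absolute constant would have to be rewritten by hand.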
EQU
<symbol> EQU <address spec>
Example 4.2 Statement 22 of Fig. 4.8(a), viz. BACK EQU LOOP, introduces the symbol BACK
to represent the operand LOOP. This is how the 16th statement, viz.
BC LT, BACK
1         START  200
2         MOVER  AREG, ='5'    200) +04 1 211
3         MOVEM  AREG, A       201) +05 1 217
4  LOOP   MOVER  AREG, A       202) +04 1 217
5         MOVER  CREG, B       203) +04 3 218
6         ADD    CREG, ='1'    204) +01 3 212
7
LTORG
Fig. 4.4 has shown how literals can be handled in two steps. First, the literal is treated
as if it is a <value> in a DC statement, i.e. a memory word containing the value of
the literal is formed. Second, this memory word is used as the operand in place of
the literal. Where should the assembler place the word corresponding to the literal?
Obviously, it should be placed such that control never reaches it during the execution
of a program. The LTORG statement permits a programmer to specify where literals
should be placed. By default, the assembler places the literals after the END statement.
At every LTORG statement, as also at the END statement, the assembler allocates
memory to the literals of a literal pool. The pool contains all literals used in the
program since the start of the program or since the last LTORG statement.
Example 4.3 In Fig. 4.8, the literals ='5' and ='1' are added to the literal pool in statements
2 and 6, respectively. The first LTORG statement (statement number 13) allocates the
addresses 211 and 212 to the values '5' and '1'. A new literal pool is now started. The
value '1' is put into this pool in statement 15. This value is allocated the address 219
while processing the END statement. The literal ='1' used in statement 15 therefore
refers to location 219 of the second pool of literals rather than location 212 of the first
pool. Thus, all references to literals are forward references by definition.
The LTORG directive has very little relevance for the simple assembly language
we have assumed so far. The need to allocate literals at intermediate points in the
program rather than at the end is critically felt in a computer using a base displacement
mode of addressing, e.g. computers of the IBM 360/370 family.
EXERCISE 4.4.1
1. An assembly program contains the statement
I EQU Y+26
Indicate how the EQU statement can be processed if
(a) Y is a back reference,
(b) Y is a forward reference.
2. Can the operand expression in an ORIGIN statement contain forward references? If
so, outline how the statement can be processed in a two pass assembly scheme.
4.4.2 Pass I of the Assembler
Pass I uses the following data structures:
OPTAB    A table of mnemonic opcodes and related information
SYMTAB   Symbol table
LITTAB   A table of literals used in the program
Figure 4.9 illustrates sample contents of these tables while processing the program
of Fig. 4.8. OPTAB contains the fields mnemonic opcode, class and mnemonic
info. The class field indicates whether the opcode corresponds to an imperative
statement (IS), a declaration statement (DL) or an assembler directive (AD). If an
imperative, the mnemonic info field contains the pair (machine opcode, instruction length),
else it contains the id of a routine to handle the declaration or directive statement. A
SYMTAB entry contains the fields address and length. A LITTAB entry contains the
fields literal and address.
Processing of an assembly statement begins with the processing of its label field.
If it contains a symbol, the symbol and the value in LC are copied into a new entry
of SYMTAB. Thereafter, the functioning of Pass I centers around the interpretation
of the OPTAB entry for the mnemonic. The class field of the entry is examined to
determine whether the mnemonic belongs to the class of imperative, declaration or
assembler directive statements. In the case of an imperative statement, the length of
the machine instruction is simply added to the LC. The length is also entered in the
SYMTAB entry of the symbol (if any) defined in the statement. This completes the
processing of the statement.
For a declaration or assembler directive statement, the routine mentioned in the
mnemonic info field is called to perform appropriate processing of the statement. For
example, in the case of a DS statement, routine R#7 would be called. This routine
OPTAB
mnemonic opcode  class  mnemonic info
MOVER            IS     (04, 1)
DS               DL     R#7
START            AD     R#11
..

SYMTAB
symbol  address  length
LOOP    202      1
NEXT    214      1
LAST    216      1
A       217      1
BACK    202      1
B       218      1

LITTAB
     literal  address
1    ='5'
2    ='1'
3    ='1'

POOLTAB
literal no
1
3

Fig. 4.9 Data structures of assembler Pass I
processes the operand field of the statement to determine the amount of memory
required by this statement, and appropriately updates the LC and the SYMTAB entry
of the symbol (if any) defined in the statement. Similarly, for an assembler directive
the called routine would perform appropriate processing, possibly affecting the value
in LC.
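The way Pass I is driven by the class field of OPTAB can be sketched as follows. The functions `process_ds` and `process_start` merely stand in for the routines the text calls R#7 and R#11; their signatures, the `state` dictionary and the three-entry OPTAB are assumptions made to keep the example small.

```python
def process_ds(operand, state):
    """Stand-in for routine R#7: reserve <operand> words, return the length."""
    size = int(operand)
    state["lc"] += size
    return size


def process_start(operand, state):
    """Stand-in for routine R#11: set LC to the constant in the START statement."""
    state["lc"] = int(operand)
    return 0


# mnemonic -> (class, mnemonic info)
OPTAB = {
    "MOVER": ("IS", (4, 1)),          # (machine opcode, instruction length)
    "DS": ("DL", process_ds),
    "START": ("AD", process_start),
}


def process_statement(label, mnemonic, operand, state):
    """Process one statement in the manner of Pass I, updating LC and SYMTAB."""
    address = state["lc"]                 # value of LC when the label is seen
    statement_class, info = OPTAB[mnemonic]
    if statement_class == "IS":
        _opcode, length = info
        state["lc"] += length             # imperative: add the instruction length
    else:
        length = info(operand, state)     # declaration or directive: call the routine
    if label:
        state["symtab"][label] = (address, length)


state = {"lc": 0, "symtab": {}}
process_statement(None, "START", "200", state)
process_statement("LOOP", "MOVER", "AREG, A", state)
process_statement("A", "DS", "1", state)
print(state)
```

Storing a callable in the mnemonic info field is one way to realize "the id of a routine"; an index into a table of routines would serve equally well.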
The use of LITTAB needs some explanation. The first pass uses LITTAB to
collect all literals used in a program. Awareness of different literal pools is maintained
using the auxiliary table POOLTAB. This table contains the literal number of the
starting literal of each literal pool. At any stage, the current literal pool is the last
pool in LITTAB. On encountering an LTORG statement (or the END statement),
literals in the current pool are allocated addresses starting with the current value in LC
and LC is appropriately incremented. Thus, the literals of the program in Fig. 4.8(a)
will be allocated memory in two steps. At the LTORG statement, the first two literals
will be allocated the addresses 211 and 212. At the END statement, the third literal
will be allocated address 219.
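The interplay of LITTAB and POOLTAB can be sketched with a small Python class. The class and method names are hypothetical, and the sketch uses 0-based list indexes where the text uses literal numbers starting at 1.

```python
class LiteralTables:
    """Minimal LITTAB/POOLTAB pair for Pass I (illustrative only)."""

    def __init__(self):
        self.littab = []      # entries are [literal, address]; address is None until allocated
        self.pooltab = [0]    # index in littab of the first literal of each pool

    def add_literal(self, literal):
        """Enter a literal in the current pool and return its index in LITTAB."""
        self.littab.append([literal, None])
        return len(self.littab) - 1

    def allocate_pool(self, lc):
        """At LTORG or END: allocate addresses to the current pool, return the new LC."""
        for entry in self.littab[self.pooltab[-1]:]:
            entry[1] = lc
            lc += 1
        self.pooltab.append(len(self.littab))   # the next pool starts here
        return lc


tables = LiteralTables()
tables.add_literal("='5'")             # statement 2
tables.add_literal("='1'")             # statement 6
lc = tables.allocate_pool(211)         # LTORG, with LC = 211
tables.add_literal("='1'")             # statement 15, second pool
lc_at_end = tables.allocate_pool(219)  # END, with LC = 219
print(tables.littab, tables.pooltab)
```

In this sketch the last POOLTAB entry after END marks the start of a pool that stays empty. A fuller version might also avoid duplicate entries for a literal that is used twice within the same pool.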
We now present the algorithm for the first pass of the assembler. Intermediate
code forms for use in a two pass assembler are discussed in the next section.
Variant forms of intermediate codes, specifically the operand and address fields,
arise in practice due to the tradeoff between processing efficiency and memory
economy. These variants are discussed in separate sections dealing with the representation
of imperative statements, and declaration statements and directives, respectively. The
information in the mnemonic field is assumed to have the same representation in all
the variants.
Mnemonic field
The mnemonic field contains a pair of the form
(statement class, code)
where statement class can be one of IS, DL and AD standing for imperative statement,
declaration statement and assembler directive, respectively. For an imperative
statement, code is the instruction opcode in the machine language. For declarations
and assembler directives, code is an ordinal number within the class. Thus, (AD, 01)
stands for assembler directive number 1 which is the directive START. Figure 4.11
shows the codes for various declaration statements and assembler directives.
Variant I
The first operand is represented by a single digit number which is a code for a
register (1-4 for AREG-DREG) or the condition code itself (1-6 for LT-ANY). The second
operand, which is a memory operand, is represented by a pair of the form
(operand class, code)
where operand class is one of C, S and L standing for constant, symbol and literal,
respectively (see Fig. 4.12). For a constant, the code field contains the internal
representation of the constant itself. For example, the operand descriptor for the statement
START 200 is (C, 200). For a symbol or literal, the code field contains the ordinal
number of the operand's entry in SYMTAB or LITTAB. Thus entries for a symbol
XYZ and a literal ='25' would be of the form (S, 17) and (L, 35) respectively.
        START  200         (AD,01)  (C,200)
        READ   A           (IS,09)  (S,01)
LOOP    MOVER  AREG, A     (IS,04)  (1)(S,01)
Note that this method of representing symbolic operands gives rise to one
peculiarity. We have so far assumed that an entry is made in SYMTAB only when a
symbol occurs in the label field of an assembly statement, e.g. an entry (A, 345, 1) if
symbol A is allocated one word at address 345. However, while processing a forward
reference
MOVER AREG, A
it is necessary to enter A in SYMTAB, say in entry number n, so that it can be
represented by (S, n) in IC. At this point, the address and length fields of A's entry cannot
be filled in. This implies that two kinds of entries may exist in SYMTAB at any
time: for defined symbols and for forward references. This fact should be noted for
use during error detection (see Section 4.4.7).
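Generation of variant I intermediate code, including the SYMTAB entry made for a forward reference, can be sketched as below. The list-of-dictionaries layout of SYMTAB and the function names are assumptions for this illustration; only imperative statements with a symbolic memory operand are handled.

```python
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}
OPCODES = {"READ": 9, "MOVER": 4}      # only the mnemonics needed for the example


def symbol_index(name, symtab):
    """Return the 1-based SYMTAB entry number of name, entering it if absent."""
    for number, entry in enumerate(symtab, start=1):
        if entry["name"] == name:
            return number
    # Forward reference: address and length cannot be filled in yet.
    symtab.append({"name": name, "address": None, "length": None})
    return len(symtab)


def variant1_ic(mnemonic, operands, symtab):
    """Build the variant I intermediate code for an imperative statement."""
    fields = [f"(IS,{OPCODES[mnemonic]:02d})"]
    *registers, memory_operand = operands
    fields += [f"({REGISTERS[register]})" for register in registers]
    fields.append(f"(S,{symbol_index(memory_operand, symtab):02d})")
    return " ".join(fields)


symtab = []
print(variant1_ic("READ", ["A"], symtab))
print(variant1_ic("MOVER", ["AREG", "A"], symtab))
print(symtab)
```

After both statements are processed, SYMTAB holds a single entry for A whose address field is still empty, which is exactly the second kind of entry described above.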
Variant II
This variant differs from variant I of the intermediate code in that the operand fields of
the source statements are selectively replaced by their processed forms (see Fig. 4.13).
For declarative statements and assembler directives, processing of the operand fields
is essential to support LC processing. Hence these fields contain the processed forms.
For imperative statements, the operand field is processed only to identify literal
references. Literals are entered in LITTAB, and are represented as (L, m) in IC. Symbolic
references in the source statement are not processed at all during Pass I.
        START  200         (AD,01)  (C,200)
        READ   A           (IS,09)  A
LOOP    MOVER  AREG, A     (IS,04)  AREG, A
Comparison of the variants
Variant I of the intermediate code appears to require extra work in Pass I since
operand fields are completely processed. However, this processing considerably
simplifies the tasks of Pass II; a look at the IC of Fig. 4.12 confirms this. The functions
of Pass II are quite trivial. To process the operand field of a declaration statement we
only need to refer to the appropriate table and obtain the operand address. Most
declarations do not require any processing, e.g. DC, DS (see Section 4.4.5), and START
statements, while some, e.g. LTORG, require marginal processing. The IC is quite
compact: it can be as compact as the target code itself if each operand reference
like (S, n) can be represented in the same number of bits as an operand address in a
machine instruction.
Variant II reduces the work of Pass I by transferring the burden of operand
processing from Pass I to Pass II of the assembler. The IC is less compact since the
memory operand of a typical imperative statement is in the source form itself. On
the other hand, by making Pass II perform more work, the functions and memory
requirements of the two passes get better balanced. Figure 4.14 illustrates the
advantages of this aspect. Part (a) of Fig. 4.14 shows memory utilization by an assembler
using variant I of IC. Some data structures, viz. symbol table, are passed in the
memory while IC is presumably written in a file. Since Pass I performs much more
processing than Pass II, its code occupies more memory than the code of Pass II. Part
(b) of Fig. 4.14 shows memory utilization when variant II of IC is used. The code
sizes of the two passes are now comparable, hence the overall memory requirement
of the assembler is lower.
[Figure 4.14: Memory utilization by the assembler, (a) using variant I and (b) using
variant II of IC. Each panel shows the code of Pass I and Pass II, the data structures
and the work area.]
DC statement
A DC statement must be represented in IC. The mnemonic field contains the pair
(DL,01). The operand field may contain the value of the constant in the source form
or in the internal machine representation. No processing advantage exists in either
case since conversion of the constant into the machine representation is required
anyway. If a DC statement defines many constants, e.g.
DC '5, 3, -7'
a series of (DL,01) units can be put in the IC.
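One possible reading of "a series of (DL,01) units" is sketched below. The tuple layout (statement class, code, value) and the function name `dc_ic_units` are assumptions; the operand is taken to hold decimal integers separated by commas, as in the example above.

```python
def dc_ic_units(operand):
    """Split the operand of a DC statement such as "'5, 3, -7'" into IC units."""
    values = [int(value) for value in operand.strip().strip("'").split(",")]
    # One (DL,01) unit per constant, each carrying the converted value.
    return [("DL", 1, value) for value in values]


print(dc_ic_units("'5, 3, -7'"))
```

Here the constants are converted to the internal form in Pass I; keeping them in source form and converting in Pass II would be equally valid, since the conversion has to happen once either way.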
LTORG
Pass I checks for the presence of a literal reference in the operand field of every
statement. If one exists, it enters the literal in the current literal pool in LITTAB. When
an LTORG statement appears in the source program, it assigns memory addresses to
the literals in the current pool. These addresses are entered in the address field of
their LITTAB entries.
After performing this fundamental action, two alternatives exist concerning Pass
I processing. Pass I could simply construct an IC unit for the LTORG statement and
leave all subsequent processing to Pass II. Values of literals can be inserted in the
target program when this IC unit is processed in Pass II. This requires the use of
POOLTAB and LITTAB in a manner analogous to Pass I.
Example 4.4 Figure 4.9 shows the LITTAB and POOLTAB for the program of Fig. 4.8 at
the end of Pass I. Literals of the first pool are copied into the target program when the
IC unit for LTORG is encountered in Pass II. Literals of the second pool are copied into
the target program when the IC unit for END is processed.
Alternatively, Pass I could itself copy out the literals of the pool into the IC.
This avoids duplication of Pass I actions in Pass II. The IC for a literal can be made
EXERCISE 4.4
1. Given the following source program:
        START   100
A       DS      3
L1      MOVER   AREG, B
        ADD     AREG, C
        MOVEM   AREG, D
D       EQU     A+1
L2      PRINT   D
        ORIGIN  A-1
C       DC      '6'
        ORIGIN  L2+1
        STOP
B       DC      '19'
        END     L1
(a) Show the contents of the symbol table at the end of Pass I.
(b) Explain the significance of EQU and ORIGIN statements in the program and
explain how they are processed by the assembler.
(c) Show the intermediate code generated for the program.
4.4.6 Pass II of the Assembler
Algorithm 4.2 is the algorithm for assembler Pass II. Minor changes may be needed
to suit the IC being used. It has been assumed that the target code is to be assembled
in the area named code_area.
It has been assumed that the assembler produces a target program which is the
machine language of the target computer. This is rarely (if ever!) the case. The
assembler produces an object module in the format required by a linkage editor or loader.
The information contained in object modules is discussed in Chapter 7.
4.4.7 Listing and Error Reporting
Design of an error indication scheme involves some decisions which influence the
effectiveness of error reporting and the speed and memory requirements of the
assembler. The basic decision is whether to produce program listing and error reports
in Pass I or delay these actions until Pass II. Producing the listing in the first pass
has the advantage that the source program need not be preserved till Pass II. This
conserves memory and avoids some amount of duplicate processing.
This design decision also has very important implications from a programmer's
viewpoint. A listing produced in Pass I can report only certain errors in the most
relevant place, that is, against the source statement itself. Examples of such errors are
syntax errors like missing commas or parentheses and semantic errors like duplicate
definitions of symbols. Other errors like references to undefined variables can only
be reported at the end of the source program (see Fig. 4.16). The target code can be
printed later in Pass II; however it is difficult to locate the target code corresponding
to a source statement and vice versa. All these factors make debugging difficult.
003
009          MVER   BREG, A     207
    ** error ** Invalid opcode
010          ADD    BREG, B     208
014   A      DS     1           209
015
021   A      DC     '5'         227
    ** error ** Duplicate definition of symbol A
022
035          END
    ** error ** Undefined symbol B in statement 10
For effective error reporting, it is necessary to report all errors against the
erroneous statement itself. This can be achieved by delaying program listing and error
reporting till Pass II. Now the error reports as well as the target code can be printed
against each source statement (see Ex. 4.7).
Example 4.6 Figure 4.16 illustrates error reporting in Pass I. Detection of errors in
statements 9 and 21 is straightforward. In statement 9, the opcode is known to be invalid
because it does not match with any mnemonic in OPTAB. In statement 21, A is known
to be a duplicate definition because an entry for A already exists in the symbol table.
Use of the undefined symbol B is harder to detect because at the end of Pass I we
have no record that a forward reference to B exists in statement 10. This problem can
be resolved by making an entry for B in the symbol table with an indication that a
forward reference to B exists in statement 10. All such entries would be processed at
the end of Pass I to check if a definition of the symbol has been encountered. If not,
the symbol table entry contains sufficient information for error reporting. Note that
the target instructions cannot be printed because they have not yet been generated.
The memory address is printed against each statement in a weak attempt to provide a
cross-reference between source statements and target instructions.
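The three checks discussed in this example can be sketched in Python. The statement representation (number, label, mnemonic, symbols used) and the name `pass1_errors` are hypothetical; the symbol table entry records the first statement that references a symbol, which is the "indication that a forward reference exists" mentioned above.

```python
def pass1_errors(statements, optab):
    """Return (statement number reported against, message) pairs found in Pass I."""
    symtab = {}    # name -> {"defined": bool, "first_use": statement number or None}
    errors = []
    last = 0
    for number, label, mnemonic, used in statements:
        last = number
        if mnemonic not in optab:
            errors.append((number, "Invalid opcode"))
        if label:
            entry = symtab.setdefault(label, {"defined": False, "first_use": None})
            if entry["defined"]:
                errors.append((number, f"Duplicate definition of symbol {label}"))
            entry["defined"] = True
        for name in used:
            entry = symtab.setdefault(name, {"defined": False, "first_use": None})
            if entry["first_use"] is None:
                entry["first_use"] = number
    # At the end of Pass I, entries that were referenced but never defined are reported.
    for name, entry in symtab.items():
        if not entry["defined"]:
            errors.append((last, f"Undefined symbol {name} in statement {entry['first_use']}"))
    return errors


LISTING = [
    (9, None, "MVER", ["A"]),
    (10, None, "ADD", ["B"]),
    (14, "A", "DS", []),
    (21, "A", "DC", []),
    (35, None, "END", []),
]
for report in pass1_errors(LISTING, {"ADD", "DS", "DC", "END"}):
    print(report)
```

The undefined symbol is reported against the last statement because, in a Pass I listing, the earlier lines have already been printed by the time the omission can be known.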
Example 4.7 Figure 4.17 illustrates error reporting performed in Pass II. Indication of errors
in statements 9 and 21 is as easy as in Ex. 4.6. Indication of the error in statement 10
is equally easy: the symbol table is searched for an entry of B and an error is reported
when no matching entry is found. Note that target program instructions appear against
the source statements to which they belong.
003
009          MVER   BREG, A     207   + -- 2 209
    ** error ** Invalid opcode
010          ADD    BREG, B     208   + 01 2 ---
    ** error ** Undefined symbol B in operand field
014   A      DS     1           209
015
021   A      DC     '5'         227   + 00 0 005
    ** error ** Duplicate definition of symbol A
022
035          END