
System Software: An Introduction to Systems Programming
Third Edition

Leland L. Beck
San Diego State University

An imprint of Addison Wesley Longman, Inc.
Reading, Massachusetts; Menlo Park, California; New York; Harlow, England; Don Mills, Ontario; Sydney; Mexico City; Madrid; Amsterdam

Chapter 1 Background

This chapter contains a variety [...] the material presented [...] software and an overview of [...] discussion of the relationships between [...] architecture, which continues throughout [...] in later chapters. References are provided throughout the [...] readers who want further information.

1.1 INTRODUCTION

This text is an introduction to the design and implementation of system software. System software consists of a variety [...] operation of a computer. This software makes [...] without needing to know the details [...] were already using [...] wrote programs in [...] to create and [...]

In later courses, you probably wrote programs in assembler language. You may have used macro instructions in these programs to read and write data, or to perform other higher-level functions. You used an assembler, which probably included a macro processor, to translate these programs into machine language. The translated programs were prepared for execution by the loader or [...] However, you [...] concentrate on what you wanted to do, without worrying about how it was accomplished.

As you read this book, you will learn about several important types of [...]

Chapter 7 contains a survey of some other important types of system software [...] editors, and interactive debugging systems. Chapter 8 contains an introduction to software engineering concepts and techniques, focusing on the use of such methods in writing system software. This chapter can be read at any time after the introduction to assemblers [...] contain enough implementation [...] these types of software for [...] computer. Compilers and operating systems, on the other hand, are very large topics; each has, by itself, been the subject of many complete books and courses.
It is obviously impossible to provide full coverage of these subjects in a single chapter of any reasonable size. Instead, we provide an introduction to the most important concepts and issues related to compilers and operating systems, stressing the relationships between software design and machine architecture. Other subtopics are discussed as space permits, with references provided for readers who wish to explore these areas further. Our goal is to provide a good overview of these subjects that can also serve as background for students who will later take more advanced software courses. This same approach is also applied to the other topics surveyed in Chapter 7.

1.2 SYSTEM SOFTWARE AND MACHINE ARCHITECTURE

One characteristic in which most system software differs from application software [...] difficult to distinguish between those features of the software that are truly fundamental and those that depend solely on the idiosyncrasies [...] particular piece of software [...] (SIC). SIC is a hypothetical computer that has been carefully [...]

4. Major design options for structuring a particular piece of software, for example, single-pass versus multipass processing.
5. Examples of implementations on actual machines, stressing unusual software features and those that are related to machine characteristics.

[...] chapter contains brief descriptions of SIC and of the real machines that are used as examples. You are encouraged to read these descriptions now, and refer to them as necessary when studying the examples in each chapter.

1.3 THE SIMPLIFIED INSTRUCTIONAL COMPUTER (SIC)

In this section we describe the architecture of [...] Like many other products, SIC comes in two versions: the standard model and an XE version (XE stands for "extra equipment," or perhaps "extra expensive"). The two versions have been designed to be upward compatible; that is, a program for the standard SIC machine will also execute properly on an XE system.
(Such upward compatibility is often found on real computers that are closely related to one another.) Section 1.3.1 summarizes the standard [...] of their lowest numbered [...] computer memory.

Registers

There are five registers, all of which have special uses. Each register is 24 bits in length. The following table indicates the numbers, mnemonics, and uses of these registers. (The numbering scheme has been chosen for compatibility with the XE version of SIC.)

Mnemonic  Number  Special use
A         0       Accumulator; used for arithmetic operations
X         1       Index register; used for addressing
L         2       Linkage register; the Jump to Subroutine (JSUB) instruction stores the return address in this register
PC        8       Program counter; contains the address of the next instruction to be fetched for execution
SW        9       Status word; contains a variety of information, including a Condition Code (CC)

Data Formats

Integers are stored as 24-bit binary numbers; 2's complement representation is [...] values. Characters are stored using their 8-bit ASCII codes.

Instruction Formats

All machine instructions on the standard version of SIC have the following 24-bit format:

opcode (8 bits) | x (1 bit) | address (15 bits)

The flag bit x is used to indicate indexed-addressing mode.

Addressing Modes

There are two addressing modes available, indicated by the setting of the x bit in the instruction. The following table describes how the target address is calculated from the address given in the instruction. Parentheses are used to indicate the contents of a register or a memory location. For example, (X) represents the contents of register X.

Mode     Indication  Target address calculation
Direct   x = 0       TA = address
Indexed  x = 1       TA = address + (X)

Instruction Set

SIC provides a basic set of instructions that are sufficient for most simple [...] provided for subroutine linkage. JSUB jumps to the subroutine, placing the return address in register L; RSUB returns by jumping to the address contained in register L.

An appendix gives a complete list of all SIC and SIC/XE instructions, with their operation codes and a specification of the function performed by each.
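The instruction format and the two addressing modes described above can be illustrated with a short sketch. This is an illustration only, not part of the SIC definition: the function names `decode_sic` and `target_address` are hypothetical, and the sketch assumes the 24-bit layout of an 8-bit opcode, a 1-bit x flag, and a 15-bit address.

```python
def decode_sic(word):
    """Split a 24-bit standard SIC instruction into (opcode, x, address).

    Assumed layout: 8-bit opcode, 1-bit x flag, 15-bit address.
    """
    if not 0 <= word < (1 << 24):
        raise ValueError("a standard SIC instruction is 24 bits long")
    opcode = (word >> 16) & 0xFF   # high-order 8 bits
    x = (word >> 15) & 0x1         # indexed-addressing flag
    address = word & 0x7FFF        # low-order 15 bits
    return opcode, x, address


def target_address(word, x_register=0):
    """Compute TA: address when x = 0, address + (X) when x = 1."""
    _, x, address = decode_sic(word)
    if x == 1:
        return address + x_register
    return address


# Hypothetical instruction words used only to exercise the sketch.
print(decode_sic(0x141033))                       # direct addressing
print(hex(target_address(0x549039, x_register=5)))  # indexed addressing
```

Keeping the decoding separate from the target address calculation mirrors the text: the x bit is part of the instruction, while (X) is only known at execution time.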
Input and Output

On the standard version of SIC, input and output are performed by transferring [...] register A. Each device is [...] I/O instructions, each of which [...]

[...] SIC. However, the maximum memory [...] 1 megabyte (2^20 bytes). This increase leads [...] and addressing modes.

Registers

The following additional registers are provided by SIC/XE:

Mnemonic  Number  Special use
B         3       Base register; used for addressing
S         4       General working register; no special use
T         5       General working register; no special use
F         6       Floating-point accumulator (48 bits)

Data Formats

SIC/XE provides the same data formats as the standard version. In addition, [...] floating-point data type with the following format: [...]

The fraction is interpreted as a value between 0 and 1; that [...] immediately before the high-order bit. For normalized floating-[...] the high-order bit of the fraction must be 1. The exponent is [...]

Instruction Formats

[...] larger memory [...] SIC/XE [...] thus the instruction format used on the [...] addressing, or extend the address field to 20 bits. Both of these options are included in SIC/XE (Formats 3 and 4 in the following description). In addition, SIC/XE provides some instructions that do not reference memory at all. Formats 1 and 2 in the following description are used for such instructions.

Format 2 (2 bytes): [...]

Mode           Indication    Target address calculation
Base relative  b = 1, p = 0  TA = (B) + disp [...]

Figure 1.6 Sample input and output operations for SIC.

Output is performed in the same way. First the program uses TD to check whether the output device is ready to receive a byte of data. Then the byte to be written is loaded into the rightmost byte of register A, and the WD (Write Data) instruction is used to transmit it to the device.
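The output sequence just described (test the device, then write one byte from the rightmost byte of register A) can be mimicked in a few lines. This is a minimal simulation under stated assumptions: the `Device` class, its `busy_polls` parameter, and the `write_byte` helper are hypothetical stand-ins for a real device and for the TD and WD instructions.

```python
class Device:
    """Hypothetical output device that reports 'not ready' a few times."""

    def __init__(self, busy_polls=2):
        self._busy_polls = busy_polls
        self.received = bytearray()

    def test(self):
        """Stand-in for TD: return True once the device is ready."""
        if self._busy_polls > 0:
            self._busy_polls -= 1
            return False
        return True

    def write(self, byte):
        """Stand-in for WD: accept one byte of data."""
        self.received.append(byte)


def write_byte(device, register_a):
    """Wait until the device is ready, then send the rightmost byte of A."""
    while not device.test():       # loop back while the device is busy
        pass
    device.write(register_a & 0xFF)


dev = Device(busy_polls=3)
write_byte(dev, 0x414243)          # only the low-order byte is transmitted
print(bytes(dev.received))
```

The busy-wait loop corresponds to repeating the test until the device reports ready; a real program would do this once per byte transferred.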
1.4 TRADITIONAL (CISC) MACHINES

This section introduces the architectures of two of the machines that will be used as examples later in the text. Section 1.4.1 describes the VAX architecture, and Section 1.4.2 describes the architecture of the Intel x86 family of processors.

[...] is defined separately for each program. A part of the process space contains stacks that are available to the program. Special registers and machine instructions aid in the use of these stacks.

[...] for this purpose, [...] hardware instructions that implicitly use SP. R13 is the frame pointer FP. VAX procedure call con[...]

Data Formats

[...] per byte. In this format, the [...] separate byte preceding the first [...] numeric and leading separate [...]

Instruction Formats

[...] instruction consists of an operation code (1 or 2 bytes) followed by up to six [...] each operand specifier [...]

Addressing Modes

VAX provides a large number of addressing modes. With few exceptions, any [...] used with any instruction. The operand [...] mode), or its address may be specified by a [...] register, [...]

Instruction Set

One of the goals of the VAX designers was to produce an instruction set that is symmetric with respect to data type. Many instruction mnemonics are formed by combining the following elements:

1. a prefix that specifies the type of operation,
2. a suffix that specifies the data type of the operands,
3. a modifier (on some instructions) that gives the number of operands involved.

For example, the instruction ADDW2 is an add operation with two operands, [...] a multiply operation with [...] operand[...]

VAX provides all of the usual types of instructions for computation, data [...] comparison, branching, etc. In addition, there are a [...]

1.4.2 Pentium Pro Architecture

The Pentium Pro microprocessor, introduced near the end of 1995, is the latest in the Intel x86 family. Other recent microprocessors in [...] 80486 and Pentium.
Processors of the x86 family are presently used in a majority of personal computers, and there is a vast amount of software for these processors. It is expected that additional generations of the x86 family will be developed in the future.

The various x86 processors differ in implementation details and operating speed. However, they share the same basic architecture. Each succeeding generation has been designed to be compat[...]

Memory

[...] programmers usually view the x86 memory as a collection of segments. From this point of view, an address consists of two parts: a segment [...] to a byte within the segment. Segments can be of different sizes, and are often used for different purposes. For example, some segments may contain executable instructions, and other segments may be used to store data. Some data segments may be treated as stacks that can be used to save register contents, pass parameters to subroutines, and for other purposes.

It is not necessary for all of the segments used by a program to be in physical memory. In some cases, a segment can also [...] address specified by the programmer is automatically translated into a physical byte address by the x86 Memory Management Unit (MMU). Chapter 6 contains a brief discussion of methods that can be used in this kind of address translation.

Registers

There are eight general-purpose registers, which are named EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP. Each general-purpose register is 32 bits long (i.e., one doubleword). Registers EAX, EBX, ECX, and EDX are generally used for [...] individual words or bytes from [...]

The general-purpose register set is identical [...] members of the x86 family beginning with the 80386. This set is also [...]

Segment [...] segment and SS contains the address of the current stack segment. The other segment registers (DS, ES, FS, and GS) are used to indicate the addresses of data segments.

Floating-point computations are performed using a special floating-point unit (FPU).
This unit contains eight 80-bit data registers and several other con[...]

[...] little-endian byte ordering, because the "little end" of the value comes first in [...]

[...] encoded (in binary) in the low-order 4 bits of the byte; the high-order bits are normally zero. In the packed BCD format, each byte represents two decimal digits, with each digit encoded using 4 bits of the byte.

[...] data formats. The single-precision [...]

Operands stored in memory are often specified using variations of the general target address calculation

TA = (base register) + (index register) * (scale factor) + displacement

[...] register may be used as a base register; any general[...] be used as an index register. The scale factor [...] mode).

Instruction Set

The x86 architecture has a large and complex instruction set, containing more than 400 different machine instructions. An instruction may have zero, one, [...] manipulations and special [...] control the processor and memory management [...]

The x86 architecture also includes special-purpose instructions to perform operations frequently required in high-level programming languages, for example, entering and leaving procedures and [...] the bounds of an array.

Input and Output

[...] prefixes allow these [...] operation.

1.5 RISC MACHINES

This section introduces the architectures of three RISC machines that will be used as examples later in the text. Section 1.5.1 describes the architecture of the [...] advantages and disadvantages [...]

1.5.1 UltraSPARC Architecture

The UltraSPARC processor, announced by Sun Micro[...] latest member of the SPARC family. Other members of [...] variety of SPARC and SuperSPARC processors. The [...] implemented by a number [...] processor architecture [...] range of implementa[...]

Memory

Memory consists of 8-bit bytes; all addresses used are byte addresses. Two consecutive bytes form a halfword; four bytes form a word; eight bytes form a doubleword.
Halfwords are stored in memory beginning at byte addresses that are multiples of 2. Similarly, words begin at addresses that are multiples of 4, and doublewords at addresses that are multiples of 8.

UltraSPARC programs can be written using a virtual address space of 2^64 bytes. This address space is divided into pages; multiple page sizes are supported. Some of the pages used by a program may be in physical memory, while others may be stored on disk. When an instruction is executed, the hardware and the operating system make sure that the needed page is loaded into [...] be used in this kind of address translation.

Registers

[...] overlap, so some registers in the [...] example, registers r8 through r15 of a calling procedure are physically the same registers as r24 through r31 of the called procedure. This facilitates the passing of parameters.

The SPARC hardware manages the windows into the register file. If a set of concurrently running procedures needs more windows than are physically [...] "window overflow" interrupt occurs. The operating system must then save the contents of some registers in the file (and restore them later) [...]

Besides these register files, there are a program counter PC (which contains the address of the next instruction to be executed), condition code registers, and a number of other control registers.

Instruction Formats

There are three basic instruction formats in the SPARC architecture. All of [...] word [...] SPARC architecture is typical of RISC proces[...] of instruction fetching and de[...] with the complex variable-length instructions found on CISC systems such as VAX and x86.

Addressing Modes

As in most architectures, an operand value may be specified as part of the instruction it[...] mode), or it may be in a register (register direct [...]

Mode                       Target address calculation
[...]
Register indirect indexed  TA = (register-1) + (register-2)

[...] is used only for branch instructions. [...] few addressing modes of SPARC allow for more ef[...]

Instruction Set

[...] ordinary branch instruction [...] following the branch [...] common operating system functions.
Communication in a multi-processor [...] may allow a compiler to eliminate many branch instructions in order to optimize program execution.

Input and Output

In the SPARC architecture, communication with I/O devices is accomplished [...]

[...] architecture include the PowerPC 601, 603, and 604; others are expected in the near future.

As its name implies, PowerPC is a RISC architecture. As we shall see, [...] has [...]

Memory

[...] halfword; four bytes form a word; eight bytes form a doubleword [...] Many instructions may execute more efficiently if operands are aligned at a starting address that is a multiple of their length.

PowerPC programs can be written using a virtual address space of 2^64 bytes. This address space is divided into fixed-length segments, which are 256 megabytes long. Each segment is divided into pages, which are 4096 bytes [...] others may be stored on disk. When [...] and the operating system make s[...]

Registers

[...] be used to store and manipula[...] computations are performed using a special [...] unit [...] thirty-two 64-bit floating-point regi[...]

Data Formats

There are two different floating-point data formats. The single-precision [...] stores 23 significant bits of the floating-point value, [...]

Addressing Modes

Branch instructions use one of the following addressing modes:

Mode            Target address calculation
Absolute        TA = actual address
Relative        TA = current instruction address + displacement {25 bits, signed}
Link Register   TA = (LR)
Count Register  TA = (CR)

The absolute address or displacement is encoded as part of the instruction.

Instruction Set

The PowerPC architecture has approximately 210 machine instructions. Some [...]

[...] A reference to an address that is not in a direct-store segment represents a normal virtual memory access.
In this situation, I/O is performed using the regular virtual memory management hardware and software.

1.5.3 Cray T3E Architecture

The T3E series of supercomputers was announced by Cray Research, Inc., near [...] (MPP) system, de[...]

Memory

Each processing element in the T3E has its own local memory with a capacity of from 64 megabytes to 2 gigabytes. The local memory within each PE is part of a physically distributed, logically shared memory system. System memory is physically distributed because each PE contains local memory. System memory is logically shared because the microprocessor in one PE can [...] memory of another PE [...]

The memory within [...] processing element consists of 8-bit bytes; all addresses used are byte addresses. Two consecutive bytes form a word; four [...] Alpha instructions [...] starting address that [...] addresses [...]

Registers

[...]

Instruction Formats

There are five basic instruction formats in the Alpha architecture, some of [...] before, this fixed length is typical of RISC systems.) Th[...] instruction word always specify the opcode; some instruction formats also have an additional "function" field.

Addressing Modes

Mode                                  Target address calculation
PC-relative                           TA = (PC) + displacement {23 bits, signed}
Register indirect with displacement   [...]

Register indirect with displacement mode is used for load and store operations and for subroutine jumps. PC-relative mode is used for conditional and unconditional branches.

Instruction Set

[...] approach can be found in Smith and Weiss [...]

Input and Output

The T3E system performs I/O through multiple ports into one or more I/O channels, which can be configured in a number of ways. These channels are integrated into the network that interconnects the processing nodes. A system may be configured with up to one I/O channel for every eight PEs. All chan[...]

EXERCISES

Section 1.3

1. Write a sequence of instructions for SIC to set ALPHA equal to the product of BETA and GAMMA. Assume that ALPHA, BETA, and GAMMA are defined as in Fig. 1.[...]
[...] sequence of instructions for SIC to set ALPHA [...] BETA * GAMMA. Assume that [...]

[...] GAMMA, setting ALPHA [...] DELTA to the remainder. Use register-to-register instructions to make [...]

[...] a sequence of instructions for SIC/XE to divide BETA by [...], setting ALPHA to the value of the quotient, rounded to the nearest integer. Use register-to-register instructions to make the [...]

6. Write a sequence of instructions for SIC to clear a 20-byte string to all blanks.

7. Write a sequence of instructions for SIC/XE to clear a 20-byte string to all blanks. Use immediate addressing and register-to-register in[...]

[...] elements of the array to 0. Use immediate addressing and register-to-register instructions to make the process as efficient as possible.

[...] Suppose that RECORD contains a 100-byte record, as in Fig. 1.7(a). Write a subroutine for SIC that will write this record onto device 05.

[...] RECORD contains a 100-byte record, as in Fig. 1.7(b). [...] subroutine for SIC/XE that will write this record onto device [...] immediate addressing and register-to-register instructions to make the subroutine as efficient as possible.

[...] variable named LENGTH. Use immediate addressing and register-to-register instructions to make the subroutine as efficient as possible.

Chapter 2 Assemblers

In this chapter we discuss the design and implementation of assemblers. [...] an ADD operation. As we shall see, there are also many subtler ways that assemblers depend upon machine architecture. On the other hand, [...] the corresponding [...] considered machine-independent assembl[...] implementation. Once again, our purpose [...] options but rather [...] concepts and techniques that can be used in new and unfamiliar [...] which each might be useful.

[...] Section 2.5 we briefly consider some examples [...] machines. We do not attempt to discuss all aspects of these as[...] detail. Instead, we focus on the most interesting features that are introduced by hardware or software design decisions.

2.1 BASIC ASSEMBLER FUNCTIONS

[...] following assembler directives:

START  Specify name and starting address for the program.
END    [...] the source program and (optionally) specify [...]

[...] input device. Each subroutine must transfer the record one character at a time because the only I/O instructions available are RD and WD. The buffer is nec[...] rates for the two devices, such as a disk and a slow [...] operating system calls on a SIC/XE system to accom[...]) The end of each record is marked with a null charac[...] If a record is longer than the length of the buffer (4096 bytes), [...] by executing an [...] program was called by the operating system using a JSUB instruction; thus, the RSUB will return control to the operating system.

2.1.1 A Simple SIC Assembler

[...] in Fig. 2.1, with the generated object [...] translation of source program to object code requires us to accomplish the following functions (not necessarily in the order given):

1. Convert mnemonic operation codes to their machine language equivalents, e.g., translate STL to 14 (line 10).
2. Convert symbolic operands to their equivalent machine addresses, e.g., translate RETADR to 1033 (line 10).
3. Build the machine instructions in the proper format.
4. Convert the data constants specified in the source program into their internal machine representations, e.g., translate EOF to 454F46 (line 80).
5. Write the object program and the assembly listing.

Figure 2.2 Program from Fig. 2.1 with object code.

This instruction contains a forward reference, that [...] (RETADR) that is defined later in the program. If [...] program line by line, we will be unable to process this statement because we do not know the address that will be assigned to [...] program. The first pass [...] definitions and assign[...] The second pass performs most of the actual translation previously described.

[...] the source program; the assem[...] the object program, and [...] RESB and RESW, [...] reserve memory locations without generating dat[...] in our sample program are START, [...] address for the object program, and [...]

Text record:
Col. 1      T
Col. 2-7    Starting address for object code in this record (hexadecimal)
Col. 8-9    Length of object code in this record in bytes (hexadecimal)
Col. 10-69  Object code, represented in hexadecimal (2 columns per byte of object code)

End record:
Col. 1      E
Col. 2-7    Address of first executable instruction in object program (hexadecimal)

[...] the program during execution. (Chapter 3 contains a detailed discussion of the operation of the loader.)

We can now give a general description of the functions of the two passes of our simple assembler.

Figure 2.3 Object program corresponding to Fig. 2.2.

Pass 1 (define symbols):

1. Assign addresses to all statements in the program.
2. Save the values (addresses) assigned to all labels for use in Pass 2.
3. Perform some processing of assembler directives. (This includes processing that affects address assignment, such as determining the length of data areas defined by BYTE, RESW, etc.)

Pass 2 (assemble instructions and generate object program):

1. Assemble instructions (translating operation codes and looking up addresses).
2. Generate data values defined by BYTE, WORD, etc.
3. Perform processing of assembler directives not done during Pass 1.

In the next section we discuss these functions in more detail, describe the internal tables required by the assembler, and give an overall description of the logic flow of each pass.

2.1.2 Assembler Algorithm and Data Structures

[...] simple assembler uses two major internal data structures: the Operation [...] (SYMTAB). OPTAB is used to look [...]

Likewise, we must have the information from OPTAB in Pass 2 to tell us [...] table, that is, entries are not normally added to or del[...] cases it is possible to design a special hashing function [...] performance for the particular set of keys being stored. Most [...] a general-purpose hashing method is used. Further i[...] design and construction of hash tables may be found in [...] good data structures text, such as Lewis and Denenberg [...] assembled instructions.
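The division of work between Pass 1 and Pass 2 described above can be sketched in a few dozen lines. This is a deliberately reduced illustration, not the book's algorithm: the statement tuples, the function names `pass1` and `pass2`, and the six-entry `OPTAB` are assumptions made for the example, and indexed addressing, BYTE constants, and object-record generation are left out. Python dictionaries stand in for the hash-table organization of OPTAB and SYMTAB.

```python
# Small subset of standard SIC opcodes, enough for the example below.
OPTAB = {"LDA": 0x00, "LDL": 0x08, "STA": 0x0C,
         "STL": 0x14, "JSUB": 0x48, "RSUB": 0x4C}


def pass1(lines):
    """Assign addresses and build SYMTAB from (label, opcode, operand) tuples."""
    symtab, addressed, locctr = {}, [], 0
    for label, opcode, operand in lines:
        if opcode == "START":
            locctr = int(operand, 16)      # starting address is hexadecimal
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol {label}")
            symtab[label] = locctr
        addressed.append((locctr, opcode, operand))
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3                    # instructions and words are 3 bytes
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        else:
            raise ValueError(f"invalid operation code {opcode}")
    return symtab, addressed


def pass2(symtab, addressed):
    """Translate each addressed statement into 6 hex digits of object code."""
    obj = []
    for loc, opcode, operand in addressed:
        if opcode in OPTAB:
            address = symtab[operand] if operand else 0
            obj.append((loc, f"{OPTAB[opcode]:02X}{address:04X}"))
        elif opcode == "WORD":
            obj.append((loc, f"{int(operand) & 0xFFFFFF:06X}"))
        # RESW and RESB reserve space but generate no object code
    return obj


source = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),   # forward reference to RETADR
    (None, "LDL", "RETADR"),
    (None, "RSUB", None),
    ("RETADR", "RESW", "1"),
    (None, "END", "FIRST"),
]
symtab, addressed = pass1(source)
for loc, code in pass2(symtab, addressed):
    print(f"{loc:04X} {code}")
```

The forward reference to RETADR on the first instruction is harmless here because SYMTAB is complete before any translation starts, which is the reason for making two passes.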
[...] deletion is not an important [...] throughout the assembly, [...] copy of the source program can also be used to retain the results of certain [...] may be performed during Pass 1 (such as scanning the [...] symbols and addressing flags), so these need n[...] during Pass 2. Similarly, pointers into OPTAB and SYMTAB may be retained for each operation code and symbol used. This avoids the need to repeat many of the table-searching operations.

[...] the logic flow of the two passes of our assem[...] strongly urged to follow through the logic in [...] by hand to the program in Fig. 2.1 to produce [...]

Figure 2.4(a) Algorithm for Pass 1 of assembler.

Figure 2.4(b) Algorithm for Pass 2 of assembler.

2.2 MACHINE-DEPENDENT ASSEMBLER FEATURES

Figure 2.5 Example of a SIC/XE program.

[...] (see line 70). Immediate operands are denoted with the prefix # (lines 25, 55, 133). Instructions that refer to memory are normally assembled using either [...] relative or the base relative mode. The assembler direc[...] base relative addressing. (See [...]

[...] changed from [...] TIX MAXLEN [...] "return" operation on line 70). [...] for another instruction (as in [...] You may notice that some of the changes require the addition of other instructions to the program. For example, changing COMP to COMPR on line 150 forces us to add the CLEAR instruction on line 132. This still results in an improvement in execution speed. The CLEAR is executed only once for each record read, whereas the benefits of COMPR (as opposed to COMP) are realized for every [...]

[...] implications for operating systems, in Chapter 6.) To take full advantage of [...] specifying a fixed address at assembly time. [...] of program relocation and discusses its impli[...]

[...] instruction assembly, how[...] are really to be loaded at [...] assembled using either [...] assembler must in[...] address field, which is large enough to contain the full memory address.
In this case, there is no displacement to be calculated. For example, in the instruction

15    0006    CLOOP    +JSUB    RDREC    4B101036

the operand address is 1036. This full address is stored in the instruction, with bit e set to 1 to indicate extended instruction format.

Note that the programmer must specify the extended format by using the prefix + (as on line 15). If extended format is not specified, our assembler first [...]

Figure 2.6 Program from Fig. 2.5 with object code.

[...] setting of the addressing mode bit [...] Another example of program-counter relative assembly. [...] the same as for program-counter relative addressing. The main difference [...] that the assembler knows what the contents of the program counter will be at execution time. The base register, on the other hand, is under control of the programmer. Therefore, the programmer must tell the assembler what the base register will contain during execution of the program so that the assembler can compute displace[...] the assembler directive BASE. [...] preceding instruction (LDB #LENGTH) loads [...] The assembler [...] that the base reg[...] program execution. The assembler assumes [...] encounters another BASE statement. Later [...] the contents of the base register can no longer be relied upon [...]

[...] important to understand that BASE and NOBASE are assembler direc[...] produce no executable code. The programmer must provide i[...] proper value into the base register during execution. If this [...] properly, the target address calculation will not produce the correct [...]

160    104E    STCH    BUFFER,X    57C003

[...] of base relative assembly. According to the BASE state[...]

Notice the difference between the assembly of the instructions on lines 20 [...] On line 20, LDA LENGTH is assembled with program-counter rela[...]

[...] 010003 [...] example of this, with the operand stored in the instruction [...] and bit i set to 1 to indicate immediate addressing.
Another example can be found in the instruction [...] #4096 [...] In this case the oper[...] used.

12    0003    LDB    #LENGTH    69202D

[...] do not know exactly when jobs will be submitted, exactly how long they will run, etc. Because of this, it is desirable to be able to load a program into memory wherever there is room for it. In such a situation the actual starting address of the program is not known until load time.

The program we considered in Section 2.1 is an example of an absolute program (or absolute assembly). This program must be loaded at address 1000 (the address that was specified at assembly time) in order to execute properly. To see this, consider the instruction [...] from Fig. 2.2. In the object program (Fig. 2.3), this statement is translated as [...] is to be loaded from memory [...] other hand, there are parts of the program (such as the constant 3 generated from line 85) that should remain the same regardless of where the [...] JSUB instruction would need to be changed [...] address of RDREC. [...]

Figure 2.7 Examples of program relocation.

Note that no matter where the program is loaded, RDREC is always 1036 bytes past the starting address of the program. This means that we can solve the relocation problem in the following way:

1. [...] the object code [...] JSUB instr[...]

The command for the loader, of course, must also be a part of the object program. We can accomplish this with a Modification record having the following format:

Modification record:
Col. 1    M
Col. 2-7  Starting location of the address field to be modified, relative to the beginning of the program (hexadecimal)
Col. 8-9  Length of the address field to be modified, in half-bytes (hexadecimal)

[...] than bytes) because the address [...] record would be M00000705 [...] specifies that the beginning address [...] bytes away from the STL instruction; thus no instruction modification is [...]

Figure 2.8 Object program corresponding to Fig. 2.6.
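The displacement calculations and the Modification record format discussed in Section 2.2 can be sketched as follows. This is an illustration under stated assumptions rather than the assembler's actual logic: the function names `assemble_format3` and `modification_record` are hypothetical, only simple addressing (n = 1, i = 1) is handled, and the 12-bit displacement ranges (-2048 to 2047 for program-counter relative, 0 to 4095 for base relative) are the usual SIC/XE Format 3 limits. The addresses in the usage lines are chosen to be consistent with the object codes 57C003 and M00000705 that appear in the text.

```python
def assemble_format3(opcode, target, pc, base=None, indexed=False):
    """Return 6 hex digits for a Format 3 instruction with simple addressing.

    pc is the address of the NEXT instruction, as it will be at execution
    time; base is the assumed content of register B, or None if unknown.
    """
    x = 1 if indexed else 0
    disp = target - pc
    if -2048 <= disp <= 2047:              # try program-counter relative first
        b, p = 0, 1
    elif base is not None and 0 <= target - base <= 4095:
        disp = target - base               # fall back to base relative
        b, p = 1, 0
    else:
        raise ValueError("out of range; extended format (+) would be needed")
    first_byte = (opcode & 0xFC) | 0b11    # 6-bit opcode with n = 1, i = 1
    flags = (x << 3) | (b << 2) | (p << 1) # e = 0 for Format 3
    word = (first_byte << 16) | (flags << 12) | (disp & 0xFFF)
    return f"{word:06X}"


def modification_record(instruction_address, half_bytes=5):
    """Build a Modification record for a Format 4 instruction.

    The 20-bit address field starts one byte past the instruction's first
    byte and is 5 half-bytes long.
    """
    start = instruction_address + 1
    return f"M{start:06X}{half_bytes:02X}"


# Program-counter relative: target 0030, next instruction at 0003.
print(assemble_format3(0x14, 0x0030, pc=0x0003))
# Base relative with indexing: target 0036, PC 1051, (B) = 0033.
print(assemble_format3(0x54, 0x0036, pc=0x1051, base=0x0033, indexed=True))
# Format 4 instruction at relative address 0006.
print(modification_record(0x0006))
```

Trying program-counter relative first and base relative second follows the order described in the text; the Modification record is needed only for Format 4, because relative displacements do not change when the program is relocated.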
2.3 MACHINE-INDEPENDENT ASSEMBLER FEATURES

[...] we discuss some common assembler features that are not [...] of such capabilities is much more closely related [...] programmer convenience and software [...] environment than it is to [...]

[...] Section 2.3.1 we discuss the implemen[...] including the required data structures and processing logic. Section 2.3.2 [...]

2.3.1 Literals

[...] language notation, a literal is identified with the prefix =, [...] using the same nota[...] specifies a 3-byte operand whose value is the character string EOF. Likewise the statement [...] =X'05' [...]

Figure 2.10 Program from Fig. 2.9 with object code.

[...] the target address for the machine instruc[...] exactly the same as if the programmer had [...] would be placed int[...] begin at address 1073. This means [...] operand would be placed too [...]

The need for an a[...] desirable to keep [...] the same literal used [...] copy of the specified data value. For example, the literal =X'05' is used in our program on lines 215 and 230. However, only one data area with this value is generated. Both [...] must appear in the literal pool. The same problem arises if a literal refers to any other item whose value changes between one point in the program and another.

Now we are ready to describe how the assembler handles literal operands. [...] literal table LITTAB. For each literal used, [...] operand value and length, and the ad[...] placed in a literal pool. LITTAB is of[...] assigned, the location counter [...] updated to reflect the number of bytes occupied by each literal.

During Pass 2, the operand address for use in generating object code is obtained by searching LITTAB for each literal operand encountered.
The data [...] as if these values had been generated [...] literal value represents an address in the [...] location counter value), the assembler must also generate [...]

[...] understand how LITTAB is created and used by the assembler [...] apply the procedure we just described to the source [...] object code and literal pools generated should be [...]

[...] for improved readability in place of numeric values. For example, on line 133 of the program in Fig. 2.5 we used the statement [...] in the program, we can write line 133 as [...]

There is another common assembler directive that can be used to indirectly assign values to symbols. This directive is usually called ORG (for "origin"). Its form is [...]

[...]

This would allow us to write, for example, [...] VALUE [...] to fetch the VALUE field from the table entry indicated by the contents of register X. However, this method of definition simply defines the labels; it does not make the structure of the table [...] might be.

[...] using ORG in the following [...]

The first ORG resets the location counter to the value [...] (the beginning address of the table). [...]

[...] the value of the new symbol must have been defined previously in the program. Thus the sequence

    ALPHA   RESW   1
    BETA    EQU    ALPHA

would be allowed, whereas the sequence [...] symbol definition process. In the second [...] a value when it is encountered during [...] ALPHA does not yet have a value). [...] the assembler would not know [...] of handling such sequences in a more complex assembler structure.

2.3.3 Expressions

[...] use of expressions wherever such a single operand is permitted. Each such expression must, of course, be evaluated by the assembler to produce a single [...] generally allow arithmetic expressions formed according to [...] and /. Division is usually defined [...] term represents [...] Fig.
2.9 the statement 106 [...]

[...] assembler directive) may be either an absolute or relative term depending upon the expression used to define [...]

[...] terms may enter into a multiplication or division operation. A relative expression is one in which all of the relative terms except one can be paired as described above; the remaining unpaired relative term must [...] term may enter into a multiplication or division operation. Expressions [...] absolute or relative expressions should be flagged by the assembler [...]

Although the rules given above may seem arbitrary, they are actually quite [...] under these definitions include [...] remains meaningful when the program is relocated. A relative term or expression represents some value that may be written as (S + r), where S is the starting address of the program and r is the value of the term or expression relative to the starting address. Thus a relative [...]

[...] the program of Fig. 2.9. In the statement

    107   MAXLEN   EQU   BUFEND-BUFFER

both BUFEND and BUFFER are relative terms, each representing an address [...] However, the expression represents an absolute value: the difference between the two addresses, which is the length of the buffer area in [...]

[...] associated with the symbol that appears in the [...] + BUFFER, 100 - BUFFER, or 3 * BUFFER [...] and generate Modification records in the object [...]

[...] consider programs that consist of several parts that can [...] of each other. As we discuss in the later section, our rules for determining the type of an expression must be modified in such instances.

2.3.4 Program Blocks

In all of the examples we have seen so far the program being assembled was treated as a unit. The source programs logically contained subroutines, data areas, etc. However, they were handled by the assembler as one entity, resulting in a single block of object code. Within this object program the generated machine instructions and data appeared in the same order as they were written [...]

Figure 2.11 [...] program blocks [...]

In this case three blocks are used. The first (unnamed) program block contains the executable instructions of the program.
The second (named CDATA) contains all data areas that are a few words or less in length. The third (named CBLKS) contains all data areas that consist of larger blocks of memory. Some possible reasons for making such a division are discussed later in this section.

The assembler directive USE indicates which portions of the source program belong to the various blocks. [...] the beginning of the program [...] be part of the unnamed block; if no USE statements are included, the entire program belongs to this single block. The USE statement on line 92 signals the beginning of the block named CDATA. Source statements are associated with this block until the USE statement on [...] continuation of a previously begun block. Thus the statement on [...] resumes the default block, and the statement on line 183 resumes the [...] CDATA.

[...] each program block may actually contain several separate [...] program. The assembler will (logically) rearrange these [...] gather together the pieces of each block. These blocks will then be assigned addresses in the object program, with the blocks appearing in the same order in which they were first begun in the source program. The result is the same as if the programmer had physically rearranged the source statements to group together all the source lines belonging to each block.

[...] symbol table, the block name or number is stored along with the assigned relative [...] value of the location counter for each block indicates the length of that block. The assembler can then assign to each block a starting address in the object program (beginning with relative location 0).

For code generation during Pass 2, the assembler needs the address for [...] the object program (not the start of an individual [...] easily found from the information in SYMTAB. [...] location of the symbol, relative to the start of [...]

[...] (0 = default block [...] information that is stored in SYMTAB for each symbol. [...] symbol MAXLEN (line 107) is shown without a [...] that MAXLEN is an absolute symbol, whose value is not relative to the start of any program block.
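The block bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions rather than the assembler's own code: the function names are hypothetical, SYMTAB is modeled as a dictionary of (block number, value) pairs with None marking an absolute symbol, and the block lengths (0066, 000B, and 1000 hexadecimal) and symbol values are assumed sample figures for the three-block program.

```python
def assign_block_addresses(lengths):
    """Give each block a starting address, in the order the blocks were first begun.

    lengths: list of (block name, length) pairs taken from the final
    per-block location counter values at the end of Pass 1.
    """
    blocks = {}
    start = 0  # the first block begins at relative location 0
    for number, (name, length) in enumerate(lengths):
        blocks[number] = {"name": name, "start": start, "length": length}
        start += length
    return blocks


def symbol_address(symtab, blocks, symbol):
    """Address of a symbol relative to the start of the object program."""
    block, value = symtab[symbol]
    if block is None:  # absolute symbol: not relative to any block
        return value
    return blocks[block]["start"] + value


# Assumed sample data for a three-block program.
blocks = assign_block_addresses([("(default)", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)])
symtab = {
    "LENGTH": (1, 0x0003),     # relative location 0003 within block 1 (CDATA)
    "MAXLEN": (None, 0x1000),  # absolute symbol, stored without a block number
}

target = symbol_address(symtab, blocks, "LENGTH")
# A 3-byte instruction at relative location 0006 in the default block:
program_counter = blocks[0]["start"] + 0x0006 + 3
displacement = target - program_counter
print(f"target={target:04X} displacement={displacement:03X}")
```

Keeping the block number next to each symbol value means that no address in SYMTAB has to be rewritten once the block starting addresses are known; the sum is formed only when an operand address is needed in Pass 2.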
At the end of Pass 1 the assembler constructs a table that contains the starting addresses and lengths for all blocks. For our sample program, this table looks like

    Block name   Block number   Address   Length
    (default)    0              0000      0066
    CDATA        1              0066      000B
    CBLKS        2              0071      1000

Now consider the instruction

    20   0006 0   LDA   LENGTH   032060

Figure 2.12 Program from Fig. 2.11 with object code.

SYMTAB shows the value of the operand (the symbol LENGTH) as relative location 0003 within program block 1 (CDATA). The starting address for CDATA is 0066. Thus the desired target address for this instruction is 0003 + 0066 = 0069. The instruction is to be assembled using program-counter relative addressing. [...] location 0000, this address is simply 0009. Thus the required displacement is 0069 - 0009 = 60. The calculation of the other addresses during Pass 2 follows a similar pattern.

We can immediately see that the separation of the program into blocks has [...] in the program is also much more easily solved [...] statement in the CDATA block to be sure [...] any large data areas.

Of course the use of program blocks has not accomplished anything we [...] discussed, machine considerations suggested that the [...] suggested that the source program should be in a [...] of program blocks is one way of satisfying both of these requirements, with the assembler providing the required reorganization.

It is not necessary to physically rearrange the generated code in the object program to place the pieces of each program block together. The assembler can simply write the object code as it is generated during Pass 2 and insert the proper load address in each Text record. [...] reflect the starting address of the block [...] code within the block. This process is illustrated in Fig. 2.13. The first two Text records are generated from the source program lines 5 through 70. When the USE statement on line 92 is recognized, the assembler writes out the current Text record (even though there is still room left in it).
The assembler then prepares to begin a new Text record for the new program block. As it happens, the statements on lines 95 through 105 result in no generated [...] records are created. The next two Text records come from lines 125 through [...] in the generation of object code. The fifth Text record contains the single byte [...] resumes the default program block [...]

Figure 2.13 Object program corresponding to Fig. 2.11.

[...] how the assembler handles multiple program [...] pieces of each program block are gathered together [...]

[...] subroutines or other logical subdivisions of a program. The programmer can assemble, load, and manipulate each of these control sections separately. The
‘When cortzo sections frm logically related parts ofa programy itis neces- ‘operand involves an external reference ‘cusses in detal how the aca linking i performed, fe 30 beer eee, { sahegdagea®sggey i { sangue sess" dageheayy : 5 i i 5 i pearly i 8 i i a Figure 2.16 tstration of contol sections and progam linking Figure 21 Program rom Fig. 2.18 wih objec code, makes an external refezence to BUFFER, The instruction is extended format with an address of zoro. The x bit ddexed addressing, as specified by te instruction. The 2900028 MUGEN ORD OPD-BUPPER fx word to be generated is specified by an expression involving ber stores this value between the handling ofthe expression on line 190 and line 107, The symbols BUFEND and BUFFER are the values of external symbols. The fn the object program that will aus where they ae required. We need two new record types in the object program and a change ina previously defined record type. As before, the exact format ofthese records i arbitrary: however, the same information must be passed to the lor in some form. symbols that are used as , symbols named by EXTREE Section 23 Machine terete Assn ators cot D Col.2-7 Name of external symbol defined in this control section CoL8-13 Relative addrese of symbol within this control section (hexadecimal) Col, 14-73 Repeat information in Col. 2-13 for other external symbols Refer recor Colt R CoL2-7 Name of external symbol referved to in this control, section CoL8-73 Names of other reference symbols ‘The other information needed 1m linking is added to the Modification record type. The new forme follows. 
Modification record (revised):

Col. 1       M
Col. 2-7     Starting address of the field to be modified, relative to the beginning of the control section (hexadecimal)
Col. 8-9     Length of the field to be modified, in half-bytes (hexadecimal)
Col. 10      Modification flag (+ or -)
Col. 11-16   External symbol whose value is to be added to or subtracted from the indicated field

The first three items in this record are the same as previously discussed. The two new items specify the modification to be performed: adding or subtracting [...] symbol. The symbol used for modification may [...]

[...] through End) for each control section. The records for each control section are exactly the same as they would be if the sections were assembled separately.

[...] control section. For EXTREF symbols, [...] These symbols are simply named in the Refer recor
