4 | Requisites for Improved Architectures

G. J. Myers

Motivated by the problems in conventional architectures, a set of solutions can be developed to shrink the semantic gaps discussed in Chapter 2. These solutions are not as simplistic as increasing the "power" of a machine's instruction set (e.g., by adding a DO-loop control instruction to a conventional architecture); rather, they involve radical changes to the storage concept of an architecture and thus represent departures from the classical von Neumann model. The concepts also involve increases in the power (function) of the machine instruction set, but this is a secondary fallout effect from the changes to the concept of storage.

In this chapter the concepts will be discussed in a general way, without dwelling much on specific representations or their implementation. However, as the SWARD architecture (discussed in detail in Part V) contains most of the concepts discussed herein, it is used in this chapter as an example of many of the concepts. The reader should refer to the case study architectures in the subsequent parts of the book for further illustrations of these concepts.

SELF-DEFINING DATA

A significant step in closing the language/machine semantic gap, and a large departure from the classical von Neumann model, is the idea of making all information within storage self-identifying (also known as the concept of tagged or typed storage). In the von Neumann architecture the characteristics of an item in storage are subjective, existing only in the eyes of the beholder (a program). The machine derives its data-type information solely from the instruction stream. That is, it is the instruction that defines the attributes of the operands (e.g., the S/370 has distinct instructions for adding 32-bit binary integers, 16-bit binary integers, 32-bit floating-point numbers, 64-bit floating-point numbers, 128-bit floating-point numbers,
1-digit decimal numbers, 3-digit decimal numbers, and so on). The alternative is to add a set of bits to each cell in storage, which describe the attributes of the cell, as shown in Figure 4.1. (The term cell is used instead of word because word often denotes a fixed-size unit of storage, a restriction that we do not necessarily wish to imply.)

At the minimum, the self-identification field (or tag) identifies the data type of the cell (i.e., bits in the tag would denote whether the value field represents a binary integer, a decimal integer, a floating-point number, a character string, an address, etc.). One result is that the machine instructions can be defined generically. Rather than having a large repertoire of ADD instructions, the machine need have only one; the type of addition to be performed is deduced by the machine by examining the tags of the operands of each instruction.

(From Advances in Computer Architecture, G. J. Myers. Copyright © John Wiley & Sons, Inc. Reprinted by permission of John Wiley & Sons.)

The tag fields are transparent to the high-level-language programs. The tags are established by the compiler (or similar function) and are used by the machine to determine the semantics of each operation to be performed, for consistency checks (e.g., the architecture might be defined such that the machine would refuse to add an address to a floating-point number), and for automatic data conversion (e.g., the architecture might be defined such that if an ADD instruction refers to an integer operand and a floating-point operand, the addition is performed based on established data conversion rules).

In addition to identifying the type or representation of the cell (Feustel suggests 32 possible types [1]), the concept of tags can be exploited further by the architect. For instance, if the data can be of varying length, a field could be defined in the tag to designate the length of the operand.
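The generic-ADD idea above can be sketched in a few lines. This is an illustrative model only; the tag names and the conversion rule (mixed int/float widens to float) are assumptions, not any particular machine's definition.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    tag: str       # identifies the representation: 'int', 'float', 'addr', ...
    value: object  # the cell's value field

def generic_add(a, b):
    """One generic ADD: the operation performed is deduced from the tags."""
    numeric = {'int', 'float'}
    # Consistency check: refuse to add non-numeric operands.
    if a.tag not in numeric or b.tag not in numeric:
        raise TypeError(f"cannot add {a.tag} to {b.tag}")
    # Automatic conversion: a mixed int/float pair widens to float.
    if a.tag != b.tag:
        return Cell('float', float(a.value) + float(b.value))
    return Cell(a.tag, a.value + b.value)

print(generic_add(Cell('int', 2), Cell('float', 1.5)))  # Cell(tag='float', value=3.5)
try:
    generic_add(Cell('addr', 4096), Cell('float', 1.0))
except TypeError as e:
    print(e)  # cannot add addr to float
```

A single dispatch routine replaces the whole repertoire of type-specific ADDs, which is exactly the trade the text describes.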
Among other things, this would eliminate the repetition of the length information in each instruction that references each data word. For instance, the S/370 has 15 distinct ADD instructions. One of these, ADD DECIMAL, contains two 4-bit fields specifying the lengths of its two operands. In a sense, then, the S/370 has 270 distinct ADD instructions (14 plus 256 varieties of ADD DECIMAL). In a tagged architecture where each tag describes the type and length of the operand, there would be a need for only one ADD instruction.

Once the decision has been made to employ tagged storage and generic instructions, other possible extensions of the tag come to mind. One could add a 1-bit field describing whether the cell's value is currently defined or not, thus allowing the machine to detect attempts to use an undefined value. (The Intel 8087 numeric microprocessor [2] uses a 2-bit tag for this and other purposes.) One could also add trap bits to the tag. For instance, a bit might be defined such that whenever a program references this cell, a machine interrupt is generated, thus providing a base for sophisticated debugging tools.

[Figure 4.1: Self-identifying storage word.]

Figure 4.2 illustrates four of the cell types in the SWARD architecture of Part V. In this architecture, tags have different lengths and formats, depending on what they are representing, but the first four bits always define the type of cell. The dividing point between the end of the tag and the remainder of the cell (the content) is indicated by an arrow in the figure. In this architecture the tag of a cell (with one important exception, "dynamically typed" cells) is invariant during program execution; hence the architecture is oriented toward strongly typed languages (i.e., languages in which each variable has immutable attributes). Cells do not exist independently in storage; rather, they are contained within various types of objects. An object might consist of a single cell, a set of cells,
cells and machine instructions, or a number of other possibilities. Rather than use the tag to represent the undefined value, undefined is a unique valid value of all cell types.

To illustrate one cell type, the fixed-point cell represents a value in base 10 with an integer and a fractional part. The tag defines the cell as fixed point, the number of BCD digits in the cell, and the position of an imaginary decimal point. For instance, suppose that one defined a cell to hold a value of the form XXXX.XX. If the value 5.7 were assigned to the cell, its representation in storage would be E620000570 (in hexadecimal notation). If the cell had the undefined value, its appearance would be E62FXXXXXX (X = don't care).

[Figure 4.2: Representative cell types in the SWARD architecture.]

Advantages of Tagged Storage

Error Detection. The storing of attributes with data allows the machine to detect several classes of programming errors. For instance, the architecture can be defined such that errors are triggered if the operands of an instruction are not meaningful (e.g., an operand of a multiply instruction is a character string), incompatible (e.g., a program is attempting to store a floating-point number into an address), or if a source operand has an undefined value. Such an environment is called a type-safe environment, since it prohibits operations on improper data types.

Automatic Data Conversion. Tagged storage enables the machine to perform automatic data conversions if the operands of an instruction are compatible but have different lengths or representations. For instance, the architecture can be defined such that it is permissible to add an integer to a floating-point number, and the machine will perform a data conversion if it fetches an ADD instruction with these operands.
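The XXXX.XX fixed-point encoding shown above can be sketched as follows. The meaning of the individual tag nibbles (type E, digit count, fraction position) is a guess fitted to the E620000570 example, not SWARD's actual specification.

```python
def encode_fixed(value, digits=6, frac=2):
    """Encode a fixed-point cell as hex nibbles: tag, defined/undefined
    flag nibble, then the BCD digits. Nibble layout inferred from the example."""
    tag = f"E{digits}{frac}"              # E62 for a XXXX.XX cell
    if value is None:                     # undefined: a valid value of every cell type
        return tag + "F" + "X" * digits   # X = don't care
    scaled = round(value * 10 ** frac)    # 5.7 -> 570
    return tag + "0" + str(scaled).zfill(digits)

print(encode_fixed(5.7))    # E620000570
print(encode_fixed(None))   # E62FXXXXXX
```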
As another example, one could specify two fixed-point values with decimal points in different positions as the operands of an instruction, and the machine would perform any necessary logical alignment of the decimal point during the processing of the instruction.

Execution Efficiency. Tagged storage can also benefit the execution speed of a system. This stems in part from the two advantages above. In an untagged machine data conversions, if needed, must be performed by code generated by the compiler. Automatic data conversions in a tagged machine should always be faster than those done by compiler-generated code, if for no other reason than the saving accrued from the need in the untagged machine to fetch and decode the conversion instructions.

Tagged storage can also benefit execution speed by providing the machine designer with algorithmic benefits. For instance, consider the function of incrementing a 15-digit BCD number by 1. Most machines that provide a variable-length BCD representation do the arithmetic by small increments (e.g., a byte at a time). On paper, one would stop the operation as soon as the carry was zero (e.g., add 1 to the low-order digit; if no carry, stop; else add the carry into the next digit, and so on). The S/370 architecture, an untagged one, requires that the result of a decimal operation be a valid BCD representation. Since the machine cannot guarantee that a decimal operand in storage contains BCD digits (e.g., the program could have stored an arbitrary byte value anywhere in the storage area), the operation must always be performed throughout the entire operand (e.g., 15 digits). This would not have to be the case in a tagged machine, since the architecture could be defined such that it is impossible for a decimal cell to contain anything but BCD digits. Hence the protection of data representations provided by tags might be of advantage to the designer of the processor.
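The early-exit increment described above can be sketched as follows; the digit-list representation is illustrative, and the point is only that the loop may stop at the first dead carry when the architecture guarantees every digit is valid BCD.

```python
def bcd_increment(digits):
    """digits: ints 0-9, least significant first; adds 1 in place.
    With tags guaranteeing valid BCD, the loop stops as soon as the carry
    dies; an untagged machine must process the entire operand."""
    carry = 1
    for i in range(len(digits)):
        if carry == 0:   # early exit, safe only under the tagged guarantee
            break
        total = digits[i] + carry
        digits[i] = total % 10
        carry = total // 10
    return digits

print(bcd_increment([9, 9, 2, 0, 0]))  # [0, 0, 3, 0, 0], i.e. 299 + 1 = 300
```

Here only three of the five digit positions are touched; the untagged S/370, as the text notes, would have to process all of them.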
Tagged storage defers the binding of instructions to data attributes until the time of instruction execution and hence can be expected to provide execution-speed benefits whenever a high-level language is used which requires this. For instance, in some circumstances PL/I requires execution-time binding of attributes. A circumstance is shown in the following program fragment:

ABC: PROCEDURE (C);
     DECLARE C CHARACTER(*);
     DECLARE D CHARACTER(8) INITIAL('ZYXWVUTS');
     IF C=D THEN ...

Here formal parameter C was declared as having a dynamic size; that is, it acquires the size of the corresponding argument. Because of this, the compiler cannot generate the single comparison instruction for the IF statement that it would normally generate on an architecture such as the S/370. Instead, it must essentially create a "software tagged-architecture" environment by passing the size of the argument during the procedure call and using this value dynamically to compare C with D. This is further complicated by the fact that the semantics of the comparison are dependent on whether the size of C is equal to, greater than, or less than that of D (i.e., to determine whether to truncate or add implicit blanks to C). This circumstance, where the compiler must create an environment similar to that of tagged storage, causes the C=D comparison above to result in the generation of not one, but 49 instructions (IBM's PL/I optimizing compiler on the S/370). This situation applies to other languages, such as Ada.

Fewer Instruction Types. Because a tagged machine normally has generic instructions, it should have considerably fewer instructions than its untagged counterpart. The significance of this is that the architecture tends to be highly regular, lacking many of the anomalies that seem to creep into large instruction sets, thus making the architecture a more understandable target for compilers and programmers.
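The run-time comparison that the compiler must synthesize for the dynamically sized parameter C can be sketched as follows. The blank-padding rule mirrors the truncate-or-pad semantics the text describes; treating it as pad-then-compare is an illustrative simplification of what the generated code actually does.

```python
def char_compare(c, d):
    """Compare two character values whose lengths are known only at run
    time, padding the shorter with blanks, as the synthesized code for
    C = D must do when C has a dynamic size."""
    width = max(len(c), len(d))
    return c.ljust(width) == d.ljust(width)

print(char_compare("ZYXWVUTS", "ZYXWVUTS "))  # True: trailing blanks are implicit
print(char_compare("ZYX", "ZYXWVUTS"))        # False
```

In a tagged machine the length travels with the operand itself, so none of this glue code needs to be generated.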
One sees a correlation in many architectures between the instruction-set size and the number of anomalies in the architecture. For instance, the S/370 provides two binary data types of 16 and 32 bits. The 32-bit oriented instructions include ADD, SUBTRACT, MULTIPLY, and DIVIDE, but the 16-bit oriented instructions exclude division. The HALVE instruction is provided to divide a floating-point operand by 2 (whose existence, incidentally, is puzzling because there is a DIVIDE instruction), but there is no DIVIDE BY 2 instruction for binary or decimal integer data types (contrary to belief, an arithmetic right shift is not equivalent to a division by 2 [3]).

Simpler Compilation. Compilers for tagged machines will be less complicated and faster. (As discussed in Chapter 3, we consider this a pleasant side effect rather than a motivation for tags.) In a conventional system, the code-generation process is complex because the compiler must do a semantic analysis of the program to determine the proper machine instructions to be generated. For instance, when a compiler encounters the + operator, it must examine the expressions on both sides of the operator in order to know which add instruction to generate. In a tagged machine, however, the compiler would simply generate the generic add instruction. Where a compiler in an untagged machine might have to decide among literally hundreds of code sequences for the statement IF A=B THEN ..., the compiler in a tagged machine would likely generate the same instructions always. Code generation is further simplified in that the machine, not the compiler, is responsible for run-time checks and automatic data conversion.

Program Debugging Benefits. The construction of program debugging tools will be simplified. At the least, a crude debugging tool for a tagged architecture can produce a more meaningful storage dump (i.e.,
printing data in its proper representation) than a conventional dump of memory as a long sequence of binary, octal, or hexadecimal digits. The possibility of placing trap bits in tags increases the feasibility of sophisticated debugging tools. In addition, any architectural concept that moves the machine interface closer to the programming language should aid the development of language-oriented debugging tools.

Data Independence. Because the use of tags means that the binding of program logic to data attributes is deferred to the last possible time, it aids one in writing generic programs and in the implementation of the data-independence concept in data-base systems. Among other things, data independence implies that an application program can view a data-base record in a way that is different from the physical representation of the record. (As a simple example, an application program can decide to view a field as a decimal number, although in the data base the field might be represented in base 2.) This concept is difficult to implement in current systems because machine instructions contain information about the types and lengths of their operands, making recompilation necessary whenever the data-base definitions change. In a tagged architecture, however, if instructions are generic and if tags are extended to secondary storage, implementation of data independence becomes more feasible.

Storage Requirements. Another advantage of tagged storage, one that is widely misunderstood and counter to intuition, is reduced storage requirements. A program and its data should require fewer bits of storage than the same program and data in an untagged machine. This characteristic is discussed in more detail in the following section.

Storage Requirements of Tags

The most common criticism of the tagged-storage concept is the feeling that tags increase the system's cost by increasing the amount of storage needed.
For instance, intuition might tell us that adding a 4-bit tag to a system with 32-bit words will increase the cost of storage by 12.5%. However, intuition is quite wrong here; a tagged-storage system is likely to require fewer bits of storage than a conventional architecture.

The first reason is considerable attribute redundancy in the store of an untagged machine. If a floating-point operand is referenced by many instructions, each instruction redundantly identifies the operand as floating point (because bits are needed in the operation-code field to distinguish floating-point operations from operations on other data types). Since variables tend to be referenced by more than one instruction, it makes more sense, if one is interested in reducing storage consumption, to store the static information (e.g., each operand's type) once with the operand rather than repeating it in each instruction.

As a simple example, consider an architecture X with 150 instruction types where the operation code is expressed in an 8-bit field in the instruction. We wish to consider an alternative architecture Y using tags where a 3-bit tag is added to each data cell in storage. Y can now be defined with generic instructions; a reasonable approximation to make is that Y needs only 50 instruction types, which can be represented in a 6-bit operation-code field. To compare the storage requirements of the two architectures, we can compare the number of operation-code and tag bits (call this sum B) needed in a program, making the assumption that all other storage requirements are constant. If I represents the number of machine instructions in the program, then B for machine X is 8I. On machine Y, B is equal to 6I + 3P, where P is the number of operands in the program. If each instruction in both machines references two operands, and if each operand in the program is referenced an average of R times, then P = 2I/R and B = 6I + 6I/R on machine Y.
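The accounting above can be checked numerically. The program size I is arbitrary (it cancels out of the ratio); the two-operand assumption is the one stated in the text.

```python
def ratio(R, I=1000):
    """Operation-code + tag bits: tagged machine Y over untagged machine X."""
    bits_x = 8 * I                    # 8-bit opcodes on X
    bits_y = 6 * I + 3 * (2 * I / R)  # 6-bit opcodes plus a 3-bit tag per operand
    return bits_y / bits_x

print(round(ratio(3), 2))     # 1.0  (break-even)
print(round(ratio(10.4), 2))  # 0.82
print(round(ratio(3.5), 2))   # 0.96
```

The break-even point at R = 3 and the 82% and 96% figures quoted in the text fall straight out of this arithmetic.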
The ratio of B for machine Y over B for machine X is 0.75(1 + 1/R). If this ratio is less than 1, the tagged machine requires less storage. Evaluating the ratio requires knowledge of R, the average number of references to each operand. One would expect R to be greater than 1. The cutover point is R = 3; at this point the two machines require the same amount of storage. R is not a widely measured factor, but one measurement of R for a set of programs yielded the value of 10.4 [4]. A measurement made at the source-language level for Cobol programs indicated that each data item was referenced an average of 3.5 times [5]. If 10.4 is representative in the machine version of a program, one would expect the space needed for operation codes and tags in a typical program on this tagged architecture to be 82% of the space needed for operation codes on the untagged architecture. Using R = 3.5, the ratio would be 96%. Hence a tagged architecture might require fewer storage bits because of the elimination of redundancy in instruction operation codes. This counterintuitive phenomenon substantiates a point in Chapter 1: the architect must have a thorough understanding of programming languages and the characteristics of programs in order to make intelligent decisions.

A tagged architecture also reduces a program's storage requirements by eliminating a second type of redundancy: the instructions that are repeatedly generated by compilers to perform run-time checks and automatic data conversion. As an example, assume that PL/I variable I is declared as FIXED BINARY(31) and A and D are declared as FIXED DECIMAL(4,2). Using IBM's PL/I optimizing compiler on the S/370, the generated code for the statement D=D+A occupies 6 bytes of storage, but the code for D=D+I occupies 64 bytes (an increase of 967%), as shown in Table 4.1. The difference is due to the code generated by the compiler for the data conversion.
If one enables the option to check for an overflow when storing into D, the statement D=D+I generates 92 bytes of object code. As indicated in Table 4.1, the increase in execution time is even more significant: executing the mixed-mode statement D=D+I takes 3011% more time than executing the statement D=D+A (measured on an S/370 model 145).

Table 4.1  The Cost of Software Automatic Data Conversion and Overflow Checking

Statement       Machine Instructions   Size (bytes)   Execution Time (microseconds)
D=D+A           1                      6              13.1
D=D+I           13                     64             407.6
(SIZE): D=D+A   7                      38             57.7
(SIZE): D=D+I   19                     92             448.1

This conversion and checking code is typically generated in-line where needed. (It need not be; the code could exist in one place and subroutine calls could be generated, but this usually proves to be excessively inefficient.) A tagged architecture eliminates this redundancy by delegating these functions to the machine. Of course the functions still have to exist somewhere; for instance, in a microprogrammed machine they occupy space in its control storage. However, in a tagged machine the functions exist only once (in control storage) rather than being replicated hundreds or thousands of times in the programs residing in main storage.

In the earlier discussion of the effects of tags on execution speed it was noted that some languages require dynamic attribute binding. When tagged storage is not present, this is implemented by compiler-generated code, another consumption of storage space. In the example used (compare C to D), the compiler generated 156 bytes of code, while, if dynamic attribute binding were not used (i.e., if C were declared with a fixed length), only 6 bytes of code would be needed.

The last reason why tags do not require as much storage as first appears is that they need not physically exist for every data element. For instance, there is no need to store a tag with every element in an array. Since, by definition, array elements have identical attributes,
the tags can be factored out; only one tag is necessary for the array. Similarly, there is no need to tag each element of a character string; only one tag is necessary to define the attributes of the elements. This is discussed further in a later section.

Disadvantages of Tagged Storage

As might be expected, tagged storage also has a number of disadvantages. One disadvantage is that it creates a strongly typed data environment, and many existing languages contain quirks that are incompatible with such an environment (see Exercise 4.1).

A second disadvantage is execution speed. Although the discussion earlier implied that tags have some benefits with respect to execution speed, there is also a disadvantage. A characteristic of tagged storage is that it defers, to the time of instruction execution, the binding of the instruction to the attributes of the data. A penalty of execution-time binding is that it must be performed repetitively (e.g., every time the instruction is executed), while earlier binding (e.g., at compilation time) can be performed fewer times. In other words, a valid question is: If I state in my high-level program that I am adding two binary integer variables, why not generate a binary integer ADD instruction? Why generate a generic ADD instruction, such that, upon each execution, the machine must (1) fetch both tags, (2) check them to ensure that the operands are numeric, and (3) use them to determine that an integer addition should be done?

The question is pertinent because, although one can devise logic to reduce the additional processing implied, it is likely that the minimum penalty will be a few extra machine cycles per instruction. This must be weighed against:

1. The manner in which tagged storage, as discussed earlier, can benefit execution speed.

2. The fact that run-time binding is implied by certain constructs in languages, and that this is extremely inefficient in an untagged machine.

3.
The consideration, as discussed in Chapter 1, that execution speed is a microscopic measure of system performance, and the realization that tagged storage has several macroscopic advantages.

Tagged storage, although not a new concept, has not been studied sufficiently to answer the execution-speed question. A likely outcome, however, is that it will probably decrease efficiency somewhat for the execution of some programs, namely, those that are error free, perform few data conversions, and whose semantics are fixed at compilation time, and increase efficiency for other types of programs.

Tags versus Descriptors

One step between a tagged and an untagged environment is the use of data descriptors. A descriptor is attribute information and an indirect address to a cell. Figure 4.3 contrasts the two. In a descriptor architecture, instructions refer to descriptors, which supply attribute information and, in turn, usually refer to a location in a memory space where the operand's value is stored. The primary differences between the tagged approach (Figure 4.3a) and descriptor approach (Figure 4.3b) are:

1. Descriptors involve an extra level of addressing and hence add execution overhead and require more space.

2. Descriptors are usually considered to be part of the program, rather than part of the data. If we consider the program (instructions) and data to be separate entities (as they are in applications involving shared data, files, data bases, etc.), descriptors do not provide a strongly typed environment (e.g., another program could view the data space with different descriptors, thus giving it values that are inconsistent with another program's descriptors).

[Figure 4.3: Tags versus descriptors.]

3. Because of point 2, the data-independence property that can be achieved with tags is absent if descriptors are used instead.

4.
As implied in point 2, descriptors give one a view or template to be used with storage, while tags define the actual data representation.

5. Because descriptors typically index into a linear storage space, they do not eliminate this traditional concept, one that we will be trying to eliminate throughout this chapter.

Hence one can consider descriptors to be a step, with deficiencies, toward tagged storage. Most of the architectures referenced in Chapter 3 incorporate either tags or descriptors. Some (e.g., the Burroughs B6700) use both. The outer-level architecture of the IBM System/38 uses descriptors rather than tags.

SELF-DEFINING DATA COLLECTIONS

The self-identification concept can be carried even further to include less primitive and multidimensional data objects such as arrays, tables, and structures or records. Although tags can be used to describe such objects, the method most often used is the descriptor discussed in the previous section.

As an example of the use of descriptors to describe collections of information, we will examine the use of descriptors to define arrays in the Burroughs B6700. The word size in the B6700 is 51 bits, where the first 3 bits represent a tag. If a tag has the value 101, the word is a descriptor. One type of descriptor, a word descriptor, is illustrated in Figure 4.4. The P (presence) bit indicates whether the described data is in main or secondary storage. The C (copy) bit indicates whether this descriptor is a copy of another descriptor. The I (indexed) bit indicates whether the descriptor points to an entire aggregate of data, or just one element within the aggregate. The S (segmented) bit indicates whether the described data is contiguous or segmented in memory. The R (read-only) bit is set if only reads are permitted to the described data. The T bits, which would be zero in this case, define this descriptor to be a word descriptor (as opposed to a string descriptor).
The D bit indicates whether the described data is single or double precision. If the I bit is off, the next field in the descriptor defines the number of elements of described data, and the last field points to the beginning of the described data.

[Figure 4.4: B6700 word descriptor.]

The B6700 descriptors can be combined into tree structures to describe multidimensional storage objects. For instance, Figure 4.5 shows the storage structure for a 3 by 4 array of words. The array descriptor points to a three-element vector of descriptors, which in turn point to four-element vectors of data words. One implication of this is that the machine is responsible for array indexing calculations and bounds checking. To obtain a particular element, the program simply names the array descriptor and the values of the subscripts. The machine calculates the address of the referenced element and, at the same time, determines whether the subscripts are within the defined bounds.

Another machine using descriptors to describe arrays is the MU5 [6]. One type of descriptor is the vector descriptor (the other is a string descriptor). The vector descriptor consists of three fields of 8, 24, and 32 bits. The first field defines the length of each element of the vector (1, 2, 4, 8, 16, 32, 64, or 128 bits). The second field defines the size of the vector (the upper bound), and the third field specifies the beginning memory location of the vector. Like the B6700, vector elements can themselves be descriptors.

A more frequent type of array, particularly in commercial and non-numeric applications, is the table. A table is normally thought of as a one-dimensional array where each element, rather than being a scalar value, is an ordered collection of possibly heterogeneous data types. For instance, the following definitions in the PL/I and Ada languages define a 10-element array TAB. Each element consists of an 8-character field and an integer-valued field.
DECLARE 1 TAB(10),
          2 NAME CHAR(8),
          2 AGE FIXED BINARY;

type PERSON is
   record
      NAME : STRING(1..8);
      AGE  : INTEGER;
   end record;
TAB : array (1..10) of PERSON;

With respect to the PL/I example, one can treat NAME and AGE as if they were arrays themselves. For instance, the expression TAB.NAME(I) has, as its value, the value of the NAME field in the Ith element of TAB.

The MU5 contains an additional descriptor-like concept called a dope vector to describe arrays with lower bounds other than 1 and table columns (or slices). The dope vector contains three 32-bit fields. The first two define the lower and upper bounds and the third contains the "stride," the offset of the slice within the array element. In the program above, dope vectors would exist for NAME and AGE.

[Figure 4.5: Representation of a two-dimensional array.]

Although the B6700, MU5, and similar architectures significantly extend the storage model toward that of languages, one can extend the model further. For instance, with descriptors the storage organization of the array elements is visible. One can bypass the descriptor and refer to the array storage directly, thus violating the model. Also, making the array organization visible restricts its implementation (i.e., the elements must reside in contiguous locations, all rows and columns must be the same size, etc.). Another consequence is that the tags for the elements are redundantly repeated, as discussed in a previous section. For instance, in the B6700, each array element has a redundant tag.

The SWARD architecture of Part V overcomes these weaknesses by recognizing the array as a legitimate data type, rather than using descriptors. The format of the array cell is shown in Figure 4.6. The last two fields of the cell are of particular relevance here. The nested-tag field is the last field of the tag and accomplishes tag factoring, as mentioned in the previous section. Rather than store a tag with each array element, attributes of the elements are defined within the array tag by placing,
Rather than store a tag with each array element, attributes of the elements are defined within the array tag by placi within this field, a tag of a scalar cell type. A cell representing & two-dimensional array of fixed-point scalars of form NNN.NN each dimension containing 10 elements, would have the following form: where E52 represents the tag that would have appeared in a cell for an element. had the element not appeared in an array. Note that the last field (value) has a fixed size. independent of the size of the array. The architecture requires that the values of all array elements tbe encoded here, but since the field cannot be seen directly by a program. the encoding is not architected end is left to the implementes. Obviously. rather than storing element values here, it would be used in an implemen- tation of the architecture as a means to indirectly locate the element storage. However. it has the effect of hiding the actual mapping of the array elements to storage locations. a ‘SELFOEFINING DATA COLLECTIONS. ‘The SWARD architecture contains another cell type. the record cell, to describe structures records and arrays of them (Le..as in the PL land Ada examples above). This is described in Part V. Advantages of Self-Defining Data Collections ‘As one might expect, the advantages of extending the self-descriptive property to obiects such as artays and structures are similar to those of tagged storage. Error Detection. ‘The machine is now able to detect such programming arrors as the use of an array subscript whose value is beyond the bounds of the corresponding array dimension, another rather common error. The price azar | ores Es I = Vue 724 7 rs Niet of ienonn x26 Figure 44 Array catia SWARD. overhead of such a check by the machine should be negligible, particu- larly in the light of the execution-speed gains discussed below. 
In Chapter 2 we saw why most compilers find a software check for this error impractical; if the optional SUBSCRIPTRANGE check is enabled for the PL/I statement

    C(I) = A(I) + B(I);

57 S/370 instructions are executed rather than 17, and the size of the generated code is 274 bytes rather than 62.

Execution Efficiency. Because the machine now performs such operations as array indexing, rather than executing compiler-generated code to do so, array indexing will be faster than in a conventional system (largely due to the decrease in memory accesses for instructions and the decrease in instruction decodings). If the extension of generic instructions to these objects provides operations on entire arrays, array operations will be much more efficient for the same reasons. For instance, one could set all elements of an array to a given value in one instruction, rather than executing a compiler-generated loop to do so.

Storage Requirements. Because the above advantages require fewer generated machine instructions, the storage requirements of programs will be reduced. The PL/I statement shown above generates 17 S/370 instructions occupying 62 bytes. Most of these instructions are generated for the array indexing operations, and these instructions are redundantly repeated in storage for every array reference in the program.

Implementation Opportunities. Whenever we raise a machine's perspective through an architectural change, we should expect added opportunities for exploiting additional parallel logic to achieve higher execution speeds. One example is array subscripting. If a two-dimensional array is stored in sequential locations, referring to element (I,J) usually implies computing a formula such as

    element address = base address + (I - 1)*H + J - 1

(where H is the number of elements in a row), giving one the opportunity, if the machine is responsible for array subscripting, to overlap the multiplication with the addition and subtraction operations.
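The subscripting formula above, in executable form (1-based subscripts; the multiply is the step a parallel implementation would overlap with the additions):

```python
def element_address(base, H, I, J):
    """element address = base + (I - 1) * H + J - 1,
    for a row-major two-dimensional array with H elements per row."""
    return base + (I - 1) * H + (J - 1)

assert element_address(100, 10, 1, 1) == 100   # first element at the base
assert element_address(100, 10, 2, 3) == 112   # one row (10) plus two more
```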
If the generic instructions provide for array operations, one has the opportunity to operate on several elements in parallel. A processor with a cache memory can load the cache on a lookahead, rather than demand, basis when operating on an array.

Tags, Objects, and Descriptors

In this and the previous sections the distinction between data identification by tags and by descriptors was illustrated. A related concept, that of an object, should also be introduced. To indicate that there are semantic differences among the three: the Burroughs B6700 uses tags on each physical memory word and uses descriptors to describe vectors, the SWARD machine is based on tags and objects, and the Intel iAPX 432 provides objects and some types of optional descriptors, but no tags.

An object is an abstraction that is usually considered to have the following properties:

1. It is a related collection of information. The information may include pure data, state information, and others.
2. The entities or parts in an object do not have independent lifetimes (i.e., they are created and destroyed together).
3. The object is most often addressed as a whole, rather than by referencing its parts individually.
4. A set of machine instructions is provided to perform transformations on the object.
5. The internal structure of an object is not visible.
6. It contains self-identifying information to prohibit a program from performing operations on the wrong type of object.

Programmers should recognize a strong similarity between these characteristics and the concept of abstract data types, although here of course the abstraction is one created by the machine.

As an example, both SWARD and the Intel iAPX 432 define an object type called a port. A port object is used to send data from one process (or task) to another. The internal structure of a port is not of concern to a program; it is operated on by addressing it as an operand of a related set of machine instructions.
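These properties can be mirrored in software. The sketch below models a port as an abstract object: its queue is internal, its parts share one lifetime, and it is manipulated only through the transformations provided. The class and method names are our invention, not SWARD's or the iAPX 432's actual operations.

```python
from collections import deque

class Port:
    """Toy model of a port object: internal structure hidden,
    operated on only via the transformations provided."""
    def __init__(self):
        self._queue = deque()   # internal; not part of the interface

    def send(self, message):    # one provided transformation
        self._queue.append(message)

    def receive(self):          # the other provided transformation
        return self._queue.popleft()

p = Port()
p.send("hello")
assert p.receive() == "hello"
```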
A possible relationship between tagged data items and objects can be seen in the SWARD architecture. An object could be a single tagged data cell, a collection of data cells and other information, or something containing no data cells. The machine allows a program to create dynamically an occurrence of a data type (analogous to execution of an ALLOCATE statement in PL/I or an allocator in Ada) to create an array, string, record, single integer, and so forth. Such an occurrence is called a data object, since it is a single data cell whose lifetime is independent of other data cells. Another type of object, a module, contains machine instructions, a set of tagged data cells (for any static or own variables used by the instructions), and other state information. The port is an example of an object containing no data cells.

SMALL PROTECTION DOMAINS

One characteristic of present-day architectures is that they provide an inflexible and coarse data-protection mechanism (and most microprocessors provide none at all). Typically an architecture might provide a mechanism for protecting system software and data (e.g., the operating system) from modification and/or inspection by application programs, and a mechanism for protecting one application program from another. However, the following are missing:

1. Protection of a program or process from itself, or more meaningfully, protection of one section (e.g., procedure) of a program from other sections.
2. Protection of a program from the system software.

The concept of small protection domains is to provide a finer granularity of protection within a system (protection meaning both from accidental reference, or program errors, and from willful reference, or security problems).
A protection domain is an independent local address space defining the total set of addresses that can be formulated by a set of instructions [7-10]. A small protection domain implies a larger number of small address spaces (i.e., something more granular than one per program/process or one per operating system). The idea is to segregate sections of a program structure into separate protection domains. The definition of a section is somewhat dependent on the addressing structure and scope of identifiers in a particular language (and the architecture should be flexible enough to accommodate differences among languages), but it is usually equated to the compilable unit (e.g., package, procedure, subprogram, monitor). Hence rather than placing an entire program in a single large address space, each module is placed in a separate address space such that each has access to only its local variables.

One notion that is obviously missing is a method to communicate across domains. This is done by employing a subroutine-management mechanism (discussed in the next section) which controls the passing of arguments and parameters. One can view the passing of arguments to a subroutine in a protection domain as temporarily extending or augmenting the domain with the arguments, as indicated in Figure 4.7. Ideally B's domain is extended to only the specific arguments transmitted (i.e., it has addressability to nothing else in A), and the extension disappears when B returns to A.

Figure 4.7 Two protection domains.

Protection domains also often imply a concept of protected points of entry. Within each protection domain, the programmer should be able to define one or more specific points of entry, such that it is impossible to enter or call a module in a domain at any arbitrary instruction.

Advantages of Protection Domains

Program Debugging.
Small protection domains have the effect of building firewalls around sections of a program, providing a higher degree of error confinement. While in a conventional system an addressing error could have an effect on any part of a program, or an error in the operating system could affect other parts of the operating system or application programs, protection domains limit the effect of an error to within its domain (including any extensions to the domain made by argument passing). One is more likely to detect such errors earlier and, importantly, to detect them at the point of occurrence.

Security. Small protection domains protect information in one part of a program or process from other parts, thus providing a level of protection that is normally absent. This can be termed a "principle of least privilege," where each module has access not to everything, but only to the data it needs to perform its function.

One requirement for this type of security is in programs having sections of differing security levels, or where a program calls a service (e.g., operating system, subroutine package) having a different security level or different degree of trustworthiness. In Figure 4.7 assume that module A contains sensitive data, and it wishes to call module B to perform some service, while B belongs to another programmer, is part of a general subroutine package, or is an operating-system function. A potential security problem is that B may be a "Trojan horse," that is, a foreign program with disguised intentions brought inside A's protection walls. If the act of invoking B gives B addressability to all of A's data, B can examine sensitive data within A while performing its function. If protection domains are employed, B's access is limited to only the specific arguments passed.

The security problem can also apply in reverse. Module B may contain a sensitive data base of information which it collects over time by being called by other modules, such as A.
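The extension-by-arguments idea described above can be modeled in a few lines. In this sketch (our own illustration, not any architecture's mechanism), a domain is simply the set of names a module may address; a call adds only the arguments passed, and the addition is undone on return.

```python
from contextlib import contextmanager

class Domain:
    """Toy protection domain: the set of names its code may address."""
    def __init__(self, local_names):
        self.addressable = set(local_names)

@contextmanager
def call(callee, arguments):
    # Extend the callee's domain with just the arguments passed...
    added = set(arguments) - callee.addressable
    callee.addressable |= added
    try:
        yield callee
    finally:
        # ...and make the extension disappear on return.
        callee.addressable -= added

b = Domain({"b_local"})
with call(b, {"x"}):
    assert "x" in b.addressable     # B can address the argument
assert "x" not in b.addressable     # but not after returning to A
```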
Without protection domains it is likely that if A has an entry address to B (to call it), A can perform arithmetic on this address to obtain addressability to B's secret data. Another situation might be A's examining space on the subroutine stack after B returns, to examine the values of B's local variables.

Enforced Modularity. It is well understood (e.g., [11]) that simply splitting a program into many modules does not necessarily imply a well-structured modular program. In particular, modularity can be compromised by modules that interact with one another in obscure ways (e.g., by directly accessing one another's data, rather than doing so explicitly via parameters). Protection domains present a barrier to constructing programs with obscure and hidden module interfaces, and hence promote desirable forms of modularity.

The representation of protection domains in an architecture is not immediately obvious. For instance, it requires mechanisms to ensure that when a module is given access to an argument, it cannot access other data in the domain of the argument, and mechanisms to ensure that modules are entered at only their defined entry points. Rather than discuss these mechanisms at this time, the reader is referred to the SWARD architecture in Part V, another architecture containing protection domains [12], a proposed mechanism for argument transmission [13], and a software implementation of protection domains on a PDP-11 [8].

SUBROUTINE MANAGEMENT

Another major step in closing the semantic gap is to make the machine aware of the concepts of program structure, particularly the important concept of the subroutine or procedure. The motivation for this is the lengthy process that is performed by software on conventional machines to create the subroutine concept. For instance, a typical high-level-language program has to perform the following steps,
via compiler-generated code, when a procedure is called: (1) a block of storage called an activation record is dynamically allocated to hold the local variables and status information of the called procedure, (2) this activation record is added to the software-maintained stack of activation records for the program, (3) the status of the calling procedure is saved in its activation record, (4) local storage and the parameters of the called procedure are initialized, and (5) a branch is taken to the called procedure. As mentioned in Chapter 2, this process is performed quite frequently, and therefore passes one of the tests for a hardware/software trade-off. Studies have shown that a subroutine call occurs once per 50-100 machine instructions executed [14, 15].

Closing the gap implies defining architectural constructs that embody some or all of the steps mentioned above. As with other concepts, there are a variety of degrees to which the gap can be reduced. Here we consider two: one which does not significantly change the traditional storage model but provides instructions that assist in developing the subroutine concept, and a second which alters the machine's storage model to resemble more closely that implied by languages. These approaches might also be termed the visible versus the hidden activation stack.

The first approach is typified by:

1. Addressing mechanisms that permit multiple program-managed push-down stacks to exist in storage, one stack per process, and provide relative addressing within each stack.
2. Instructions that allow a program to manage its stack with respect to parameter passing, return addresses, and space allocation for local variables.

The Motorola 68000 microprocessor will be used here as an example of this approach. The 68000 contains eight 32-bit data registers and eight 32-bit address registers. One of the address registers (A7) is implicitly addressed by a small number of instructions and is known as the stack pointer (SP).
Commonly, when implementing a subroutine mechanism, another address register is designated as the frame pointer (FP). The 68000 provides a large set of addressing modes (i.e., ways in which storage addresses are formulated by instructions), and its instruction set is highly generic with respect to addressing modes (i.e., any of the modes can be used in most of the instructions). The modes include absolute, register indirect, register and displacement (positive or negative), register indirect with predecrementing or postincrementing of the register value, indexed, and program-counter relative.

Typically, the activation stack established in storage for a process contains a number of frames (sequences of words in the stack). Each frame represents an activation record for an active subroutine, containing space for local variables, a return address, space for saving register values, and space for arguments to another subroutine. The FP normally points to the beginning of the current (top) frame or activation record, and the SP points to the top of the stack (the last word in the current frame). Register-displacement addressing can be used to address storage within a frame (i.e., relative to FP), and register-indirect addressing with predecrementing or postincrementing is used to push or pop items onto and off the stack (i.e., onto and off the current frame).

To examine the subroutine mechanism in the 68000, assume that we are executing procedure A and that the current state of the stack is that shown in Figure 4.8a. (The addressing modes are defined such that the stack must grow downward in storage.) Suppose that A wishes to call procedure B at this point as a result of the statement

    CALL B(7,X)

Four instructions are used in A to accomplish this. The first, MOVE with an addressing mode of SP-register-indirect predecrement, pushes the value 7 onto the top of the stack. (We assume that 7 is to be transmitted by value.)
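The frame discipline being described, together with the LINK/UNLK pairing that completes the walkthrough, can be simulated in a few lines. This is a sketch under simplifying assumptions (uniform 4-byte pushes, a Python dict for memory), not a cycle-accurate model of the 68000.

```python
class Stack68k:
    """Toy model of a 68000 activation stack (grows downward)."""
    def __init__(self, top=1000):
        self.mem = {}          # address -> stored value
        self.sp = top          # stack pointer (A7)
        self.fp = None         # frame pointer (by convention)

    def push(self, value):     # MOVE with SP-predecrement
        self.sp -= 4
        self.mem[self.sp] = value

    def link(self, local_bytes):
        """LINK: save caller's FP, start a new frame, reserve locals."""
        self.push(self.fp)     # previous frame's address saved in new frame
        self.fp = self.sp      # new frame begins here
        self.sp -= local_bytes # stack grows downward

    def unlk(self):
        """UNLK: discard locals, restore the caller's FP."""
        self.sp = self.fp
        self.fp = self.mem[self.sp]
        self.sp += 4

s = Stack68k()
s.push(7)                  # argument transmitted by value
s.push("address of X")     # argument transmitted by reference
s.push("return address")   # pushed by JUMP TO SUBROUTINE
s.link(24)                 # B's prologue: 24 bytes of locals
assert s.sp == s.fp - 24   # locals sit at negative displacements from FP
s.unlk()                   # B's epilogue
```

Note that reserving local space is modeled as a subtraction from SP: since the 68000 stack grows downward, the LINK displacement moves SP toward lower addresses.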
The second, PUSH EFFECTIVE ADDRESS, is used to push the address of X onto the top of the stack (to be transmitted by reference). The third, JUMP TO SUBROUTINE, pushes the return address (the address of the next instruction) onto the stack and alters the program-counter value. The fourth, which would be executed upon return from B, is an ADD instruction to add 4 (two words) to SP, thus having the effect of removing the arguments from the stack.

Figure 4.8 Motorola 68000 subroutine stack.

Procedure B would begin with a LINK instruction. If B needs 24 bytes of space for local variables, the instruction would be

    LINK FP,24

This does the following: (1) the current value of FP is pushed onto the stack (thus saving, in this new frame, the address of the previous frame), (2) the current value of SP is stored in FP, and (3) 24 is subtracted from SP (the stack grows downward). The second instruction in B, MOVE MULTIPLE REGISTERS, would push onto the stack any registers that are going to be used in B. The stack now has the form shown in Figure 4.8b. In procedure B, local variables are addressed via a negative displacement from FP, and arguments are addressed via a positive displacement from FP.

Procedure B would exit with three instructions. The first, MOVE MULTIPLE REGISTERS, would restore designated register values from the stack. The second, UNLINK, places the value of FP into SP and then pops the current top stack value (the address of the previous frame) into FP. The last, RETURN FROM SUBROUTINE, pops the top stack value into the program counter.

Although this and similar mechanisms take much of the subroutine-management burden from the shoulders of compilers, they still have a number of disadvantages:

1. Integrity of the mechanism is dependent upon interacting procedures obeying a program-defined protocol. Any deviation from the conventions (e.g., a mismatch between the number of arguments being transmitted and the number expected,
failing to use the same address register as FP in both procedures, allocating the wrong amount of space with a LINK, failing to save and restore properly the right registers) can cause the program to fail in a surprising manner.

2. The mechanism does not introduce a new storage model more related to that in languages. Instructions in a called procedure can modify system information in the stack and have access to stack frames of other processes (e.g., it does not achieve protection domains).

3. If local variables are to be initialized upon procedure entry, additional instructions are necessary.

4. The stack is sequential, meaning that its maximum size must be predetermined for each process, and each stack must be preallocated in storage with its maximum possible size.

These problems can be overcome by hiding the stack mechanism within the machine and providing a higher-level mechanism in the architecture. Such is done in the SWARD architecture discussed in Part V. In this architecture each compiled program unit is represented by a machine object called a module. A module is a collection of tagged storage cells and machine code representing one or more procedures. The cells can be separated into two groups: a group for which new occurrences are to be dynamically allocated whenever the module is invoked, and another group whose lifetime is independent of invocations (e.g., static or own variables). Each cell can be given an initial value by the compiler.

An entry point or procedure of a module is entered by a CALL instruction, whose appearance is much like that of a CALL statement in a language. It specifies, as operands, an address of an entry point of a procedure in a module, and a list of arguments. The CALL instruction performs the following:

1. Saves the state (e.g., return address) of the current module.
2. Allocates an activation record for the called module, and copies into it all cells in the called module for which new occurrences are wanted.

3. Begins execution of the called module.

The first instruction executed in the invoked module is normally ACTIVATE, which names, as operands, a set of cells tagged as parameter cells. ACTIVATE causes the machine to check each argument and corresponding parameter for type consistency (using their tags) and causes each parameter to refer to the corresponding argument. A RETURN instruction undoes the effect of a CALL.

The difference between this and the earlier mechanism is that very little of this mechanism is exposed in the architecture. The existence of an activation record and a stack of activation records is mentioned in the architecture specification, but their actual format, location, and linkage mechanism (e.g., sequential versus list) is known only to the machine implementer, and hence can vary from implementation to implementation. The only view the programmer (or compiler writer) has is the CALL, ACTIVATE, and RETURN instructions; the manner in which storage is allocated, parameters refer to arguments, and so on is hidden by the architecture. The integrity of the mechanism cannot be compromised (e.g., to cause wild branches or modify linkage information) by program actions. The mechanism also implements protection domains (each module is one).

Advantages of Subroutine-Management Mechanisms

Execution Efficiency. Like other concepts, placing some or all of the subroutine-management function in the machine interface leads to higher execution speeds because of more effective use of the available processor/memory bandwidth. Given increasing trends toward highly modular structured programs, this advantage is an important one.

Storage Requirements. Like other concepts that raise the machine interface,
a subroutine mechanism decreases the size of programs by decreasing the amount of compiler-generated code needed to implement subroutine calls.

Error Detection. If tagged storage is used in concert with subroutine management, as is done in SWARD, the machine can detect interface errors (e.g., passing four arguments to a subroutine expecting five, passing a character string to a subroutine expecting a floating-point number).

An alternative is doing this checking prior to execution by software (e.g., in a module-binding, or linkage-editing, step, or as part of compilation, as required by the Ada language). Although certainly worthwhile, not all interface errors can be found prior to execution, for the following reasons:

1. In some programs the attributes of the arguments are not known prior to execution (e.g., a procedure dynamically determines what argument to pass).
2. Systems normally include some interfaces (e.g., invocations of operating-system functions) that are not bound at linkage-edit time.
3. Some systems and languages contain dynamic procedure-binding facilities, allowing a program, at execution time, to bind itself to designated procedures.
4. In some situations one does not know, prior to execution, what procedure will be invoked by a particular CALL statement (e.g., the use of entry variables in PL/I).

Implementation Opportunities. Again, raising the machine interface presents the processor designer with opportunities for parallel logic. For instance, one can envision that space allocation for the activation record could be overlapped with state saving and argument/parameter type checking. Given that an invoked procedure is likely to refer to cells in its activation record with high frequency, the activation record (or as much of it as will fit) can be allocated directly in a high-speed buffer or cache. Since, even in a multiprocessor system, the activation records of a process should be local to the process,
the processor need not bother with "store-through" write operations to the activation-record cache.

CAPABILITY-BASED ADDRESSING

Another major step in reducing the semantic gap between the software environment and the machine interface, and improving upon the primitive storage model of the von Neumann architecture, is the concept of capability-based addressing. Like many of the other concepts, capability-based addressing can be implemented in various degrees of sophistication. Here we will consider it in a rather extensive form.

To start, rather than thinking of a storage model consisting of a linear sequence of words, one thinks of the system environment as being a single set of objects. The word single implies that there are no preestablished boundaries on addressability; everything in the environment is potentially addressable by everyone. The word set implies the absence of ordering; there is no concept of one object being "next" to another. As discussed earlier, the word object signifies a group of related storage elements with the same lifetime (i.e., created and destroyed together). Depending on the architecture, an object might be as primitive as a storage segment (a self-contained linear sequence of words, such that its interior has the appearance of a traditional von Neumann store), or an abstract entity whose physical form is hidden by the architecture and whose meaning is defined solely by the transformations that can be performed on it.

Figure 4.9 depicts the storage model. The arrows represent references among objects; that is, they indicate that objects contain names or addresses of other objects or entities within other objects. Note that the model encompasses the total system environment, be it a single-user or a multiple-user system. Each object is potentially addressable by any other object, but this addressability is not automatic and cannot be created unilaterally.

Figure 4.9 Universal world of objects.
This statement represents the foundation of capability-based addressing.

At this point we do not want to get too specific about the nature of an object, since this might vary among architectures. However, as examples we might think in terms of program objects and data objects. (Additional types are likely to occur in specific architectures.) A program object might represent a sequence of instructions and, possibly, a set of local variables. For instance, a program object might correspond to a PL/I procedure, an Ada package, or a module. A data object might represent a dynamically allocated occurrence of a variable (i.e., the storage created by the execution of an allocator in Ada). Other examples might be directories and files. The model implies that all program and data objects are potentially addressable by each other, thus implying a flexible base for program and data sharing. At the same time, the statement that addressability is not implicit and cannot be established unilaterally implies a mechanism of protection.

Objects can be created only with the machine's participation (i.e., via machine instructions). When an object is created, its name is given to the creator (as the result of the machine instruction executed to create the object). In the general form of capability-based addressing, the name itself bears no relationship to where the object may be stored. The name does represent the "address" of the object, but only in a logical sense; the machine maintains whatever information is necessary to transform a name into a physical location. Hence the name is an arbitrary set of bits whose meaning is known only to the machine.

There is also motivation, discussed later, to make all names unique. Hence when an object is created, the machine assigns it a unique name (a name never used in the past for another object, and never to be reused in the future for another object).
Obviously one must compromise slightly with respect to "never," since it implies that names would have to contain an infinite number of bits. To compromise, one uses fixed-size names, but large enough so that the supply of unique names is adequate, given the system's expected lifetime, and large enough so that if names have to be reused eventually, the motivation for making them unique is not compromised. For instance, 48-bit names give the system a supply of over 10^14 unique names.

A capability is an occurrence of one of these names. Thus a capability is similar to, and used as, an address. Capability-based addressing functions by (1) protecting capabilities and (2) prohibiting the fabrication of capabilities. Protecting capabilities means prohibiting a program from modifying a capability (e.g., manipulating it such that it refers to a different object). Prohibiting the fabrication of capabilities implies that programs must not be able to generate a capability as they can generate an address in a conventional architecture; a program can obtain a capability only by (1) creating an object or (2) receiving it from another program. These ground rules can be enforced either by requiring that capabilities be stored in only special protected objects (capability lists) or by designating the capability as a tagged data type and then restricting the types of operations that can be performed upon capabilities.

These ground rules show how capability-based addressing provides a protection mechanism in a storage model where every object is potentially addressable. One controls addressability (access to objects) by controlling the distribution of capabilities. The only way to refer to an object is by possessing a capability to it, and capabilities cannot be manipulated or fabricated. The protection mechanism is analogous to a physical world in which all objects are locked boxes, and capabilities are keys.
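A minimal model of these ground rules follows: a capability pairs an object name with access rights, it is tamper-proof, and rights can only be removed from a copy, never added. The class, field names, and rights set are our illustration, written to anticipate the access-rights enhancement discussed next; they are not any architecture's actual encoding.

```python
from dataclasses import dataclass, replace

RIGHTS = frozenset({"read", "write", "destroy", "copy"})

@dataclass(frozen=True)       # frozen: a program cannot modify a capability
class Capability:
    object_name: int          # logical address; meaning known only to the machine
    rights: frozenset = RIGHTS

    def without(self, *removed):
        """Rights can be removed from a copy, never added."""
        return replace(self, rights=self.rights - set(removed))

cap = Capability(object_name=42)            # full rights upon creation
readonly = cap.without("write", "destroy")  # a weaker key to pass to process B
assert "read" in readonly.rights and "write" not in readonly.rights
```

The `frozen=True` declaration plays the role of the "cannot be tampered with" rule: any attempt to assign to a field of an existing capability raises an error.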
When one creates a new box, he is given a key to the box. Keys can be duplicated and passed to others, but cannot be tampered with or forged.

The protection model can be enhanced by associating access or authority information with capabilities. Rather than a capability simply being a name or logical address of an object, it is the couple—access rights/logical address. For the sake of discussion, let us define the access types of read (the ability to use information in the object), write (the ability to store or modify information in the object), and destroy (the ability to destroy the object). When one creates an object, the initial sole capability is returned by the machine with full (read/write/destroy) access. Assume the existence of an instruction that allows one to remove specific access rights from a capability. Process A can then create object X, make a copy of the resultant capability, remove write and destroy access from the copy, and give the copy to process B, thus giving B addressability to the object, but with only read access.

Returning to the physical analogy, each box has several doors. One, when opened, exposes the contents behind a transparent shield (read access); a second door opens into the box but turns off the "interior light" when opened (write access); a third door, when opened, exposes only a switch which, when depressed, destroys the box. The access rights correspond to auxiliary notches on each key, controlling which door(s) of the box can be opened. The notches can be filed off, but not added to, a key.

A further enhancement is to permit capabilities to name individual items within an object. Given a capability to an object that represents a set of distinct items (e.g., an array, a collection of variables), one should be able to compute a capability to an individual item in the object. For instance, if process A has a capability to array object X, it can create a capability to element X(I) and pass this capability to process B.
This gives B access to this specific element, but to nothing else in the object. In fact, B should not even be aware that its capability refers to an array element. This use of capabilities gives one an extremely granular level of information sharing and protection. This implies that a capability should now be thought of as the triplet

    access rights / object name / designator of entity within object

and can be defined as a protected, system-wide name of, and authority to, an object or entity within an object.

The concept of capability-based addressing has existed since 1966 [16] but has not yet had a strong influence on computer architectures. The concept is widely discussed in the context of operating systems [10, 17-20] and has been implemented in software in experimental operating systems such as CAL [21] and Hydra [22]. A form of capability-based addressing has been implemented in the architectures of the commercial Plessey 250 [23], IBM System/38 [24], and Intel iAPX 432 microprocessor [57], and the experimental Cambridge CAP [25, 26] and IBM SWARD [27-28, 458] systems, as well as in a variety of other proposals [12, 29, 30]. Several fine general discussions of capability-based addressing also exist [31-35].

Capability-Based Addressing in the SWARD Architecture

The SWARD architecture (Part V) contains an extensive embodiment of capability-based addressing and thus is used here as an illustration, as well as a forum for discussing some of the problems generated by the use of capabilities. The architecture contains five types of objects in the sense used earlier. The objects are abstractions in that their actual representation is not defined by the architecture; they are defined only to the extent of defining the operations that apply to them. Four of the objects—modules, ports, data-storage objects,
and process machines—are explicitly created and destroyed by programs, and the fifth type—activation record—is implicitly created and destroyed by the subroutine mechanism.

One of the tagged cell types in the architecture is the pointer, shown in Figure 4.10. The value of a pointer cell is a capability or the "undefined value." The bit representation of a capability is not defined by the architecture, but it consists of four access indicators and a "logical address" to an object or entity within an object. The access indicators are read, write, destroy, and copy. The first three potentially restrict the types of operations that can be performed on the entity referenced by the capability, and copy authority applies to the capability itself. In the physical analogy, consider a situation where person A wishes to give a key to person B, but also wishes to preclude B from copying the key (e.g., to give it to a third party). A might do this by stamping "do not copy" on B's key, a mark which is honored by all key-copying machines. Hence the copy authority, if absent, prohibits the possessor of a capability from copying its value into another pointer cell.

Whenever an object is created, the resultant capability has full (read/write/destroy/copy) authority. An instruction exists to remove authority from a capability, but not to add it.

For reasons of programming generality, pointer cells can be used in a manner similar to addresses in conventional systems (although no arithmetic can be performed on capabilities). In particular, pointers are treated

Figure 4.10 Pointer cell in SWARD.

as other cell types and can be used as variables in programs, can exist within user data structures, and can be passed as arguments. This is equivalent, in the physical analogy, to being able to store keys, or a combination of keys and other treasures, in the locked boxes.
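The copy authority can be illustrated with another small sketch. The key point is that copying a capability between pointer cells is itself a machine operation, so the machine can refuse the copy when the copy bit is absent—the "do not copy" stamp honored by all key-copying machines. This is a hypothetical model, not SWARD's representation; in a real tagged machine a program could not poke the authority bits directly as this script does.

```python
# Authority bits (an assumed encoding, for illustration only).
READ, WRITE, DESTROY, COPY = 1, 2, 4, 8

class PointerCell:
    def __init__(self):
        self.value = None            # a capability, or "undefined"

    def store_copy_of(self, other):
        # The machine performs every capability copy, so it can
        # enforce the copy authority at this single choke point.
        cap = other.value
        if cap is None:
            raise ValueError("undefined pointer")
        if not cap["auth"] & COPY:
            raise PermissionError("capability lacks copy authority")
        self.value = dict(cap)

p = PointerCell()
p.value = {"auth": READ | COPY, "son": 42}   # copyable capability
q = PointerCell()
q.store_copy_of(p)                           # allowed

p.value["auth"] &= ~COPY                     # file off the copy notch
r = PointerCell()
try:
    r.store_copy_of(p)                       # the machine refuses
except PermissionError as e:
    print("refused:", e)
```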
As mentioned earlier, protecting capabilities in an untagged environment usually implies that capabilities can reside in only specially designated objects (capability lists). Such is the case in the Cambridge University CAP system, where the use of separate capability lists is recognized as a weakness [26].

Returning to Figure 4.10, it was mentioned that the bit representation of a capability is not part of the architecture (purposely, because a program can never "see" the bits directly, to allow implementation freedom), but it is helpful to see how the 84 bits are used in a prototype implementation. Four bits are used for the authority information, and 30 bits are used for the object's unique name (called a SON, or system object name). If the capability refers to an item within an object, 24 bits are used as a displacement to the item's tag, and 24 bits are used as a displacement to the item's content or value. (The requirement for two displacements arises because a data item's tag is not necessarily physically contiguous with its value, an example being an array element.) The remaining two bits are special indicators. Hence the maximum physical address space is 2⁵⁴, that is, a maximum of 2³⁰ objects, each having a maximum size of 2²⁴.

The need to make each object's SON unique is to prevent a program from creating an object, saving its capability, destroying the object, and then, perhaps several weeks later, using the capability maliciously to refer to whatever object has been reassigned the previous object's SON. Obviously 30-bit SONs do not give an inexhaustible supply of unique names, but the supply is sufficient for two reasons. First, assuming an object creation rate of 10 per second (activation records, although objects, are not usually assigned a SON), the machine does not exhaust its supply for about 3.5 years. Second, even if names have to be reused, the security exposure is low because of the tagged architecture (see Part V).
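The bit budget and the exhaustion estimate above can be checked with a few lines of arithmetic (the figures are from the text; the field layout itself is an implementation detail, not part of the architecture):

```python
# Prototype capability layout: 4 + 30 + 24 + 24 + 2 = 84 bits.
auth_bits, son_bits = 4, 30
tag_disp_bits, value_disp_bits, special_bits = 24, 24, 2
assert auth_bits + son_bits + tag_disp_bits + value_disp_bits + special_bits == 84

max_objects = 2 ** son_bits              # 2^30 distinct SONs
max_object_size = 2 ** value_disp_bits   # each object up to 2^24 units
assert max_objects * max_object_size == 2 ** 54   # total address space

# At 10 object creations per second, how long until SONs run out?
seconds = max_objects / 10
years = seconds / (365 * 24 * 3600)
print(f"about {years:.1f} years")        # roughly 3.4 years
```

The computed figure of roughly 3.4 years agrees with the "about 3.5 years" quoted in the text.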
Given a reference to a capability, the machine obviously must have the ability to translate the SON to a physical storage location. This is done by maintaining an associative map of SONs and physical object locations.

Although capabilities are the only form of interobject addresses, they need not be used for intraobject addressing. As mentioned in the sections on subroutine management and protection domains, a program in SWARD is partitioned into modules, where each module contains its own address space, part of which resides in the module and the other part resides in the current activation record. Instructions do not use capabilities to access their operands in the address space; capabilities are needed only when referring to something outside of an address space. For instance, if one wishes to refer to an array that exists as an independent data-storage object, the instruction would refer to a "relocatable" cell in its address space, which has an associated pointer cell containing a capability to the array object. (This mechanism is similar to the notion of based variables in PL/I and access variables in Ada.)

In addition to the instructions that create objects, the architecture contains an instruction, COMPUTE-CAPABILITY, which, given an addressable operand, returns a capability to it. If one has a capability to an array object, one could use this instruction to create a capability for a particular element. One could also use it to create a capability for a local variable. As mentioned above, although activation records are objects, the machine does not automatically assign each a SON upon creation (to avoid exhausting the supply of unique SONs, since activation records are created more often than any other type of object). An activation record needs a SON only if a program executes a COMPUTE-CAPABILITY instruction to a local variable.
Hence the machine assigns an activation record a SON only if and when the first COMPUTE-CAPABILITY instruction naming a local variable in the activation record is executed.

As might be expected, capability-based addressing also introduces some new problems. Most of the problems have direct analogies in the key-and-locked-box model. The problems, and how they were solved in SWARD, are discussed below.

One problem is the do-not-copy problem, the malicious passing of a capability by a process or subroutine to a third party. This problem was mentioned earlier and is solved by the inclusion of a copy authority in the capability.

A second problem might be termed the global retraction or "new lock" problem. Once capabilities to an object have been distributed, there appears to be no opportunity for second thoughts (that is, withdrawing the capabilities) short of destroying the object and recreating it (with a new unique name). The situation in the physical model is wishing to invalidate all existing keys to a box because (1) ownership of the box is being transferred, (2) one suspects that one or more of the current keyholders are using the contents improperly, or (3) one becomes uneasy about not knowing how the keys have been distributed. The solution in the physical model is changing the lock. In SWARD, the CHANGE-LOGICAL-ADDRESS instruction serves the same purpose. Given a capability (with destroy authority) to an object as an operand, the instruction causes the machine to forget the current SON of the object, reassign it a new SON, and return a capability with this new SON. Any further references with capabilities containing the previous SON will result in program errors.

A third problem with capabilities is the ability to selectively retract them. For instance, one might have given 10 people keys to a door, and then later decide that person D should no longer have a key.
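Before turning to selective retraction, the "new lock" behavior of CHANGE-LOGICAL-ADDRESS described above can be sketched. Again this is a toy model with hypothetical names, not SWARD's implementation: the machine simply drops the old SON from its translation map and rebinds the object under a fresh one, so every outstanding capability with the old SON fails at translation time.

```python
class Machine:
    def __init__(self):
        self.map = {}            # SON -> stored object (translation map)
        self.next_son = 0

    def create(self, value):
        son = self.next_son
        self.next_son += 1
        self.map[son] = value
        return {"son": son, "auth": {"read", "write", "destroy"}}

    def read(self, cap):
        if cap["son"] not in self.map:
            # A stale SON no longer translates: a program error.
            raise RuntimeError("program error: stale capability")
        return self.map[cap["son"]]

    def change_logical_address(self, cap):
        if "destroy" not in cap["auth"]:
            raise PermissionError("destroy authority required")
        value = self.map.pop(cap["son"])   # forget the current SON...
        return self.create(value)          # ...and rebind under a new one

m = Machine()
owner = m.create("ledger")
leaked = dict(owner)                 # a capability we regret handing out
owner = m.change_logical_address(owner)
print(m.read(owner))                 # the new capability still works
try:
    m.read(leaked)                   # the old SON no longer translates
except RuntimeError as e:
    print(e)
```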
Although the physical model breaks down a bit here, a possible solution is indirection; that is, rather than handing out keys to the box itself, one might hand out keys to a second box that contains a key to the first box. One can withdraw the authority of a particular person or class of people by destroying, or changing the lock of, one of the secondary, "key-holding" boxes.

This problem was solved in SWARD by the addition of an indirect capability. An indirect capability is not an additional data type; it is any pointer-cell value that was created by the COMPUTE-INDIRECT-CAPABILITY instruction. The instruction has two operands, both of which must be pointer cells. The logical address of the second operand is computed and stored in the first operand, marking it as an indirect capability.

An indirect capability cannot be distinguished, by a program, from a normal capability. Thus a program is oblivious to whether it is using a normal capability or an indirect capability. An indirect capability physically refers to another pointer cell, but logically refers to where the capability in the latter pointer refers. Any reference through an indirect capability has the same effect as if the direct capability were used. Operations that can be performed on pointer cells (e.g., copying their values into other pointer cells, removing authority) can be performed on pointers holding indirect capabilities.

The indirect capability has several uses. One is security, or a solution to the selective retraction problem. Suppose that A wishes to give process B access to object X, but wishes to retain the ability to withdraw this access at any time. By giving B an indirect capability to a pointer (in A's space) referring to X, A can modify the latter pointer at any time to withdraw B's access.

Another obvious problem in the physical model is that of lost keys, the loss of all keys to a box.
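The selective-retraction use of indirection just described can be sketched as follows. This is an illustrative model only (the Pointer class and deref function are hypothetical, not SWARD's mechanism): B holds a capability that physically names a pointer cell in A's space, every reference follows the chain transparently, and A revokes access by clearing that intermediate cell.

```python
class Pointer:
    """A pointer cell; its value is a direct capability or undefined."""
    def __init__(self, target=None):
        self.target = target

def deref(cap):
    # An indirect capability refers to another pointer cell; any use
    # has the same effect as if the direct capability were used.
    while isinstance(cap, Pointer):
        if cap.target is None:
            raise RuntimeError("access withdrawn")
        cap = cap.target
    return cap

x = {"object": "X"}              # stand-in for the shared object
a_cell = Pointer(target=x)       # pointer in A's space referring to X
b_cap = a_cell                   # B receives an indirect capability

print(deref(b_cap)["object"])    # B uses it like a normal capability
a_cell.target = None             # A withdraws B's access at any time
try:
    deref(b_cap)
except RuntimeError as e:
    print(e)
```

Note that B never learns whether its capability is direct or indirect: both go through the same `deref` path, which is exactly the property the text requires.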
In terms of capabilities, the problem is the loss of all capabilities to an object (e.g., one creates an object and then mistakenly stores into the only pointer to the object). A related problem is the loss of necessary authority to an object (e.g., none of the capabilities to object X have destroy authority, meaning that X will occupy system space forever).

Actually there are at least two ways of viewing the situation. The second way, not the view in SWARD but the view in the Intel 432, is that an object having no remaining references is one that the program wishes destroyed, implying the existence of garbage-collection functions in the machine to recognize and implicitly destroy such objects. In SWARD no solution was found that was not enormously inefficient or that did not violate the security of the architecture. Actually security concerns are not the only impediment. The only thread tying together the machine's knowledge of objects and the programs' understanding of objects is the SON in the capabilities. If the capabilities to an object disappear, the programs and machine have lost their only form of communication.

The architecture contains several considerations to lessen the severity of the problem. First, the instructions that create objects allow one to specify whether the object should be automatically destroyed by the machine when the creating process terminates. This largely handles the problem of having a large number of supposedly temporary objects linger in the system because a process terminated abnormally. Second, programs that create permanent objects are encouraged (but not required) to deposit their capabilities in the operating system's directories. Third, the prototype implementation contains a garbage-collection mechanism to allow service personnel to search for, and recover or destroy, lost objects.

Another operational problem could be called the what-is-this-key problem.
The closest physical analogy is having no information about the properties of a key on a key chain, or perhaps a key discovered on the ground. It represents a set of situations where the machine has useful information that is not available to programs. In SWARD some of these situations are the following:

1. There is no apparent way for a program to view the authority information in a capability (e.g., to test a particular capability to determine if it possesses write authority).

2. Given a capability, there is no apparent way to determine what type of entity it references (e.g., module, entry point in module, cell in data-storage object, port, etc.).

3. The machine possesses certain state information about objects that is unavailable to programs. Examples of needed information are: Is this object designated to be automatically destroyed upon process termination? Is this module active? Are there any sets of cell values enqueued in this port?

The problem was solved by the addition of a DESCRIBE-CAPABILITY instruction. Given a pointer and an array as operands, the instruction returns information in the array describing the capability in the pointer (e.g., the authority it possesses and the class of entity to which it refers) and, if the capability refers to an entire object, state information about the object. Some of the state information is independent of the type of object; other information is not. The instruction does not, however, return any information about the contents of an object (e.g., in the case of a data object, it does not describe anything about its cell types or values).

Advantages of Capability-Based Addressing

Protection. Capability-based addressing is a simple and elegant mechanism for information protection. It eliminates the need for "all or nothing" protection mechanisms such as special instruction states ("supervisor state") and storage-protection keys.
The authority specifications in a capability give one levels of access often absent in conventional systems. Being a conceptually simple mechanism, one can prove the mechanism secure by making and proving the following assertions:

1. A capability gives one access to only the entity referenced by the capability.

2. Capabilities cannot be forged or fabricated, and their values cannot be manipulated.

Another protection advantage is that, if one subverts the protection mechanism in a conventional system (e.g., tricks the operating system into putting one's program into supervisor state), the system becomes totally unprotected. A corresponding subversion in a capability system is getting a capability that one should not have, which compromises only the referenced object, not the entire system.

Sharing. At the same time, by beginning with a storage model of a single set of objects, capability-based addressing eliminates the need for awkward mechanisms for sharing data and programs [31]. All one has to do to make an object available to another program is to give the other program a capability to the object.

Finer Granularity of Protection and Sharing. As was discussed earlier, capability-based addressing allows programs to establish boundaries of protection and sharing in terms that are meaningful to the program, rather than having to work around inflexible fixed boundaries. Program A can compute a capability to variable X and give this capability to program B, giving B access to X and nothing else.

Uniformity. By combining the often-disjoint concepts of protection and addressing, and by eliminating the need for piecemeal solutions such as privileged instruction states, capability-based addressing introduces a high degree of uniformity of concepts. The representation of information as objects allows one to design a system with a uniform concept of object binding (e.g.,
one does not need one concept for binding program modules together, another for binding a program to data, etc.).

Error Detection. When unique names are employed in capabilities, the machine is able to detect a programming error known as the dangling-reference problem. This problem occurs whenever the lifetime of a name or address exceeds the lifetime of the entity referenced. As an example, assume that the following PL/I statement is executed:

P = ADDR(X);

It is possible for X to disappear, yet for P to remain. Instances are where X is a local variable in a procedure and P is a returned parameter, where X logically disappears when the procedure terminates, or X is a dynamically allocated variable which is later freed. In current systems P is simply an address, and if P is used to reference storage after X disappears, one gets an unpredictable result. In a capability system the value of P would be a capability; if X is destroyed and an attempt is made to use P, the address translation would fail and the error would be detected.

SINGLE-LEVEL STORAGE

In current systems the programmer is faced with a nonuniform and discontiguous storage model, where differences in addressing, function, and data-retention characteristics are visible above the machine interface. For instance, a datum in main storage is addressed one way (e.g., a linear one-dimensional value), but a datum on the surface of a disk is addressed in a completely different way (e.g., device number, track number, record number, linear displacement within record). Basic functions (e.g., addition, comparison, move/copy) are not defined in common across the entire storage model, requiring that the programmer explicitly move data into the type of storage media in which the required function is defined.
The programmer must also be aware that information in main storage is normally retained only until process termination, whereas information in secondary storage is normally retained until explicitly deleted.

The implication is that the programmer, very often the application programmer, must be aware of this irregular storage model, and must explicitly move data among storage media (i.e., perform input/output operations) based on what addressing mechanisms, functions, and retention characteristics are desired. Also, because the main-storage model often has a limited storage capacity, the programmer must resort to different media (e.g., files) as a mechanism for passing data among programs. Hence one could accuse this environment of being a contributor to the high cost of programming.

Another implication is that it is an obstacle to programming generality and modularity, as it increases the potential types of interfaces among programs. Today a program (or part of a program) can receive input in at least two ways: by receiving arguments (i.e., structured or unstructured data in the main-storage model) or by performing input operations (i.e., reading data from another storage model, such as a file). Because the large differences between the two models are visible to the programmer, programs rarely accept both alternatives. If program (or module) A was written to receive its input as arguments, one cannot conveniently use A if (1) a large amount of input data must be processed (i.e., the amount exceeds the main-storage capacity), or (2) one wishes to execute A independent of other programs (i.e., the data to be processed is not "contained" within another program, but has a life of its own). Likewise, if program B was written to receive its input as a file, one cannot conveniently use it in such situations as (1) the data to be processed currently reside in another program,
and (2) the data to be processed reside in a file, but with different storage characteristics (e.g., tape versus disk) than those assumed by B.

A solution to these problems, the single-level store, is a simple concept (simple in terms of architecture, although not necessarily so in terms of implementation). One unifies the storage model such that, at the machine interface, all forms of physical storage are represented by the same addressing mechanism, function set, and data-retention characteristics. For instance, Figure 4.9 in the previous section, rather than being only a main-storage model, might represent the single-level store. Entities normally considered to be files now become objects within the single-level store and are addressed in the same manner as all other objects. If, for reasons of cost and speed, the system contains a storage hierarchy of media of various speeds and costs, the movement of data within the hierarchy is now the responsibility of the machine implementation, not the programmer.

The concept is much like that of virtual storage, but with two substantial differences. First, virtual storage, as the term is normally used, does nothing to unify the storage model; its usual purpose is simply to extend the size of the main-storage model. One can view the single-level store as an extension of the virtual-storage concept to encompass all storage media. Since a linear storage model is somewhat awkward when doing this, a more desirable model is one of a set of objects with capability-based addressing (Figure 4.9).

A second difference between single-level storage and virtual storage is retention. In a virtual-storage system, the storage environment usually disappears when the program terminates execution, or when the system is shut down. Such is not the case in a single-level store. Hence if a program has some data in an array object that it wishes to retain until tomorrow,
it need not create a temporary file to hold the data; it simply retains the array in the store. Note that this could also apply to program objects, depending on how the architect defines "program load" in such an environment. For instance, static or own variables might retain their values between executions of the program. This provides one with a simple way of remembering data from execution to execution, but also requires one to reconsider the definition of certain programming-language concepts.

Advantages of the Single-Level Store

Reduced Software Costs. A significant percentage of the cost of developing a program is consumed by complexities of input/output, or each programmer's management of the system's storage hierarchy. Although the single-level store does not necessarily remove all concepts of input/output from programs (e.g., input/output with respect to CRT terminals, printers, analog-digital converters), it does allow one, if the notion is carried upward to the programming language, to eliminate a large fraction of input/output complexity from programs and languages. For instance, rather than a program performing explicit read, write, get, and put operations on files, the program simply views the file as an array or vector and operates on it as such. Also, as mentioned above, the single-level store enhances program modularity by providing a uniform environment in which data are transmitted.

One may feel that modern data-base management packages in current systems achieve these advantages, but this is only partially the case. First, the argument has merit only when and where such packages exist (large-scale systems); they are usually absent in smaller scale systems (e.g., minicomputers, microprocessors). Second, they do not completely shield the application programmer from the concept of input/output: the programmer is still faced with an interface that copies data to and from the data base.
Third, data-base systems do not encompass all types of files in use today. They are oriented toward large, highly structured collections of data, where the structure of the data is relatively static, and are rather inefficient for small, dynamic, and/or unstructured collections of data.

Technology Independence. An obvious characteristic of the single-level store is that the attributes of whatever devices are used to implement the storage hierarchy are hidden beneath the machine interface. One implication is that programs become more portable (among systems) and less dependent on a particular system configuration. Another implication is that the machine designer is free to select (and change) the underlying storage hierarchy.

The latter seems particularly important in today's environment of rapidly changing memory technologies. Although, in general, the costs of all types of storage have been dropping and speeds have been increasing, the cost and speed ratios have not been constant, and memories with new characteristics are continually being introduced. Today the designer is faced with (1) a decrease in the cost differential between magnetic media (e.g., disks) and semiconductor storage, (2) magnetic-bubble shift registers, having many of the properties of rotating disks, but with additional properties, such as being able to stop and start data movement with respect to the read/write mechanism, (3) extremely fast, low-power, and low-cost semiconductor RAMs, and (4) nonvolatile semiconductor RAMs. The future is bound to bring further changes. As a result, an architectural concept that allows one, as technology changes, to change the number of levels and types of media in a storage hierarchy without affecting programs is attractive.

Uniformity. In addition to unifying system concepts by giving all storage the same appearance, the single-level store applies existing architecture concepts,
in particular those normally thought of in the context of main storage, throughout the system. For instance, if the architecture contains the notion of tagged storage, the notion is carried throughout the storage hierarchy (i.e., tagged storage now applies to what would have been considered as files).

Global Optimization. In a conventional system, each program that is performing input/output operations is managing, from its own point of view, the system's storage hierarchy. Furthermore, it is likely that the processor is independently managing a cache, and the operating system is independently managing paging operations. Chances are that these local and independent operations are far from optimal from the point of view of overall system efficiency. Actions such as double buffering, which may be perfectly good local optimizations, are likely to be far from optimal in a global sense. Hence hiding the storage hierarchy from programs and managing it by one mechanism lends to the possibility of managing it in a more optimal way. As before, given that the mechanism is hidden beneath the machine interface, the system designer has more degrees of freedom of implementation (e.g., parallelism).

Problems Created by the Single-Level Store

The single-level store also presents the system designer with some new problems. Time will determine whether these problems can be solved, or whether they are inherent disadvantages.

Implementation of the Storage Hierarchy. The single-level store implies a storage hierarchy containing all data and programs in the system and probably consisting of many levels of storage of differing cost/speed characteristics. Currently there exists little experience in storage hierarchies, other than two- or three-level hierarchies for program storage (cache/RAM/paging disk) [36].
Considerable research is needed into the number of levels, strategies for data movement, the quantum of data movement (e.g., whole objects, parts of objects, fixed-size pages), "write-through" strategies (i.e., whether write operations should be reflected back into the permanent location of the object), media volatility, and considerations in a multiple-processor or distributed-processor environment.

Porosity of the Storage Model. Although hiding all characteristics of the storage hierarchy (even hiding the existence of a hierarchy) is desirable for reasons described earlier, it is questionable whether this is practical in all environments, or whether certain programs need mechanisms to make local optimizations. Mechanisms that come to mind are:

1. A way for a program to suggest/recommend/mandate that a set of objects should be moved through the hierarchy together because of their usage patterns.

2. A way for a program to suggest/recommend/mandate that a particular object be held, for a designated time period, in the highest level in the hierarchy.

3. Ways for a program to suggest/recommend/mandate the movement of objects through the hierarchy. Examples are: "I am about ready to start using this object" and "I am done for now using this object."

The challenges are (1) determining which mechanisms are sufficient and (2) determining how to represent them in the architecture in a way that is as implementation independent as possible.

Recovery. Given that memory failures (e.g., head crashes, oxide deterioration, alpha-particle radiation) seem likely to persist, the hiding of the storage media and hierarchy by the single-level store would seem to complicate the recovery process. The primary reason is that recovery in today's systems seems to require human intelligence, that is, understanding the mapping of information to physical locations (e.g., knowing what files were stored on what disk tracks) and understanding the information itself. Possible countermeasures include making each unit of information (e.g., segment,
page) self-identifying (so that if the system's directories of information within the hierarchy are damaged, they can be reconstructed), and providing a human service interface to the storage hierarchy.

Object Portability. The single-level store implies a closed system, or one where objects are not transported among systems. Such issues as how one takes an object from system A and transfers it to system B's single-level store need consideration. In today's world one can cause the object to be placed on a specific disk pack or diskette. In a single-level store one does not know where the object is, and it is possible that pieces of the object will be distributed among many different storage devices. If capability-based addressing is used, it also raises some portability considerations. If an object contains capabilities, the capabilities are meaningless if the object is transported to another system.

Source/Sink Input/Output. Although the single-level store model may eliminate traditional concepts of file input/output, the model is not suited to (1) input/output to memoryless devices and (2) input/output to devices or media for which random or direct access is not appropriate. Memoryless, or source/sink, input/output devices are such units as CRT screens, keyboards, card readers, printers, communication lines, and sensors. Although it is possible to represent them in the storage model (the concept of memory-mapped input/output in certain microprocessor systems), such representations are not straightforward and can be a source of ambiguities. Likewise, although a magnetic tape is a storage device, representing tapes as part of the single-level store is not appropriate, since they are sequential-access devices. Hence the single-level store does not necessarily eliminate all concepts of input/output from the architecture.

The Single-Level Store in the IBM System/38
Since the IBM System/38 [60] appears to be the first commercial system to contain the single-level store concept as discussed above, it is used here as an illustration. Since the system has an unorthodox structure, the discussion must start with an overview of its structure.

Current implementations of the System/38 have two levels of architecture, as shown in Figure 4.11. The outer level, called the MI architecture, is the only level visible to the programmer, and it is the level to which programs are compiled. However, it is not an interpretive level. When programs are loaded, they are translated by internal software to a lower-level interpretive architecture called IMPL, beneath which sits a microprogrammed machine. However, the IMPL architecture cannot be seen directly by the programmer; thus the manufacturer is free to reposition the IMPL level in future implementations. Although the MI interface is not interpretive, the intervening software gives the programmer the impression that it is (e.g., program errors and state are reported in terms of the MI interface).

Note that this unusual structure complicates the definition of the computer architecture of the system. Strictly speaking, the IMPL interface is the computer architecture and the MI interface is a software architecture. However, because the MI interface is the only level defined to the user, because it is the target of the compilers, and because the system gives one the appearance of executing programs at this level, it could also be called the computer architecture of the System/38.

The MI interface presents a storage model similar to that of Figure 4.9. It is an object-oriented model using capabilities for addressing and presenting a single-level store. Special considerations in the MI interface for the single-level store are:

1. One type of object, the access group, is a "super object" containing one or more other types of objects.
Its motivation is control of movement through the storage hierarchy. If desired, one would place, in an access group, objects that will be referenced at the same time (e.g., all objects comprising a particular program). The effect is that when one object in an access group is referenced, the other objects in the access group will tend to be moved into higher-speed storage at the same time.

2. An explicit instruction, SET-ACCESS-STATE, exists to allow a program to request that an object be moved into main storage, or from main storage. The transfer of the object can be specified to be synchronous (execution of instructions in the process does not continue until the request is satisfied) or asynchronous.

3. The ENSURE-OBJECT instruction allows a program to force the system to preserve a current copy of an object in a less volatile part of the storage hierarchy. Certain changes to certain types of objects cause the system to perform an "automatic ensure." Also, several instructions contain an "ensure" option, which directs the system to update the backup copy of the object in the hierarchy.

4. When an object is created, the creating instruction can specify "class" information about the object. One can designate the amount of the object to be transferred into main storage when a reference is made (currently a 512- or 4096-byte section).

5. The MI interface contains instructions for the dumping of objects onto removable media (diskettes and magnetic tapes) that are not part of the single-level store, and for loading objects into the single-level store from these media.

6. Considerable software exists beneath the MI interface to perform object recovery after an abnormal system termination. When the system is restarted, it builds a recovery list, indicating the objects that were, or might have been, damaged. If an object is determined to be damaged, most operations on the object are prohibited to preserve system integrity. The MI interface also contains a RECLAIM instruction, which will free any space that is not associated with a valid object and return a list of "lost" objects (dangling objects that appear to be owned by no one).

7. The MI interface contains a separate set of functions for source/sink input/output.

Figure 4.11 Layered structure of the System/38.

At the IMPL interface, the single-level store is implemented using virtual-storage and paging concepts. The IMPL machine has a conventional von Neumann architecture, although with some unconventional characteristics. One is a large addressing range. Its address size is 48 bits, giving one a linear virtual memory of 2^48, or 281-trillion, bytes. The virtual storage is large enough to encompass all storage within the system. Hence the single-level store is managed in a single, 2^48-byte virtual storage, using a two-level storage hierarchy (RAM and disk storage). Data movement occurs in fixed-size 512-byte pages. The 48-bit virtual address is viewed by the machine as being a 39-bit page address and a 9-bit offset within the page. The address-translation hardware converts the first 39 bits to an address of a 512-byte page frame in main storage (or generates an interrupt if the page is not in main storage) and appends the 9-bit offset, thus forming a main-storage address. Because the address is so large, conventional linear page tables are not used; hash tables are used instead.

The movement of data through the two-level hierarchy is handled by software executing upon the IMPL architecture in much the same way that paging operations are performed in other systems. Objects are allocated in segments in the 2^48-byte virtual address space. The address space is managed using both 64K-byte segments (for small objects) and 16M-byte segments (for large objects). The software maintains a number of directories for associating virtual addresses with disk locations.
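As a minimal sketch, the hashed address translation just described might look like the following in C. The structure names, field widths, and bucket count here are illustrative assumptions, not the actual IMPL data structures; only the 39-bit page number / 9-bit offset split and the hash-table lookup mirror the text.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of System/38-style hashed address translation: a 48-bit
 * virtual address splits into a 39-bit page number and a 9-bit byte
 * offset (512-byte pages); the page number is looked up in a hash
 * table rather than a linear page table. */

#define PAGE_BITS    9                      /* 512-byte pages           */
#define PAGE_SIZE    (1u << PAGE_BITS)
#define HASH_BUCKETS 4096                   /* assumed table size       */

struct pte {                                /* one hash-chain entry     */
    uint64_t    page;                       /* 39-bit page number       */
    uint32_t    frame;                      /* main-storage page frame  */
    struct pte *next;                       /* collision chain          */
};

static struct pte *hash_table[HASH_BUCKETS];

/* Translate a 48-bit virtual address to a main-storage address.
 * Returns -1 to stand in for the "page not in main storage" interrupt,
 * after which software would move the page in from disk. */
int64_t translate(uint64_t vaddr)
{
    uint64_t page   = vaddr >> PAGE_BITS;        /* high 39 bits */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);   /* low 9 bits   */

    for (struct pte *p = hash_table[page % HASH_BUCKETS]; p != NULL; p = p->next)
        if (p->page == page)                     /* walk the collision chain */
            return ((int64_t)p->frame << PAGE_BITS) | offset;

    return -1;                                   /* page fault */
}
```

The design point the text makes falls out directly: a linear table would need 2^39 entries, so the table is instead sized roughly to the resident page frames, with hash collisions resolved by chaining.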
To assist the recovery process (e.g., in the case of damaged directories), all pages on disk storage self-identify themselves by containing their virtual address as a prefix.

At the MI interface, capabilities have a size of 16 bytes. Six of these bytes are used to hold the virtual address of the entity referenced by the capability.

The programming languages for the System/38 are RPG and Cobol, and no changes were made to the languages as a result of the single-level store. As a result, the concept is not visible to application programmers: an application program performs input/output operations as it would do on other systems. Thus the advantages of the single-level store are not available to users of the system, although they proved to be of significant benefit to IBM in the development of its system software.

As mentioned in Chapter 2, one can observe a large semantic gap between modern operating systems and the underlying machine architectures. Operating systems have evolved into large, extremely complex programs. The design and development of the operating system is often the bottleneck in the development of new systems, and the operating system is often the speed bottleneck in current systems.

Given this, a partial solution is reducing the semantic gap between the operating system and machine by providing direct support in the machine for traditional operating-system functions. As with other situations, one must consider where to draw the line. One possibility is first to categorize operating-system functions into those that implement policies and those that implement mechanisms. Typical policy-related functions are cost accounting, job queuing, and user identification. Typical mechanism-related functions are process management, information protection, and storage management. Since policies tend to vary in different environments, often need to be modified by the system user,
and are built atop mechanisms, a reasonable approach is to provide machine support for the low-level mechanisms and avoid casting policies into hardware.

The low-level operating-system mechanisms that seem to be candidates for architecture support are:

1. Process management
-Switching of the processor(s) among processes.
-Creation and destruction of processes.
-Synchronization of processes.
-Communication between processes.
2. Memory management
-Allocation of storage space.
-Implicit movement of information across a storage hierarchy (e.g., paging).
-Explicit movement of information across a storage hierarchy (e.g., file input/output).
3. Protection
-Protection of programs and data from one another.
-Means for transferring protection rights.

Since two concepts discussed earlier, subroutine management and the single-level store, largely provide item 2, and capability-based addressing provides support for item 3, this section will focus on item 1, process management.

A process (or task) is the sequential execution of a stream of instructions (i.e., without concurrency). Most modern operating systems provide an environment where multiple processes are executed concurrently. (If there is only one hardware processor, "concurrently" is an abstraction created by sharing or time-slicing the processor among the processes.) The major process-management functions in operating systems are discussed below.

Process Switching. Since the number of active processes is rarely equal to, and normally exceeds, the number of hardware processors, a mechanism is needed to share the processors. One normally provides a means to switch execution to another process when (1) the current process reaches a point where execution must be suspended (i.e., some type of waiting state), (2) a suspended process of higher priority has left its waiting state, or (3) the current process has been executing for a long period of time (e.g.,
to guarantee some visible rate of progress for all processes). This mechanism requires that a decision be made as to which is the best "other process" to execute. These decisions are often based on strategies of priority scheduling (giving each process a priority, and favoring the ones of higher priority) or deadline scheduling (giving each process a deadline as to when execution must be finished, then scheduling processes with these deadlines in mind). However, since these strategies are policies and are often modified at individual system installations, one should avoid incorporating a specific strategy in the machine architecture, although the mechanisms should be present to allow one to embody these strategies in software.

Process Creation. Few system environments are such that a fixed set of processes is adequate. Rather, processes normally need a mechanism to create and destroy other processes, and to transfer data and/or addressability when creating a process.

Process Control. Mechanisms are usually needed to allow one process to control the progress of another. Typical mechanisms are SUSPEND (stop execution of, but not destroy, another process) and RESUME (allow execution of a suspended process to continue). Mechanisms such as these also prove useful in implementing process-scheduling strategies in software.

Synchronization and Communication. These are treated together here because the same mechanism sometimes suffices for both. Process synchronization mechanisms are needed when multiple processes wish to use a resource simultaneously, and the resource is such that it should be used serially. Process communication mechanisms are needed to transfer
