INTRODUCTION

This paper describes the architecture of the CPU and Memory for the Central Air Data Computer (CADC) System used in the Grumman/Navy F-14A carrier-based fighter aircraft. The CADC performs specialized computational functions in response to input stimuli such as pressure sensors, temperature sensors and closed-loop feedback inputs. Outputs from the CADC system are used to drive pilot visual displays (such as altimeter, temperature indicator, Mach number indicator, etc.) and to provide control inputs for other aircraft systems. The outputs from the CADC are in the form of digital and analog signals. Figure 1 illustrates a block diagram for the CADC.

Being in a flight environment meant that certain constraints greatly affected the architecture of the CPU and Memory. These constraints were size, power, real-time computing capability and cost, not necessarily in that order. Other constraints such as temperature, acceleration and mechanical shock affected the overall design of the CADC.

The size of the CPU-Memory was limited to a maximum of 40 square inches. This included the arithmetic section, read-only memory, and read/write memory. Since the unit was to be packaged on a printed circuit card, the number of layers of the p.c. card was an important consideration.

The power consumption had a limit of 10 watts at an ambient temperature of 25°C. This was principally a function of the capability of the p.c. card to withstand the heat.

The required computing capacity for the CPU was not defined at the beginning. This meant that the system had to be somewhat flexible to changes in computational load. Of course, limits had to be set to be able to work within the other constraints. What was known about the computation was the form of the equations to be implemented. This included polynomial evaluations, data limiting, data comparison and discrete or flag inputs and outputs.
This meant that the arithmetic and logical functions of the system had to handle at least the following operations:

Multiply
Divide
Add
Subtract
Limits
Square Root
And
Or
Conditional Transfer
Unconditional Transfer
Receive Discrete Data
Receive Digital Data
Output Discrete Data
Output Digital Data

The last constraint, cost, was certainly important since the system would eventually go into high volume production.

FUNCTIONS TO BE IMPLEMENTED

Before we proceed, a better understanding of the functions to be implemented is necessary. The function that most often occurred was the polynomial,

$$P(x) = a_6 x^6 + a_5 x^5 + a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$$

where x was the input, either from outside the CPU (sensors) or from its own memory. In order to save arithmetic time the polynomial was implemented in its "nested" form as follows:

$$P(x) = (((((a_6 x + a_5) x + a_4) x + a_3) x + a_2) x + a_1) x + a_0$$

The data limit function was one that would accept three binary inputs: an upper limit (U), a lower limit (L) and a parameter (P). The output would then be as follows:

$$\text{output} = \begin{cases} P & \text{if } L \le P \le U \\ U & \text{if } P > U \\ L & \text{if } P < L \end{cases}$$

Even though such a function could be programmed in software, it was decided to build it in hardware since it was used often enough.

[Figure 1. Block diagram of the CADC system.]

Data conditioning and scaling also had to be accomplished. This involved the following simple expressions: [expressions illegible in the source]. Again, the occurrence of these was frequent enough to warrant hardware consideration. This will become apparent later when the hardware is discussed.

Since size and power consumption were of the ultimate importance, MOS technology was chosen as the means of circuit implementation.
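The nested polynomial and the data limit function described in this section can be sketched in a few lines of Python. The nested form is commonly known as Horner's rule. This is a floating-point illustration only: the function names `nested_poly` and `limit` are assumptions made for the example, and the CADC itself carried out these operations in fixed-point hardware under microprogram control.

```python
def nested_poly(coeffs, x):
    """Evaluate a polynomial in nested (Horner) form.

    coeffs is ordered from the highest power down to the constant term,
    for example [a6, a5, a4, a3, a2, a1, a0] for a sixth-degree polynomial.
    """
    result = coeffs[0]
    for a in coeffs[1:]:
        # One multiply and one add per remaining coefficient.
        result = result * x + a
    return result


def limit(p, lower, upper):
    """Return p if it lies within [lower, upper], else the violated limit."""
    if p > upper:
        return upper
    if p < lower:
        return lower
    return p
```

For example, `nested_poly([1, 2, 3], 2)` computes (1·2 + 2)·2 + 3 = 11. A sixth-degree polynomial needs six multiplications and six additions in this form, with no separate computation of the powers of x, which is the saving in arithmetic time the text refers to.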
MOS allowed greater packaging densities to be obtained than would otherwise be possible. The slowness of MOS devices and the high thresholds used allowed a design that was virtually immune to electrical noise on the ground or transmitted from packages within close proximity. The higher supply voltages required resulted in a more efficient power supply design.

NUMBER SYSTEM

The CPU is a fractional fixed point machine with the most significant bit a sign bit and the other bits representing data. Negative numbers are represented in two's complement notation. Two's complement was chosen to avoid the ambiguity of double zeros (a positive and a negative zero). The word length chosen for the system was 20 bits: 19 bits of data and 1 bit for sign. This length was chosen after a thorough analysis of the accuracy required for certain throughput calculations such as the rate of change of altitude function.

Early in the architecture study it was realized that package size and quantity should be kept to a minimum if we were to meet the size constraints established. With minimum packaging space requirements it was necessary to use packages with the fewest possible leads. This would minimize the complex p.c. card interconnect which was inevitable. Because of this the processor was designed to transfer data serially throughout the entire system.

PROCESSOR PARALLELISM

As is known by all computer designers, serial machines are usually not the best way to go if computational speed is needed. To get around this it was decided to have several arithmetic or processing units working at the same time. This resulted in a technique known as "pipeline processing" or "pipeline concurrency." As defined by Bell and Newell [1], "pipeline concurrency is the name given to a system of multiple functional units, each of which is responsible for partial interpretation and execution of the instruction stream."
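The 20-bit fractional format described under NUMBER SYSTEM can be made concrete with a short Python sketch. It assumes the usual reading of such a format, which the text does not spell out: the 19 data bits are all fraction bits, so a word holds values from -1 up to just under +1 in steps of 2^-19. The helper names `to_word` and `from_word` are hypothetical.

```python
WORD_BITS = 20                    # 1 sign bit + 19 data bits (from the text)
FRAC_BITS = WORD_BITS - 1
WORD_MASK = (1 << WORD_BITS) - 1
SCALE = 1 << FRAC_BITS            # assumed scaling: one LSB equals 2**-19


def to_word(value):
    """Encode a fraction in [-1, 1) as a 20-bit two's complement word."""
    if not -1.0 <= value < 1.0:
        raise ValueError("value must be in [-1, 1)")
    raw = round(value * SCALE)
    raw = min(raw, SCALE - 1)     # rounding must not reach +1.0
    return raw & WORD_MASK


def from_word(word):
    """Decode a 20-bit two's complement word back to a Python float."""
    word &= WORD_MASK
    if word & SCALE:              # sign bit set, so the value is negative
        word -= 1 << WORD_BITS
    return word / SCALE
```

Under these assumptions zero has exactly one encoding (the all-zero word), which is the property cited for choosing two's complement, and the largest positive word is `0x7FFFF`, that is, a zero sign bit followed by 19 ones.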
This system uses multiple functional units, each dedicated to a specific task. These functional units are:

1. Parallel Multiplier Unit (PMU)
2. Parallel Divider Unit (PDU)
3. Special Logic Function (SLF)
4. Data Steering Unit (DSU)
5. Random Access Storage (Read/Write) (RAS)
6. Read-Only Memory Unit (ROM)

Figure 2 shows a block diagram of the functional units as they would work together in a typical system. Each unit was designed to operate as a separate entity and could be used without the need of any of the other units. This was done to provide maximum expandability with minimum additional hardware.

Each functional unit is controlled by its own microinstruction ROM. The microinstructions are also transferred serially to minimize package pin count. Temporary data storage is provided in the form of read/write memory.

1. C. Gordon Bell and Allen Newell, Computer Structures: Readings and Examples, McGraw-Hill Book Co., N.Y., 1971.

[Figure 2. Block diagram of the functional units in a typical system, showing data paths and control paths.]

Before looking at the functions of each of these units, a brief look at the timing is needed. Figure 3 illustrates this timing. The CPU-Memory clock is 375 kHz. One complete clock period, defined as a bit time, is 2.66 µsec. Every 20 consecutive bit times are defined as a word, so a word time is about 53.3 µsec. The first bit time of a word is called T0, and the last bit time of the word is called T19. Two types of words are used in the system, W_A and W_O. In W_A, the arithmetical algorithms operate and instruction words are shifted serially into each functional unit.
In W_O, computational inputs and outputs are shifted serially among the units. A word mark, used to distinguish word times, is a signal coincident with T18 of every word time. Two consecutive word times, W_A and W_O, are called an operation (op) time. To distinguish the final operation time, a frame mark is generated in the system executive control. The time between frame marks is called a frame. A frame includes one complete cycle of computations. The frame mark is microprogrammed to allow the user to restart the computational cycle when all previous computations are complete. Since this system must operate in real time, it was necessary to obtain the most computation from each functional unit during each frame time.

[Figure 3. System timing: bit times T0 through T19, word times W_A and W_O, instruction transfer, and the microprogrammed frame mark.]

ARITHMETIC UNITS

The Parallel Multiplier Unit accepts two serial inputs, multiplicand and multiplier, in one word time (W_O) and produces their properly rounded product by means of a parallel algorithm in one more word time. The product is shifted out in the next W_O, while inputs for the next operation are simultaneously shifted in. The multiplication operation is achieved using Booth's algorithm [2]. The PMU does not need an instruction word to operate, but is capable of operating continuously in this manner.

2. Yaohan Chu, Digital Computer Design Fundamentals, McGraw-Hill Book Co., N.Y., 1962, pg. 326.

The Parallel Divider Unit accepts two serial inputs, dividend and divisor, in one word time (W_O) and produces the proper quotient by means of a parallel algorithm in one more word time (W_A). The quotient is shifted out in the next W_O, while inputs for the next operation are simultaneously shifted in. The division operation is achieved by using a non-restoring division algorithm [3]. An actual photograph of the PDU chip is shown in Figure 4.
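The text names Booth's algorithm for the Parallel Multiplier Unit but gives no further detail, so the following Python sketch shows only the arithmetic idea and not the chip's parallel logic. It uses radix-2 Booth recoding on signed integers that stand for 20-bit fractional words. The round-half-up step that brings the product back to 19 fraction bits is an assumption (the text says only "properly rounded"), as are the function names.

```python
WORD_BITS = 20
FRAC_BITS = 19


def booth_multiply(multiplicand, multiplier, bits=WORD_BITS):
    """Exact product of two signed integers that each fit in `bits` bits."""
    product = 0
    prev_bit = 0                          # implicit bit to the right of the LSB
    for i in range(bits):
        bit = (multiplier >> i) & 1       # two's complement bit i of the multiplier
        digit = prev_bit - bit            # Booth digit: -1, 0 or +1
        product += digit * (multiplicand << i)
        prev_bit = bit
    return product


def fractional_multiply(a, b):
    """Multiply two values with 19 fraction bits; round back to 19 bits."""
    full = booth_multiply(a, b)           # full product has 38 fraction bits
    # Assumed rounding: add half an LSB, then drop the low 19 bits.
    return (full + (1 << (FRAC_BITS - 1))) >> FRAC_BITS
```

Because each Booth digit is derived from a pair of adjacent multiplier bits, negative multipliers need no separate sign correction, which suits a two's complement machine. The sketch does not handle the one overflow case of such a format, (-1) × (-1), whose result +1 is not representable in 20 bits.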
3. Yaohan Chu, Digital Computer Design Fundamentals, McGraw-Hill Book Co., N.Y., 1962, pg. 39-

The Special Logic Function performs logical operations and generates specific data and logic outputs. The unit accepts an instruction word which specifies details of the operation. The fundamental logical operation of this unit is the limit function. It consists of three registers (U, P, and L) whose inputs arrive in W_O. One of these registers is picked at the end of W_O by associated comparison logic. Other logic functions such as ANDs, ORs, Gray code (special), and conditional and unconditional data branching are also included in this logic.

The Data Steering Unit operates as a three-channel serial digital data multiplexer. Information is shifted serially through the device during W_O. A 15-bit instruction word is accepted during W_A that specifies which input or input combination (add or subtract) is to be "steered" to each of three data outputs. The instruction word for this unit is the last 15 bits of the 20-bit instruction word. From the least significant end, the first four bits specify the selection for Output 1, the next four bits specify the selection for Output 2, and the last seven bits specify the selection for Output 3. Addition or subtraction is performed by specifying that output combination to be "steered" to the output. By performing additions and subtractions in this manner the programmer can obtain the sum and difference during the same word time that the data is being transferred. This transfer may be either to or from the memory or arithmetic units. Specific instruction codes are interpreted as follows:

[Table: Data Steering Unit instruction codes. Instruction bits 6-9 select the source steered to Output 1, bits 10-13 select the source steered to Output 2, and the remaining bits select the source steered to Output 3. The listed selections include external (EXT) inputs, sums and differences of external inputs, and the constant 0.1111111111111111111 (maximum positive number). The individual code assignments are not legible in the source.]
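The field layout of the 15-bit Data Steering Unit instruction described above (four bits for Output 1, four for Output 2 and seven for Output 3, counted from the least significant end) can be sketched as a simple bit-field split in Python. The function name and the dictionary keys are hypothetical, and the sketch stops at extracting the raw selection codes; it makes no claim about which source each code selects.

```python
def split_dsu_instruction(field):
    """Split a 15-bit DSU instruction field into its three selection codes."""
    if not 0 <= field < (1 << 15):
        raise ValueError("DSU instruction field must fit in 15 bits")
    return {
        "output1": field & 0xF,           # first 4 bits from the least significant end
        "output2": (field >> 4) & 0xF,    # next 4 bits
        "output3": (field >> 8) & 0x7F,   # last 7 bits
    }
```

For instance, `split_dsu_instruction(0b1010101_0011_0101)` yields the codes 5, 3 and 85 for Outputs 1, 2 and 3 respectively. Packing all three selections into one instruction is what lets a single word time route data to three destinations at once.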
