Forty Five Years of Computer Architecture—All That’s Old is New Again
Harvey G. Cragon The University of Texas at Austin cragon@uts.cc.utexas.edu
Abstract

Reviewing the last 45 years of computer design, I find that most of the performance-improvement advances in computer microarchitecture have been based on the exploitation of only two ideas: locality and pipelining. These two concepts will be discussed, thereby illustrating their primacy in performance enhancement.

1. Introduction

The implementation of the Institute for Advanced Study (IAS) architecture primarily followed the serial execution model; an instruction is completed before another instruction begins [1]. The exception was that two instructions were packed into one memory word and were fetched from memory at the same time; a performance benefit possible only because of instruction spatial locality.

In the 1950s and early 1960s, gains in processor performance depended primarily on increases in raw circuit and memory speed (such as transistors and random access core memory). Since that time, there are two implementation ideas that have dominated techniques for increasing the performance of computers: locality and pipelining.

2. Locality

During the 1960s, locality was first employed to improve both the performance and usefulness of computers. In addition, locality provides a powerful tool for overcoming some of the performance problems resulting from pipelining, the other primary concept.

The ideal memory (infinite size, zero latency, and zero cost) can be approximated because of temporal and spatial locality. Caches [2] and virtual memory [3], along with translation lookaside buffers (TLBs), would not be design options without temporal and spatial locality. If all instruction and data references were uniformly distributed over the address space, we would be dependent on raw memory speed alone for performance.

Even today we see a new application of caching applied to internet service to improve its performance. Frequently referenced pages are cached, sometimes locally; an example of temporal locality.

Spatial locality plays a key role in interleaved memory, DRAM architectures, and disk arrays to overcome their basic long latencies. In addition, superscalar implementations depend on the spatial relationship between instructions in order to issue more than one instruction in a clock.

3. Pipelining

Early computer designers adopted ad-hoc concurrency implementations such as fetching the next instruction while the current instruction is still being executed. These ad-hoc steps yielded incremental improvements in performance along with excessively complicated designs. Generalized pipelining eliminated the ad-hoc overlap procedure [4]. Pipelining was first applied to high performance supercomputers and is today the design paradigm for most microarchitecture.

The benefits of pipelining are: a faster clock and a theoretical execution rate of one instruction per clock, for degree one superscalar, while serial execution model processors have an execution rate of less than 0.1 instructions per clock. However, the high theoretical performance rate of a pipeline can be greatly diminished because of structural hazards, data hazards and branch stalls.

Solving structural hazard problems generally requires replication of the resource, leading to possible coherency problems. Split caches are an example of a replicated resource that themselves depend upon locality to function.

Data hazards introduce stalls that can be mitigated by out of order instruction issue. Calling upon locality, designers implement reorder buffers to ensure in order updating of the processor state.

Branch stalls reduce pipeline performance, leading to the need for branch prediction strategies and hardware. Many prediction strategies depend on temporal locality employed with branch target buffers and branch target caches to benefit from the temporal locality of branch direction information.

4. Conclusion

Can you imagine what the microarchitecture of a modern processor would look like without locality and pipelining? Would we have the performance of today based only upon advances in circuit and memory technology? I think not. Upon these two concepts hang most of the advances in microarchitecture of the past 45 years.
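As a rough illustration of the two ideas, the following Python sketch compares the hit rate of a small cache for a looping reference stream against a uniformly distributed one, and compares clock counts for serial and pipelined execution. Every name and number in it is an assumption made for illustration and is not taken from any machine discussed here: a fully associative LRU cache of 64 lines of 16 bytes, a 512-byte working set, a 16 MB address space, 10 clocks per instruction in the serial model, a 10-stage pipeline, and 200 stall clocks standing in for hazards and branch stalls.

```python
import random
from collections import OrderedDict


def hit_rate(addresses, num_lines=64, line_size=16):
    """Fraction of references that hit in a small fully associative LRU cache.

    The cache geometry is hypothetical and chosen only for illustration.
    """
    cache = OrderedDict()  # line number -> True, least recently used first
    hits = 0
    for addr in addresses:
        line = addr // line_size  # neighboring addresses share a line
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / len(addresses)


def serial_cycles(n_instructions, clocks_per_instruction):
    """Serial execution model: each instruction completes before the next begins."""
    return n_instructions * clocks_per_instruction


def pipeline_cycles(n_instructions, stages, stall_cycles=0):
    """Idealized single-issue pipeline: fill time, one clock per instruction, plus stalls."""
    return stages + (n_instructions - 1) + stall_cycles


rng = random.Random(0)  # fixed seed so the run is repeatable
n_refs = 100_000
local_refs = [i % 512 for i in range(n_refs)]                   # loop over a small working set
uniform_refs = [rng.randrange(1 << 24) for _ in range(n_refs)]  # no locality at all

print(f"hit rate with locality:    {hit_rate(local_refs):.4f}")
print(f"hit rate, uniform refs:    {hit_rate(uniform_refs):.4f}")

n = 1000
print(f"serial IPC:                {n / serial_cycles(n, 10):.3f}")
print(f"ideal pipelined IPC:       {n / pipeline_cycles(n, 10):.3f}")
print(f"pipelined IPC with stalls: {n / pipeline_cycles(n, 10, stall_cycles=200):.3f}")
```

Under these assumptions the looping stream misses only on the first touch of each line, while the uniform stream almost never hits, so its performance rests on raw memory speed. The pipelined clock count approaches one instruction per clock as the instruction count grows, and the stall term shows how hazards erode that rate.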
[1] A.W. Burks, H.H. Goldstine, J. von Neumann, Preliminary Discussions of the Logical Design of an Electronic Computing Instrument, US Army Ordnance Department Report, 1946.
[2] F.F. Lee, “Study of ‘Look-aside’ Memory”, IEEE Transactions on Computers, 18: 11, November 1969, pp. 1062-1064.
[3] T. Kilburn, D.B.G. Edwards, M.J. Lanigan, F.H. Sumner, “One-level Storage System”, IRE Transactions on Electronic Computers, 11: 2, February 1962, pp. 223-235.
[4] L.W. Cotton, “Circuit Implementation of High-speed Pipelined Systems”, Proceedings Fall Joint Computer Conference, AFIPS, Vol. 27, 1965, pp. 489-504.