The University of Texas at Dallas

Erik Jonsson School of Engineering and
Computer Science

Memory Management in Modern Computers
• One of the biggest problems facing modern computer designers is
that of providing large amounts of high speed memory.
• This is a problem that has evolved over the last 25-30 years.
• Earlier in the history of computing, most processors were
relatively slow compared to the speed of available memories
(except for bulk storage mechanical memories, i.e., disks and
drums).
• Especially in the early days of the personal computer, the CPU
was relatively slow compared to early electronic memories.
• The speed of random-access memory was not an issue; the biggest
problem was just getting enough memory, period (early PC’s with
large memories had 256-512 Kbytes)!

Lecture # 21: Memory Management in Modern Computers

© N. B. Dodge 9/15


Relative Speeds of the CPU and DRAM
• Over the last two+ decades, central processor chips have caught up
with and passed DRAM speed dramatically.
• Example: current CPU speed is 3-4.5 GHz, depending on the
processor type, and should increase somewhat, although
manufacturers are now abandoning the “speed race” in favor of
multiple processors.
• On the other hand, practical bus speed for CPU memory is about 1-1.8 GHz currently, and this is for “high performance” memory;
“common” bus speeds are still no more than half that.
• The CPU performance edge over memory is on the order of 3-4 times, and
much more than that on systems with the more common bus speeds.

The Memory Speed/Cost Dilemma
• There are further problems facing the modern computer
designer:
– Users need very high memory speeds to improve performance
(for example, in graphical computing, games, video editing).
– At the same time there is also great demand for maximum
memory capacity by many users. PC’s do not just manipulate text
any more; complex graphics, video games, movie editing and
animation all require enormous amounts of both DRAM and bulk
storage (hard drives [HDD’s]).

• However, there is a conflict in these requirements:
– Fast memories are very expensive.
– High-capacity, cheap memories (esp. HDD’s) are very slow.

Stating the Memory Management Problem
• The computer designer of today is therefore faced with a problem that is not easy to solve:
– There must be enough high-speed memory available to avoid slowing down the processing rate of current CPU’s.
– There must be sufficient DRAM to avoid the deadly “disk access” (i.e., having to go to the HDD to get program or data material), at least very often, since HDD access is very slow.
– There must be enough bulk memory (HDD) for all storage needs, and accessing this memory and transferring it to DRAM/other memory must be as painless as possible.
– The cost must be reasonable.

Solving the Problem of Memory Management
• The current approach to memory management:
– CPU has a large register complement, which allows more data in the CPU at a time and improves performance.
– Very-high-speed D flip-flop arrays, called cache, hold currently executing program segments. There are two kinds:
    • L1/L2 cache – On CPU chip, adjacent to ALU, very fast, ~ 64 Kbytes.
    • L3 cache – Center of CPU chip, very fast, ~ 15 Mbyte.
– High-speed electronic memory (“DRAM,” up to 64 Gbytes, fast) provides capacity for programs currently in process.
– Bulk storage memory (disk drives, ~0.3-2+ TByte, slow but cheap) holds complete programs and “near-term” archives.
– Slower memories such as CD’s and DVD’s for long-term storage.

Types of Memory
• As we have just seen, the use of sophisticated memory management is common, even in the everyday PC.
• This means that there are four kinds of memory in the modern PC or workstation computer: registers, cache, DRAM, and the disk or HDD (or SSD). And this does not count CD’s, DVD’s, thumb drives (flash EPROM), Zip drives, or floppy disks!
• The challenge to the computer engineer is to mesh the first four storage media and to make the use of them “transparent” – that is, invisible to the user, who will appear to have massive amounts of high-speed, cheap memory available to solve any problem.
• Before we discuss how to manage this extremely challenging engineering problem, we will discuss the types of memory that are used and learn a little about them.

Registers
[Figure: a D flip-flop (D FF), a 32-bit register, and a register block]
• We already know that registers are simply collections of D FF’s.
• Most CPU’s today contain many registers (e.g., the R-2000’s 32).
• Registers are inside the CPU, adjacent to the ALU, so their speed is basically that of the CPU (in fact, they determine ALU speed).

Random-Access Electronic Memory
• Random-access memories (RAM) make up the “working memory” of most computers.
• These memories are referred to as “random-access” because the entire array of memory is immediately available to be used; any single byte in the memory may be loaded or stored (“randomly accessed”) in the same amount of time.
• There are two primary types of RAM: static RAM (SRAM) and dynamic RAM (DRAM). Both SRAM and DRAM are used in modern computers such as the PC.
• SRAM is used in what are referred to as caches – small, very-high-speed memories that are physically close to the CPU.
• DRAM, though very fast, is slower than SRAM, but because it is inexpensive, it is the primary memory in most personal computing systems.

L1 Cache
• L1 cache (“level 1 cache”) is SRAM memory that is very close to the CPU; it is next to the ALU in most processors.
• L1 cache is basically sets of D FF’s – but many more than in the CPU register block.
• For example, a typical register block might have 16-32 registers of 4 or 8 bytes each for a total of 64-256 bytes of storage. The Intel Core i7 L1 cache, on the other hand, has 64 Kbytes (32K instruction, 32K data) – the equivalent of ~ 500,000 flip-flops.
• Access speed of L1 cache is slower, however, due to the complex arrangement of data buses which is necessary to access specific bytes in the L1 memory array. It is typically about 1/2-1/3 as fast as CPU registers in terms of load/store cycle.
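The storage figures on this slide can be checked with simple arithmetic: 16 registers of 4 bytes give 64 bytes, 32 registers of 8 bytes give 256 bytes, and 64 Kbytes of cache is roughly half a million bits. The Python sketch below is only illustrative; the function names are made up for this example, and it assumes one D flip-flop per stored bit.

```python
# Illustrative arithmetic only; the names are hypothetical, not from the lecture.
BITS_PER_BYTE = 8

def register_block_bytes(num_registers, bytes_per_register):
    """Total storage of a register block, in bytes."""
    return num_registers * bytes_per_register

def flip_flops_needed(num_bytes):
    """Assumption: one D flip-flop stores one bit."""
    return num_bytes * BITS_PER_BYTE

# Smallest and largest register blocks mentioned in the text
smallest = register_block_bytes(16, 4)
largest = register_block_bytes(32, 8)

# Core i7 L1 cache: 32 Kbytes instruction + 32 Kbytes data
l1_bytes = (32 + 32) * 1024
l1_flip_flops = flip_flops_needed(l1_bytes)

print(f"Register block: {smallest}-{largest} bytes")
print(f"L1 cache: {l1_bytes} bytes = {l1_flip_flops} flip-flops")
print(f"L1 cache is {l1_bytes // largest}x the largest register block")
```

Even against the largest register block considered here, the L1 cache holds a few hundred times more data, which is why it needs the more elaborate (and slower) bus arrangement.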

L1 Cache (Continued)
[Figure: D flip-flop (D FF), 32-bit register, and L1 cache]
• As you saw on the previous slide, in terms of memory structure, cache has regressed. Modern computer chips have separate instruction and data caches!

L2 Cache
• The level-2 cache is a bit farther away on the chip (L2 is also SRAM).
• L2 cache is much larger, since more “real estate” is devoted to memory. The Intel Core i7 has 1 Mbyte of L2 cache.
• Due to even more elaborate bus arrangements and the fact that L2 cache is not as close to the CPU, load/store access time is greater than for L1 cache.

L3 Cache
• In modern multicore processors, the cores share the L3 cache, which is typically 8-12 Mbyte.
• As L1 cache is slower than the register block, and L2 is slower than L1, L3 cache is slower still, though much faster than DRAM. The reason is that the L3 cache is even farther away from the CPU, though still on the chip.
• To minimize the degradation in memory speed, the CPU’s are typically clustered around the L3 cache, as shown in the picture of the Core i7 chip (upcoming).

[Figure: Intel Core i7 Cache Structure]

[Figure: Processor Layout of Single Intel Core i7 CPU]

[Figure: Cache Location on Intel CPU]

Why Not More Cache?
• The question arises: If cache memory is so great, why isn’t all computer memory fast cache?
• Answer: Cache memory has two major problems:
– It consumes huge amounts of power compared to DRAM memory (a flip-flop has about sixteen transistors; a DRAM cell uses only one).
– This means if more cache were used, the cost of a computer (think PC) would go up dramatically, due to the cost of extra power to run it, and cost of cooling the computer!
– Also, cache is much more expensive than DRAM (5:1 or more).
• For that reason, DRAM memory is an excellent compromise solution to fast storage problems.

Comparison of SRAM and DRAM
Parameter        SRAM                                      DRAM
Speed            Very fast                                 Fast
Complexity       High, ~16 transistors per storage cell    Low, 1 transistor per cell
Power Used       High                                      Very low
Heat Generated   Excessive                                 Virtually none
Cost             High                                      Very low

DRAM Memory
• The term DRAM stands for “dynamic random-access memory” (pronounced “D-ram,” not “dram”). This means that the title above is actually redundant!
• DRAM is electronic memory that is capable of very fast access (load or store), but is not as fast as cache. One exception is “Rambus” memory, a special DRAM memory whose manufacturer has announced cache-speed products (up to 7.2 GHz!). It is very expensive, however.
• DRAM consists of a simple charge-storage device (stored charge = “1”), with a switch to store/test the charge. Only a single transistor is required for a DRAM bit cell.
• The simple construction of DRAM makes it ideal in modern, workstation-based computing, where most users have their own computer system (PC, Mac, Sun, etc.).

DRAM (Continued)
• The term “dynamic” in DRAM is due to the fact that the memory is not truly a flip-flop; it is not static. DRAM “remembers” a 1 by storing charge on a capacitor.
• Capacitors, however, are not perfect storage elements – the charge leaks off after a short time. Thus the DRAM element is “dynamic” – its memory lifetime is limited and it must have its memory refreshed periodically.
• On the next several slides, we explore the way DRAM is constructed and the odd way that it must be treated to be sure that it retains its memory.

DRAM Memory Cell Construction
[Figure: DRAM cell showing the bit line, word line, CMOS transistor, capacitor, and ground]
• The DRAM cell is quite simple, consisting of a single CMOS transistor and a capacitor, which can store electric charge.
• The capacitor is grounded on one end. Wires connect two terminals of the transistor to lines that can apply voltage.

DRAM Cell Operation
[Figure: writing a logic 1. +V on the word line turns on the transistor; +V on the bit line drives current into the capacitor, which charges.]
To “write logic 1 data” to a DRAM cell, a voltage is applied to the word line, which turns the transistor on (it is like an “electronic switch”). If a voltage V is applied to the bit line, current flows into the capacitor and charges it, creating a “logic 1.”
[Figure: writing a logic 0. +V on the word line turns on the transistor; 0 V on the bit line lets the capacitor discharge.]
To “write logic 0 data” to a DRAM cell, a voltage is applied to the word line, which turns the transistor on (once again, like an “electronic switch”). Now, if 0 volts (“ground”) is applied to the bit line, current flows out of the capacitor and discharges it, creating a “logic 0.”

DRAM Cell Operation (2)
[Figure: read 1 memory cycle. The word line turns on the transistor, current flows from the charged capacitor to the bit line, and a logic “1” is sensed.]
Read 1 memory cycle. To “read,” or sense the value of the DRAM cell, the word line once again has a voltage applied to it, which turns on the transistor. If the capacitor is charged, current flows OUT of the capacitor through the transistor, and this current is sensed and amplified, showing that a “1” is present.
[Figure: read 0 memory cycle. The word line turns on the transistor, the capacitor has no charge, no current flows, and a logic “0” is sensed.]
Read 0 memory cycle. If the capacitor is discharged, no current flows, so that the sensing element determines that a logic 0 is present.

DRAM Cell Operation (3)
• Note that in reading a DRAM memory cell with a “1” in it (charge stored on capacitor), the act of reading destroys the “1” by draining the charge off the capacitor.
• Therefore, after reading a “1,” it must be rewritten.
• Also, as time passes, the capacitor loses charge so that the logic “1” eventually disappears.
• We see that even if a 1 is not read, the charge must be periodically replaced or the DRAM memory “loses its mind!”
• In a modern DRAM cell, this “refresh” must occur every few milliseconds, whether used or not.
• The refresh cycle is not long, however, taking 4-5% of total memory read/write time, which does not reduce memory speed or efficiency to any great degree.

DRAM Cell Operation (4)
[Figure: read 1 memory cycle or refresh cycle, logic “1” detect. The word line is activated and the logic “1” is read (or sensed in a refresh cycle) by draining the capacitor.]
[Figure: read or refresh cycle, logic “1” rewrite. The word line is reactivated and the logic “1” is rewritten by applying +V to the bit line, recharging the capacitor.]
• The refresh cycle occurs after a logic 1 read or periodically if the memory cell is not accessed. The refresh cycle is typically every few milliseconds. Obviously if the cell is a 0, it is not recharged.
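The write, destructive read, leak, and refresh behavior described on the DRAM cell slides can be sketched as a toy model. This is not a circuit simulation: the class name, the sensing threshold, and the leak rate below are all assumptions chosen only to make the behavior visible.

```python
# Toy model of one DRAM bit cell. All names and the leak rate are
# illustrative assumptions, not measured device values.
class DramCell:
    FULL_CHARGE = 1.0      # capacitor fully charged = logic 1
    THRESHOLD = 0.5        # sense amplifier decision level (assumed)
    LEAK_PER_MS = 0.1      # fraction of full charge lost per ms (assumed)

    def __init__(self):
        self.charge = 0.0  # discharged capacitor = logic 0

    def write(self, bit):
        """Word line on; bit line at +V charges, at 0 V discharges."""
        self.charge = self.FULL_CHARGE if bit else 0.0

    def leak(self, ms):
        """Charge leaks off the capacitor as time passes."""
        self.charge = max(0.0, self.charge - self.LEAK_PER_MS * ms)

    def destructive_read(self):
        """Sensing drains the capacitor, so the stored value is lost."""
        bit = 1 if self.charge > self.THRESHOLD else 0
        self.charge = 0.0
        return bit

    def read(self):
        """Read, then rewrite the sensed value."""
        bit = self.destructive_read()
        self.write(bit)
        return bit

    def refresh(self):
        """A refresh is a read followed by a rewrite."""
        self.read()


cell = DramCell()
cell.write(1)
first = cell.read()           # 1, and the charge is restored
second = cell.read()          # still 1 because of the rewrite

cell.leak(3)                  # 3 ms pass; some charge is lost
cell.refresh()                # sensed as 1, recharged to full
after_refresh = cell.read()   # 1

cell.leak(8)                  # 8 ms with no refresh; most charge is gone
lost = cell.read()            # sensed as 0: the stored 1 has been lost

print(first, second, after_refresh, lost)
```

With these assumed numbers, a cell that is refreshed in time keeps its 1, while a cell left alone too long reads back as 0.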

DRAM’s Get Denser
• The newest DRAM’s top the scale at 64 and 128 gigabits!

Exercise 1
1. Rank these memories by speed: L2 cache, L1 cache, DRAM, registers, and hard disk drives.
2. A DRAM memory chip is accessed and a bit read out. The bit that is read is a 1. What happens now?
3. That same memory bit is then left “alone” (i.e., not accessed by its addressing mechanism for either read or write) for several milliseconds. What happens next?

Bulk Storage (Disk Storage or HDD)
• Electromechanical data storage is normally not random-access like SRAM or DRAM, but must be loaded or stored according to rules, which generally have to do with positioning a recording mechanism over the correct location in an expanse of recording media prior to being able to perform the memory access.
• This means that data cannot normally be accessed in arbitrary order.
• That is, the correct segment of data must be located (normally by mechanically moving a recording head) before it can be read.
• This load/store operation is particularly time-consuming, because it involves mechanical movement rather than simply electronic switching.

HDD Read/Write Mechanism
[Figure: recording coil over a rotating aluminum disk coated with magnetic material. Current flow in the coil produces magnetic field lines (direction depends on current flow) and a strong, concentrated magnetic field at the disk surface.]
• The HDD stores data on a rotating disk coated with magnetic material.
• A magnetic coil is used to record each one and zero. Current in the coil generates a magnetic field, which magnetizes material in the HDD surface. One direction of current writes a 1, the other a 0.
• When the coil is later positioned over the disk to read, the opposite-polarity 1’s and 0’s cause back-and-forth current flow according to whether a 1 or 0 is present. In this way, the data is detected.

Hard Disk Drive Example
[Figure: hard disk drive, labeled: portion of read/write electronic circuitry (the rest is on the back side of the unit on a separate circuit board); metal disk covered with magnetic coating; recording head]

Detail of Disk Read/Write Head
[Figure: read/write head assembly, labeled: flexible cable (carries signals to amplifier circuitry to be converted to digital signals); positioning mechanism; positioning arm; recording head]

HDD Side View, Showing Multiple Disk Platters
[Figure: side view of HDD, labeled: upper recording and reading head (note that the positioning mechanism moves both heads simultaneously); second recording disk surface (recording head not visible)]

HDD Package
The HDD is usually packaged in a metal case, or similar rigid container, which provides stability and better data integrity. Higher-quality units are typically packaged in an aluminum casting.

HDD Storage/Retrieval is Slow
• The latency (“time to get/store data”) of a HDD is given by the formula:
latency = seek time + rotational delay + transfer time + controller delay
Where:
– Seek time = time for the positioning arm to move the head from its present track to the track where the load/store data is located.
– Rotational delay = time for the requested sector to rotate underneath the read/write head after the head is positioned over the track.
– Transfer time = time for data transfer from disk to main memory.
– Controller delay = time to set up transfer in the HDD electronic interface.

HDD Storage/Retrieval is Slow (2)
• Example: latency of writing one 512-byte sector on a magnetic disk rotating at 7200 rpm, with the following parameters:
– Average seek time = 12 ms (typical for movement across half the disk)
– Transfer rate = 5 Mbytes/sec; transfer time = [0.000512 Mbyte / 5 Mbyte/sec] = 0.1 ms
– Controller delay = 2 ms
– Rotational delay depends on the position of the first byte to be transferred, but on average will be ([1/7200] × 60 × [1/2]) sec = 4.2 ms (average rotation = ½ of circle).
Then average latency = 12 ms + 4.2 ms + 0.1 ms + 2 ms = 18.3 ms. Note that actual transfer time is small!
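The worked example can be reproduced with a short calculation. The sketch below is a minimal illustration of the latency formula; the function and parameter names are assumptions made for this example, and a Mbyte is taken as 10^6 bytes, as in the slide's arithmetic.

```python
# Sketch of the latency formula. Function and parameter names are
# illustrative; 1 Mbyte is taken as 10**6 bytes, as in the example.
def hdd_latency_ms(seek_ms, rpm, sector_bytes, transfer_mbytes_per_s, controller_ms):
    # Average rotational delay: half a revolution
    rotational_ms = (60.0 / rpm) * 0.5 * 1000.0
    # Time to move the sector's bytes at the given transfer rate
    transfer_ms = sector_bytes / (transfer_mbytes_per_s * 1e6) * 1000.0
    total_ms = seek_ms + rotational_ms + transfer_ms + controller_ms
    return rotational_ms, transfer_ms, total_ms

rotational, transfer, total = hdd_latency_ms(
    seek_ms=12, rpm=7200, sector_bytes=512,
    transfer_mbytes_per_s=5, controller_ms=2)

print(f"rotational delay = {rotational:.1f} ms")
print(f"transfer time    = {transfer:.1f} ms")
print(f"average latency  = {total:.1f} ms")
```

Changing `seek_ms` or `rpm` shows that the mechanical terms dominate: the transfer itself contributes only about 0.1 ms of the roughly 18 ms total.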

Other Disk Storage Units and Media
• Other storage media include CD’s, DVD’s and “thumb drives.” Most of these storage units are removable media.
• Floppy disks, Zip drives, hard drives, and tapes are magnetic media. They are relatively slow.
• The CD-ROM and the DVD use optical recording/reading involving a laser beam to record and read data.
• The “thumb drive” uses electronic memory called EPROM (“erasable, programmable read-only memory”), and is a true solid-state memory. These drives were the first “SSD’s.”
• Very fast EPROM’s (SSD’s) are beginning to be available for fast bulk storage, replacing HDD’s on laptops. They are relatively expensive.

SSD’s Get Larger and Faster
• Last year, Samsung launched the SSD 850 EVO with storage up to 1 TB and the reviews have been very positive.
• Samsung also introduced a 1 TB 850 Pro, with a slightly faster read speed and best-in-class 10-year warranty.
• The best budget option is the SanDisk Ultra II, at 960 GB, with lower performance and warranty.

Biggest and Best SSD’s
The biggest solid state drives available in 2015:
1. LSI: 4 TB SSD ($29,000).
2. OCZ Chiron: 4 TB SSD (may be expandable past 4 TB), also $15-30K.
3. Samsung 850 Pro ($600-650): 1 TB SSD (10-year warranty)
4. Samsung 850 EVO SSD ($400-450): 1 TB SSD (5-year warranty). Priced for the home user!
5. Crucial M550 ($400-450): 1 TB SSD (3-year warranty)
6. SanDisk Ultra II ($350-400): 960 GB SSD (3-year warranty – a decent budget option)

The Memory Hierarchy
• We have described a number of memory devices which are useful for storing and reading computer data.
• All of these (other than archival types) are used in a mix on the modern computer for real-time storage and retrieval of data.
• Since SRAMs – the best data storage media if not so power-hungry and costly – cannot be used exclusively, a mix of L1 and L2 cache, DRAM, and HDDs makes up the “memory hierarchy” of most computers.
• The trick is to design a mix of these types which will give the highest performance for a reasonable price.

Arrangement of the Memory Hierarchy
• Memory arrangements make use of the fact that programs exhibit two common behaviors:
– Temporal locality – Recently-used code and data is often reused (e.g., a loop program continues to use the same steps).
– Spatial locality – Recently-accessed data items are usually close to other recently-accessed (or about-to-be-accessed) data items.
• Modern schemes use a “shuffling” methodology that moves data from slower storage media to faster media.
• Higher-speed memories are also placed closer to the CPU, since memory access also depends on the proximity of the storage element; electronic signals propagate at about 33 ps/cm.
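Temporal and spatial locality can be made concrete by looking at the addresses a simple loop touches. The sketch below builds a made-up address trace for a loop that sums a 4-element array three times; the addresses, the 16-byte "nearby" window, and the function names are all assumptions chosen for illustration.

```python
# Hypothetical address trace for a loop that sums a 4-element array
# three times. Addresses and sizes are made up for illustration.
CODE_ADDRS = [0x100, 0x104, 0x108]   # the loop body's instructions
DATA_BASE, WORD = 0x2000, 4          # array start address, 4-byte words

def loop_trace(passes, n_elements):
    trace = []
    for _ in range(passes):
        for i in range(n_elements):
            trace.extend(CODE_ADDRS)             # fetch the loop instructions
            trace.append(DATA_BASE + i * WORD)   # load array element i
    return trace

def temporal_reuse(trace):
    """Fraction of accesses that go to an address already used before."""
    seen, reused = set(), 0
    for addr in trace:
        if addr in seen:
            reused += 1
        seen.add(addr)
    return reused / len(trace)

def near_first_touches(trace, window=16):
    """Count first-time addresses within `window` bytes of an earlier one."""
    seen, near = [], 0
    for addr in trace:
        if addr in seen:
            continue
        if any(abs(addr - old) <= window for old in seen):
            near += 1
        seen.append(addr)
    return near, len(seen)

trace = loop_trace(passes=3, n_elements=4)
near, distinct = near_first_touches(trace)

print(f"{len(trace)} accesses, {distinct} distinct addresses")
print(f"temporal reuse: {temporal_reuse(trace):.0%} of accesses repeat an address")
print(f"spatial locality: {near} of {distinct} new addresses are near an earlier one")
```

In this trace most accesses repeat an earlier address (temporal locality), and most new addresses sit next to one already used (spatial locality), which is the behavior the memory hierarchy is built to exploit.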

Arrangement of Levels in Memory Hierarchy
[Figure: CPU package (CPU with registers, L1 cache, L2/3 cache), followed by DRAM and HDD]
Level        Size              Speed
Registers    <300 Bytes        100 ps
L1 Cache     8-64 Kbytes       200 ps
L2/3 Cache   0.5-3 Mbytes      0.2-0.5 ns
DRAM         0.5-4 Gbytes      1-10 ns
HDD          160-2000 Gbytes   ~10-20 ms
• Memory is physically arranged so that fastest elements (registers) are closest to the CPU and slower elements are progressively farther away.

The Importance of Cache
• As mentioned previously, the key to modern computer performance is not the CPU – CPU performance has far outstripped the speed of most computer memories. The secret of today’s high-performance PC’s and workstations is the design of an architecture that allows maximum use of DRAM and HDD (cheap) plus just enough SRAM cache (expensive and power-consuming).
• The key is the use of cache.
• The method used is the “shuffling” technique alluded to two slides back. This method uses a very high speed, complex arrangement to constantly move program and data content from slower to faster memory as the CPU executes a process, thus enabling the CPU to realize most of its performance advantage.

Cache Utilization
• Cache designers make use of the principles of temporal and spatial locality to assure that the most-probably needed instructions and data are available to the computer in cache (to speed execution).
• Special hardware is designed to manage cache content with the goal of forecasting upcoming instructions and data required by the processor during program execution and moving them from slower DRAM into cache.
• This hardware has two special goals: (1) examining the currently-executing process and predicting instruction and data need, and (2) moving the required information from DRAM to cache in a timely manner to meet that anticipated need.

Looking for Data/Instructions in the Cache
• Clearly the purpose of cache management is to make sure that ALL upcoming instructions and data are in the cache.
• This brings up two questions: (1) how does the processor know that data is in the cache, and (2) if it is NOT there, how does the processor get it and what sort of performance penalty is there?
• There are several ways in which the cache can be assigned DRAM memory correspondence. The simplest is direct mapping, in which each block of memory in cache is assigned to some number of DRAM locations. When a program needs a particular DRAM location to be loaded, it goes to the corresponding cache location to get the data.
• This leads to further complications, in that now we need “validity indicators” for each cache location. Since each cache block is assigned to several memory blocks in DRAM, the program needs to know if the right data is available in cache at the time it is needed.

Looking for Data (2)
• If the correct data is not in cache, the hardware memory manager declares a “cache miss.” This means that the program must be delayed for several clock cycles while the required instruction or data is moved from DRAM to cache.
• We see that a cache miss is highly undesirable, since it can substantially slow down the program.
• A key part of cache memory management, then, is to minimize the cache misses, which correspondingly increases the speed of execution of a program.
• There are a number of clever and effective cache management designs, which dramatically reduce cache misses and improve computer performance. They are, however, beyond the scope of EE 2310.
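Direct mapping, validity indicators, and cache misses can be illustrated with a toy simulation. The sketch below is not a model of any real processor: the cache size, block size, and class name are assumptions, and the stored "tag" (which DRAM block a cache line currently holds) is one common way to implement the validity check that the slides describe only in general terms.

```python
# Toy direct-mapped cache. Sizes, names, and access patterns are
# illustrative assumptions, not a model of any real processor.
class DirectMappedCache:
    def __init__(self, num_lines=8, block_bytes=16):
        self.num_lines = num_lines
        self.block_bytes = block_bytes
        self.valid = [False] * num_lines   # the "validity indicators"
        self.tag = [None] * num_lines      # which DRAM block a line holds
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, load the block from DRAM."""
        block = address // self.block_bytes   # DRAM block number
        line = block % self.num_lines         # the one line it may use
        tag = block // self.num_lines         # identifies the block
        if self.valid[line] and self.tag[line] == tag:
            self.hits += 1
            return True
        # Cache miss: fetch from DRAM (slow) and replace the old contents
        self.misses += 1
        self.valid[line] = True
        self.tag[line] = tag
        return False


cache = DirectMappedCache()

# Two passes over a 64-byte array of 4-byte words (good locality)
for _ in range(2):
    for addr in range(0, 64, 4):
        cache.access(addr)
sequential = (cache.hits, cache.misses)

# Two addresses that map to the same line evict each other every time
conflict = DirectMappedCache()
for _ in range(4):
    conflict.access(0)      # block 0 -> line 0, tag 0
    conflict.access(128)    # block 8 -> line 0, tag 1
thrash = (conflict.hits, conflict.misses)

print("sequential hits/misses:", sequential)
print("conflict hits/misses:  ", thrash)
```

The sequential pattern misses only once per block, while the two conflicting addresses miss on every access. This is the kind of behavior the more elaborate cache designs mentioned above try to avoid.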

Cache Diagram
[Figure: CPU package (CPU with registers, L1 cache, L2 cache), followed by DRAM and HDD]
Cache management hardware includes subsystems to predict usage and move data or instructions from DRAM to cache as appropriate. A “cache miss” will initiate DRAM access for transfer to cache.

Summary
• Modern memory management maximizes the speed of computer processing while keeping system cost reasonable for the user.
• This approach uses a small amount of very fast SRAM cache memory, which is physically near the CPU; a substantial amount of DRAM, which is still very fast, as the main “working memory”; and HDD or flash memory for large program storage.
• Effective (but complex) hardware and software have been developed to manage this memory hierarchy and maximize its effectiveness.

Exercise 2
1. Each small area of cache (say, 1K byte) represents a much larger area (say, 1 Mbyte) in DRAM. If an instruction, for example, is supposed to reside in a given Mbyte of DRAM, the corresponding cache extent is searched. Assume that, according to the validity indicator, the correct instruction is NOT in cache. What now?
2. Give simple definitions of the principles of temporal and spatial locality.

Computing in the Future
• We discussed the evolution of computing up to the present in Lecture #1.
• Now let’s talk about what’s happening today and in the near future.
• Information shown here is from Intel, The University of Texas, and various other sources.

Memory
• In 2008, a completely new kind of circuit element (that was predicted in 1971), the memristor, was developed.
• Memristor circuit elements can retain a state (i.e., memory) even when power is off. They could eventually replace flash EPROM, and perhaps DRAM. Thousand-Gbyte main memories are possible.
• Memristors remember multiple states (not just ones and zeros), leading to neural-type processors in the long term. Thus a memristor memory might eventually “remember” like a human neuron.

Memristors: When?
• HP says that it will have 100 TByte memristor drives by 2018.
• The claim is that such drives could be packaged in one “blade box” for a total of 24 Petabytes!
• There is no specific product roadmap, but HP’s CTO has talked confidently of HP popping 100 TB memristor drives into StoreServ arrays in five years. Expect these production dates to be late, but large memristor memories are possible by 2020 or a bit later.
[Figure: A memristor production slice]

Memory (3)
• Another new memory type is Phase Change Memory (“PCM”).
– To write a 0, the material is heated to a high temperature, producing an amorphous (disorganized) structure with low conductivity.
– To write a 1, the material is heated to a lower temperature, creating an organized crystal structure with high conductivity.
– Still experimental. Can be read/written about 100 million times, which is far too low (DRAM can do 1-10 quadrillion cycles!).

Memory (4)
• Other new memory types:
– Magnetic RAM (MRAM) – Uses tunneling resistance that depends on the relative magnetization of ferromagnetic electrodes. Nonvolatile, low power, high density. Production cost and reliability are problems.
– Resistive RAM (ReRAM) – Varies resistance according to applied voltage. Early work.
• Only time will tell if these will challenge DRAM.

The CPU
• Intel and AMD abandoned the GHz race in CPU’s years ago. In the early 2000’s, Intel stated a goal of a 10 GHz CPU by 2010, which clearly didn’t happen. Speed is inching up, maybe to 5 GHz in the late teens.
• Multiple CPU’s became the performance enhancer. The standard is now 4- and 6-core CPU’s, with 8- and 10-core performance CPU’s as well. Intel Xeon server CPU’s are 8-core.
• With the new “Broadwell” architecture at the 14 nm node, Intel is now said to be planning an 18-core CPU.

Minimum Feature Size
• “Minimum feature size:” the smallest dimension that
can be laid out on a chip (typically the gate).
• Currently the minimum feature size for DRAM
memory is 28-23 nanometers (one nanometer is one
billionth of a meter in length, or 10^-9 meters). DRAM
feature size is forecast to shrink further:

Year:              2015  2017  2019  2021  2023  2024  2025  2026
Feature size (nm): 23    17.9  14.2  11.3  8.9   8.0   7.1   6.3

• Note how far behind CPU chipmakers the memory
technology lags!
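The DRAM forecast on this slide implies a fairly steady shrink rate, which can be checked with a little arithmetic. The sketch below simply copies the forecast values and converts each step to an equivalent per-year factor. The variable names are illustrative, and the constant-rate interpretation is an assumption, not a claim made in the lecture.

```python
# Forecast DRAM minimum feature size in nanometers, copied from the slide.
forecast_nm = {
    2015: 23.0, 2017: 17.9, 2019: 14.2, 2021: 11.3,
    2023: 8.9, 2024: 8.0, 2025: 7.1, 2026: 6.3,
}

years = sorted(forecast_nm)
annual_ratios = []
for earlier, later in zip(years, years[1:]):
    step_ratio = forecast_nm[later] / forecast_nm[earlier]
    span_years = later - earlier
    # Convert a shrink over `span_years` into an equivalent per-year factor.
    annual = step_ratio ** (1.0 / span_years)
    annual_ratios.append(annual)
    print(f"{earlier} -> {later}: x{annual:.3f} per year")

# Equivalent constant per-year factor over the whole forecast window.
overall = (forecast_nm[2026] / forecast_nm[2015]) ** (1.0 / (2026 - 2015))
print(f"overall: x{overall:.3f} per year")
```

Every step works out to a factor between 0.88 and 0.90 per year, so the forecast amounts to the feature size shrinking by roughly 11% annually.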

CPU Technology Bounds Ahead
• Intel and AMD currently
manufacture CPU’s at the 22
nanometer node, and Intel is
sampling 14 nm products this
year. Intel will also start to
produce “Skylake” chips, the
successor to Broadwell.
• Clearly, CPU chipmakers lead
the way in minimum feature
size.

[Figure: AMD laptop CPU chip]

“Three-D” Technology
• As feature size gets
smaller (22 nm→14 nm),
3-D manufacturing
processes continue to
improve.
• Generically referred to as
“FinFET” due to the 3-D
“fin.” Intel calls it “Tri-Gate.”
[Figure: comparison of transistor sizes in 22 and 14 nm processes]

Multi-Core Advance
• Intel has announced a 72-core CPU, “Knights Landing.”
• Available in 2015, KL will use the 14 nm process. It will have up to 16 GB of on-package DRAM, plus up to 384 GB of DDR4-2400 mainboard memory, and up to 500 GB/sec of memory bandwidth.
• With a promise of 3 teraflops (double precision) per socket, it will almost certainly be used to build some monster x86 supercomputers.
[Figure: Intel Knights Landing, with 72 Pentium cores with 64-bit support]
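The Knights Landing figures quoted on this slide (3 teraflops per socket, 72 cores, up to 500 GB/sec of memory bandwidth) also show how far memory still lags the processor. The sketch below is a rough back-of-the-envelope calculation only. It assumes that peak compute and peak bandwidth could be sustained at the same time and that 1 GB means 10^9 bytes, and the variable names are illustrative.

```python
# Figures quoted on the slide for Knights Landing.
PEAK_FLOPS = 3e12            # 3 teraflops, double precision, per socket
CORES = 72
MEM_BANDWIDTH = 500e9        # up to 500 GB/sec, taken as bytes per second
BYTES_PER_DOUBLE = 8         # size of one double-precision operand

# Peak arithmetic rate available to each core.
gflops_per_core = PEAK_FLOPS / CORES / 1e9

# How many bytes memory can deliver for each floating point operation.
bytes_per_flop = MEM_BANDWIDTH / PEAK_FLOPS

# How many operations must be done on each double fetched from memory
# for the cores, rather than the memory, to be the limiting factor.
doubles_per_second = MEM_BANDWIDTH / BYTES_PER_DOUBLE
flops_per_double_loaded = PEAK_FLOPS / doubles_per_second

print(f"{gflops_per_core:.1f} GFLOPS per core")
print(f"{bytes_per_flop:.3f} bytes of bandwidth per flop")
print(f"{flops_per_double_loaded:.0f} flops needed per double loaded")
```

Under these assumptions, a program has to perform about 48 operations on every operand it brings in from memory to keep the cores busy, which is why caching and data reuse matter so much on many-core chips.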

“Stampede”
• New supercomputer at the University of Texas.
• Uses several thousand Dell “Zeus” servers, each with dual 8-core Intel Xeon processors. Has 522,080 cores.
• Each Zeus server uses several Knights Corner chips, a precursor to Knights Landing.
• Knights Corner also uses modified Pentium-era cores.
• Has 270 Tbytes (yes, that’s TERA bytes) of DRAM.
• Has 14 Petabytes of storage memory.
• Peak performance = 9,600,000,000,000,000 floating point operations per second (9.6 petaflops).
• Developers claim that exaflops are on the way.
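The Stampede figures on this slide (9.6 petaflops peak, 522,080 cores, 270 Tbytes of DRAM, 14 Petabytes of storage) can be put in perspective with some simple ratios. This is a sketch for illustration only. It assumes decimal prefixes (1 Tbyte = 10^12 bytes) and spreads the totals evenly over all cores, which the real machine does not do.

```python
# Figures quoted on the slide for Stampede.
PEAK_FLOPS = 9.6e15          # 9.6 petaflops peak
CORES = 522_080
DRAM_BYTES = 270e12          # 270 Tbytes, decimal prefix assumed
STORAGE_BYTES = 14e15        # 14 Petabytes, decimal prefix assumed
EXAFLOP = 1e18               # one exaflop, for comparison

gflops_per_core = PEAK_FLOPS / CORES / 1e9
dram_mb_per_core = DRAM_BYTES / CORES / 1e6
storage_to_dram = STORAGE_BYTES / DRAM_BYTES
speedup_to_exaflop = EXAFLOP / PEAK_FLOPS

print(f"{gflops_per_core:.1f} GFLOPS per core (average)")
print(f"{dram_mb_per_core:.0f} Mbytes of DRAM per core (average)")
print(f"storage is {storage_to_dram:.0f}x the size of DRAM")
print(f"an exaflop machine would be {speedup_to_exaflop:.0f}x faster")
```

The last ratio shows that reaching an exaflop means roughly a hundredfold increase over this peak figure.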

Intel: Sandy Bridge → Skylake
• What about the “bread and butter” Intel PC CPU’s?
– In the last few years, the Intel CPU family* has gone from “Sandy Bridge” (32 nm) → “Ivy Bridge” (22 nm, “FinFET”) → “Haswell” (22 nm, optimized FinFET) → “Broadwell” (14 nm). The next generation is termed “Skylake” (14 nm, new architecture, optimized FinFET).
– Both Broadwell and Skylake will use much less power, especially in the mobile-computing variants.
• Haswell chips (current-generation Intel CPU’s) are all 22 nm. Broadwell 14 nm chips are just appearing. Broadwell/Skylake will be faster, use less power, and have better graphics.
* Intel has odd CPU family names, like Apple OS X updates: “Leopard,” “Snow Leopard,” “Lion.”

“Tick-Tock”
• Intel is said to follow a “Tick-Tock” design strategy.
– The “Tick” phase is shrinking the minimum feature size.
– “Tock” is introducing a new microarchitecture. Thus:

Tick   Westmere (January, 2010)
Tock   Sandy Bridge (January, 2011)
Tick   Ivy Bridge (April, 2012)
Tock   Haswell (June, 2013)
Tick   Broadwell (October, 2014)
Tock   Skylake (Mid-2015)
Tick   Cannonlake (2016? 11 nm?)

The End of Graphics Processors?
• For high-performance graphics (e.g., video games), PC gamers add one or more high-performance video cards.
• With many-core chips, that may not be necessary. Many of the cores may be used for graphics generation.
• Better still, put GPU’s on the chip! Interestingly enough, Intel and nVidia signed a cross-licensing agreement in 2012, giving Intel access to nVidia’s GPU designs.
• With the new Skylake family, it appears that graphics processors are included as CPU cores. Graphics cards may become unnecessary!

The End of Moore’s Law?
• Smaller nodes (that is, reducing minimum feature size) are getting harder and harder to reach.
• Some experts speculate that current production methods will not be able to sustain CPU manufacturing below about 7 nm.
• This means that we are probably coming to the end of the “Moore’s Law era,” where the number of circuits per chip doubled every 18 months, and the cost per transistor fell by about half.
• Will circuits continue to shrink in size?
• Undoubtedly. But the current photographic (or shadow-mask) manufacturing techniques will have to give way to other methods.
• The only theoretical limits to the size of a switching element are those related to the size of atoms and electronic wells!
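The “doubled every 18 months” rule on this slide is easy to turn into numbers. The sketch below is only an illustration. The function names are hypothetical, and `years_to_shrink` rests on the simplifying assumption that the number of circuits per chip scales with 1 / (feature size)^2, which the lecture does not state.

```python
import math

DOUBLING_PERIOD_YEARS = 1.5   # "doubled every 18 months" (from the slide)


def circuits_multiplier(years):
    """Growth factor in circuits per chip after `years` of Moore's Law."""
    return 2.0 ** (years / DOUBLING_PERIOD_YEARS)


def years_to_shrink(from_nm, to_nm):
    """Years of Moore's Law needed to go from one feature size to another.

    Assumption: circuits per chip scale with 1 / (feature size)^2, so
    halving the feature size quadruples the circuit count.
    """
    density_gain = (from_nm / to_nm) ** 2
    return DOUBLING_PERIOD_YEARS * math.log2(density_gain)


growth_15_years = circuits_multiplier(15)      # ten doublings
years_14_to_7 = years_to_shrink(14.0, 7.0)     # two doublings

print(f"15 years of doubling: x{growth_15_years:.0f} circuits per chip")
print(f"14 nm to 7 nm at that pace: {years_14_to_7:.1f} years")
```

At the historical pace, the step from the 14 nm node to the 7 nm limit mentioned above would take only about three years, which is one way to see why the end of the era looks close.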

Windows 10: Anything New?
• Microsoft continues to lose ground in the mobile sector.
– W10 appears to be a combination of Windows 8 and Windows 7.
• The “Start” menu is back. Everybody hated W8’s desktop.
• Users still have the option of “Metro-style live tiles.” Alternatively, you don’t have to use them.
• Live tiles can be run as a window (app-like) on the desktop.
• Virtual desktops. Unlike Linux, Windows has always provided only one desktop. Now you can have as many as you want.

Windows (Continued)
• Windows 10 has been free to early adopters via download.
• This probably will not remain true for long. However, W10 may be offered at a steep discount to Windows XP users to get them to ditch their 13-year-old operating system. Maybe even a similar carrot for Vista users.
• Other rumors state that W10 is the last Windows.
• An OS that can be more easily updated (a la Apple) may be next. It will be interesting to see Microsoft OS, TNG (the next generation) and its features.

Other Computer-Based State-of-the-Art Electronics
• New 4K TV’s are already out, with four times the resolution of HD. Sixty-five inch TV’s came out recently; 75 inches and up are coming.
• The other advance is OLED, which has a much sharper display than even LED-backlit LCD displays.
• Samsung now offers a 105-inch, 4K, 3D LED TV with a fully curved screen!
[Figure: Samsung 105” 4K TV]

3-D Printing
• The development of 3D printers over the last few years has been astonishing.

Just About ANYTHING Can Be Printed!

3-D Printing of Human Organs (!)
• 3D printers are being used more and more in bioengineering.
[Figure: Two 3-D printers in the act of printing “biomatrices” on which organ cells can be attached to grow an organ]