
How Caching Works

If you have been shopping for a computer, then you have heard the word "cache." Modern computers have both L1 and L2 caches. You may also have gotten advice on the topic from well-meaning friends, perhaps something like "Don't buy that Celeron chip, it doesn't have any cache in it!" It turns out that caching is an important computer-science process that appears on every computer in a variety of forms. There are memory caches, hardware and software disk caches, page caches and more. Virtual memory is even a form of caching. In this article, we will explore caching so you can understand why it is so important.

A Simple Example: Before Cache


Caching is a technology based on the memory subsystem of your computer. The main purpose of a cache is to accelerate your computer while keeping the price of the computer low. Caching allows you to do your computer tasks more rapidly.

To understand the basic idea behind a cache system, let's start with a super-simple example that uses a librarian to demonstrate caching concepts. Let's imagine a librarian behind his desk. He is there to give you the books you ask for. For the sake of simplicity, let's say you can't get the books yourself -- you have to ask the librarian for any book you want to read, and he fetches it for you from a set of stacks in a storeroom (the Library of Congress in Washington, D.C., is set up this way).

First, let's start with a librarian without cache. The first customer arrives. He asks for the book Moby Dick. The librarian goes into the storeroom, gets the book, returns to the counter and gives the book to the customer. Later, the client comes back to return the book. The librarian takes the book and returns it to the storeroom. He then returns to his counter, waiting for another customer. Let's say the next customer asks for Moby Dick (you saw it coming...). The librarian then has to return to the storeroom to get the book he recently handled and give it to the client. Under this model, the librarian has to make a complete round trip to fetch every book -- even very popular ones that are requested frequently. Is there a way to improve the performance of the librarian?

"es, there&s a way $$ we can put a cache on the #ibrarian. In the ne*t section, we&## #ook at this same e*amp#e but this time, the #ibrarian wi## use a caching system.

A Simple Example: After Cache


Let's give the librarian a backpack into which he will be able to store 10 books (in computer terms, the librarian now has a 10-book cache). In this backpack, he will put the books the clients return to him, up to a maximum of 10. Let's use the prior example, but now with our new-and-improved caching librarian.

The day starts. The backpack of the librarian is empty. Our first client arrives and asks for Moby Dick. No magic here -- the librarian has to go to the storeroom to get the book. He gives it to the client. Later, the client returns and gives the book back to the librarian. Instead of returning to the storeroom to return the book, the librarian puts the book in his backpack and stands there (he checks first to see if the bag is full -- more on that later). Another client arrives and asks for Moby Dick. Before going to the storeroom, the librarian checks to see if this title is in his backpack. He finds it! All he has to do is take the book from the backpack and give it to the client. There's no journey into the storeroom, so the client is served more efficiently.

What if the client asked for a title not in the cache (the backpack)? In this case, the librarian is less efficient with a cache than without one, because the librarian takes the time to look for the book in his backpack first. One of the challenges of cache design is to minimize the impact of cache searches, and modern hardware has reduced this time delay to practically zero. Even in our simple librarian example, the latency time (the waiting time) of searching the cache is so small compared to the time to walk back to the storeroom that it is irrelevant. The cache is small (10 books), and the time it takes to notice a miss is only a tiny fraction of the time that a journey to the storeroom takes.

From this example you can see several important facts about caching:

- Cache technology is the use of a faster but smaller memory type to accelerate a slower but larger memory type.
- When using a cache, you must check the cache to see if an item is in there. If it is there, it's called a cache hit. If not, it is called a cache miss and the computer must wait for a round trip from the larger, slower memory area.
- A cache has some maximum size that is much smaller than the larger storage area.
- It is possible to have multiple layers of cache. With our librarian example, the smaller but faster memory type is the backpack, and the storeroom represents the larger and slower memory type. This is a one-level cache. There might be another layer of cache consisting of a shelf that can hold 100 books behind the counter. The librarian can check the backpack, then the shelf and then the storeroom. This would be a two-level cache.

Computer Caches
A computer is a machine in which we measure time in very small increments. When the microprocessor accesses the main memory (RAM), it does it in about 60 nanoseconds (60 billionths of a second). That's pretty fast, but it is much slower than the typical microprocessor. Microprocessors can have cycle times as short as 2 nanoseconds, so to a microprocessor 60 nanoseconds seems like an eternity.

What if we build a special memory bank in the motherboard, small but very fast (around 30 nanoseconds)? That's already two times faster than the main memory access. That's called a level 2 cache or an L2 cache.

What if we build an even smaller but faster memory system directly into the microprocessor's chip? That way, this memory will be accessed at the speed of the microprocessor and not the speed of the memory bus. That's an L1 cache, which on a 233-megahertz (MHz) Pentium is 3.5 times faster than the L2 cache, which is two times faster than the access to main memory.

Some microprocessors have two levels of cache built right into the chip. In this case, the motherboard cache -- the cache that exists between the microprocessor and main system memory -- becomes level 3, or L3 cache.

There are a lot of subsystems in a computer; you can put cache between many of them to improve performance. Here's an example. We have the microprocessor (the fastest thing in the computer). Then there's the L1 cache that caches the L2 cache that caches the main memory, which can be used (and is often used) as a cache for even slower peripherals like hard disks and CD-ROMs. The hard disks are also used to cache an even slower medium -- your Internet connection.

Caching Subsystems
"our Internet connection is the s#owest #ink in your computer. ,o your browser 1Internet -*p#orer, 8etscape, 7pera, etc.2 uses the hard disk to store H(ML pages, putting them into a specia# fo#der on your disk. (he first time you ask for an H(ML page, your browser renders it and a copy of it is a#so stored on your disk. (he ne*t time you re4uest access to this page, your browser checks if the date of the fi#e on the Internet is newer than the one cached. If the date is the same, your browser uses the one on your hard disk instead of down#oading it from Internet. In this case, the sma##er but faster memory system is your hard disk and the #arger and s#ower one is the Internet. Cache can a#so be bui#t direct#y on peripherals. Modern hard disks come with fast memory, around ? ! ki#obytes, hardwired to the hard disk. (he computer doesn&t direct#y use this memory $$ the hard$disk contro##er does. 0or the computer, these memory chips are the disk itse#f. When the computer asks for data from the hard disk, the hard$disk contro##er checks into this memory before moving the mechanica# parts of the hard disk 1which is very s#ow compared to memory2. If it finds the data that the computer asked for in the cache, it wi## return the data stored in the cache without actua##y accessing data on the disk itse#f, saving a #ot of time. Here&s an e*periment you can try. "our computer caches your f#oppy drive with main memory, and you can actua##y see it happening. +ccess a #arge fi#e from your f#oppy $$ for e*amp#e, open a =66$ki#obyte te*t fi#e in a te*t editor. (he first time, you wi## see the #ight on your f#oppy turning on, and you wi## wait. (he f#oppy disk is e*treme#y s#ow, so it wi## take !6 seconds to #oad the fi#e. 8ow, c#ose the editor and open the same fi#e again. (he second time 1don&t wait =6 minutes or do a #ot of disk access between the two tries2 you won&t see the #ight turning on, and you won&t wait. 
The operating system checked into its memory cache for the floppy disk and found what it was looking for. So instead of waiting 20 seconds, the data was found in a memory subsystem much faster than when you first tried it (one access to the floppy disk takes 120 milliseconds, while one access to the main memory takes around 60 nanoseconds -- that's a lot faster). You could have run the same test on your hard disk, but it's more evident on the floppy drive because it's so slow.

To give you the big picture of it all, here's a list of a normal caching system:

L1 cache - Memory accesses at full microprocessor speed (10 nanoseconds, 4 kilobytes to 16 kilobytes in size)
L2 cache - Memory access of type SRAM (around 20 to 30 nanoseconds, 128 kilobytes to 512 kilobytes in size)
Main memory - Memory access of type RAM (around 60 nanoseconds, 32 megabytes to 128 megabytes in size)
Hard disk - Mechanical, slow (around 12 milliseconds, 1 gigabyte to 10 gigabytes in size)
Internet - Incredibly slow (between 1 second and 3 days, unlimited size)

As you can see, the L1 cache caches the L2 cache, which caches the main memory, which can be used to cache the disk subsystems, and so on.

Cache Technology
One common question asked at this point is, "Why not make all of the computer's memory run at the same speed as the L1 cache, so no caching would be required?" That would work, but it would be incredibly expensive. The idea behind caching is to use a small amount of expensive memory to speed up a large amount of slower, less-expensive memory.

In designing a computer, the goal is to allow the microprocessor to run at its full speed as inexpensively as possible. A 500-MHz chip goes through 500 million cycles in one second (one cycle every two nanoseconds). Without L1 and L2 caches, an access to the main memory takes 60 nanoseconds, or about 30 wasted cycles accessing memory.

When you think about it, it is kind of incredible that such relatively tiny amounts of memory can maximize the use of much larger amounts of memory. Think about a 256-kilobyte L2 cache that caches 64 megabytes of RAM. In this case, 256,000 bytes efficiently caches 64,000,000 bytes. Why does that work?

In computer science, we have a theoretical concept called locality of reference. It means that in a fairly large program, only small portions are ever used at any one time. As strange as it may seem, locality of reference works for the huge majority of programs. Even if the executable is 10 megabytes in size, only a handful of bytes from that program are in use at any one time, and their rate of repetition is very high. On the next page, you'll learn more about locality of reference.

Locality of Reference
Let's take a look at the following pseudo-code to see why locality of reference works (see How C Programming Works to really get into it):

    Output to screen "Enter a number between 1 and 100"
    Read input from user
    Put value from user in variable X
    Put value 100 in variable Y
    Put value 1 in variable Z
    Loop Y number of times
       Divide Z by X
       If the remainder of the division = 0
          then output "Z is a multiple of X"
       Add 1 to Z
       Return to loop
    End

This small program asks the user to enter a number between 1 and 100. It reads the value entered by the user. Then, the program divides every number between 1 and 100 by the number entered by the user. It checks if the remainder is 0 (modulo division). If so, the program outputs "Z is a multiple of X" (for example, 12 is a multiple of 6), for every number between 1 and 100. Then the program ends.

Even if you don't know much about computer programming, it is easy to understand that, of the lines of this program, the loop part (lines 7 to 9) is executed 100 times. All of the other lines are executed only once. Lines 7 to 9 will run significantly faster because of caching.

This program is very small and can easily fit entirely in the smallest of L1 caches, but let's say this program is huge. The result remains the same. When you program, a lot of action takes place inside loops. A word processor spends 95 percent of the time waiting for your input and displaying it on the screen. This part of the word-processor program is in the cache.

This 95%-to-5% ratio (approximately) is what we call the locality of reference, and it's why a cache works so efficiently. This is also why such a small cache can efficiently cache such a large memory system. You can see why it's not worth it to construct a computer with the fastest memory everywhere. We can deliver 95 percent of this effectiveness for a fraction of the cost.