
Cache Memory
By Suparna Dutta

Memory Hierarchy

• Goal: Illusion of large, fast, cheap memory.
• Solution: A smaller, faster cache memory is placed between the CPU and DRAM.
Memory Hierarchy

• In computer system design, the memory hierarchy is an enhancement that organizes memory so as to minimize access time. The memory hierarchy was developed based on a program behavior known as locality of reference.

Memory Hierarchy Properties
Cache Performance

• A cache miss occurs when the cache does not hold the requested data. A cache hit occurs when the cache successfully finds the requested data, satisfying the access.
• The performance of cache memory is frequently measured in terms of a quantity called the hit ratio (see the sketch below).
Hit ratio (H) = hits / (hits + misses) = no. of hits / total accesses
Miss ratio = misses / (hits + misses) = no. of misses / total accesses = 1 - H

Types of misses

Compulsory—The very first access to a block cannot be in the cache, so the block must be brought into the cache. Compulsory misses are those that occur even with an infinite-sized cache.
Capacity—If the cache cannot contain all the blocks needed during execution of a program, capacity misses (in addition to compulsory misses) will occur because of blocks being discarded and later retrieved.
Conflict—If the block placement strategy is not fully associative, conflict misses (in addition to compulsory and capacity misses) will occur because a block may be discarded and later retrieved if multiple blocks map to its set and accesses to the different blocks are intermingled.
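To make the ratio definitions concrete, here is a minimal Python sketch; the access counts (80 hits, 20 misses) are invented purely for illustration.

    # Hit and miss ratio from raw access counts (illustrative numbers).
    hits = 80
    misses = 20
    total_accesses = hits + misses

    hit_ratio = hits / total_accesses                  # H = 80 / 100 = 0.8
    miss_ratio = misses / total_accesses               # 20 / 100 = 0.2
    assert abs(miss_ratio - (1 - hit_ratio)) < 1e-12   # miss ratio = 1 - H

    print(f"Hit ratio H = {hit_ratio}, miss ratio = {miss_ratio}")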

Cache Effective Access Time

• EAT = h × (Cache Access Time) + (1 − h) × (Memory Access Time)
where h = hit ratio
For a multi-level cache:
Tavg = H1 × C1 + (1 − H1) × (H2 × C2 + (1 − H2) × M)
H1 is the hit rate in the L1 caches.
H2 is the hit rate in the L2 cache.
C1 is the time to access information in the L1 caches.
C2 is the miss penalty to transfer information from the L2 cache to an L1 cache.
M is the miss penalty to transfer information from the main memory to the L2 cache.
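A short numeric sketch of both formulas; the hit rates and access times below are assumed values chosen only to show the arithmetic, not figures from the slides.

    # Single-level effective access time: EAT = h*C + (1 - h)*M
    h, cache_time, mem_time = 0.95, 2, 100     # assumed: 95% hit rate, 2 ns cache, 100 ns memory
    eat = h * cache_time + (1 - h) * mem_time
    print(f"Single-level EAT = {eat:.2f} ns")  # 0.95*2 + 0.05*100 = 6.90 ns

    # Two-level cache: Tavg = H1*C1 + (1 - H1)*(H2*C2 + (1 - H2)*M)
    H1, C1 = 0.90, 1                           # assumed L1 hit rate and access time
    H2, C2 = 0.80, 10                          # assumed L2 hit rate and L2->L1 miss penalty
    M = 100                                    # assumed main-memory miss penalty
    tavg = H1 * C1 + (1 - H1) * (H2 * C2 + (1 - H2) * M)
    print(f"Two-level Tavg = {tavg:.2f} ns")   # 0.9*1 + 0.1*(0.8*10 + 0.2*100) = 3.70 ns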
Cache Mapping

Direct Mapping

• The simplest technique, known as direct mapping, maps each block of main memory into only one possible cache line; each memory block is assigned to a specific line in the cache.
• If a line is already occupied by a memory block when a new block needs to be loaded, the old block is evicted.

Fully Associative Mapping

• A block of main memory can map to any line of the cache that is freely available at that moment.
• This makes fully associative mapping more flexible than direct mapping.
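As a rough illustration of the two placement rules, the sketch below computes where a block could go under each policy; the cache size, block numbers, and free-line list are assumptions for illustration only.

    NUM_LINES = 128                      # assumed cache size in lines

    def direct_mapped_line(block_number: int) -> int:
        """Direct mapping: block i can live only in line (i mod number of lines)."""
        return block_number % NUM_LINES

    def fully_associative_line(block_number: int, free_lines: list[int]) -> int:
        """Fully associative: any free line will do; here we simply take the first one."""
        return free_lines[0]

    print(direct_mapped_line(300))                   # 300 mod 128 = 44
    print(fully_associative_line(300, [5, 17, 90]))  # any free line, e.g. 5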

K-Way Set Associative Mapping

• In k-way set associative mapping, cache lines are grouped into sets, where each set contains k lines.
• A particular block of main memory can map to only one particular set of the cache.
• However, within that set, the memory block can map to any cache line that is freely available, as in fully associative mapping.

Numericals

A computer has an 8 GB memory with a 64-bit word size. Each block of memory stores 16 words. The computer has a direct-mapped cache of 128 blocks. The computer uses word-level addressing. What is the address format? If we change the cache to a 4-way set associative cache, what is the new address format?
Answers: With 8 GB and a 64-bit word size, there are 8 GB / (8 bytes/word) = 1 GW of memory. This requires 30 bits for an address. Of the 30 bits, we need 4 bits for the word on the line and 7 bits for the line number, leaving 30 − (7 + 4) = 19 bits for the tag.
So the address format is 19 – 7 – 4.
If we instead have a 4-way set associative cache, each set has 4 lines, so there will be 128 / 4 = 32 sets. We then need only 5 bits for the set number, leaving 30 − (5 + 4) = 21 bits for the tag.
So the address format is 21 – 5 – 4.
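The bit counts in the worked example can also be reproduced programmatically; the sketch below mirrors the parameters of the numerical (8 GB memory, 64-bit words, 16 words per block, 128-block cache) and is only a check of the arithmetic.

    import math

    memory_bytes    = 8 * 2**30      # 8 GB
    bytes_per_word  = 8              # 64-bit words
    words_per_block = 16
    cache_blocks    = 128

    total_words  = memory_bytes // bytes_per_word      # 2**30 words
    address_bits = int(math.log2(total_words))         # 30
    word_bits    = int(math.log2(words_per_block))      # 4

    # Direct mapped: the index selects one of 128 lines.
    line_bits = int(math.log2(cache_blocks))            # 7
    tag_bits  = address_bits - line_bits - word_bits    # 19
    print(f"Direct mapped: {tag_bits} - {line_bits} - {word_bits}")

    # 4-way set associative: the index selects one of 128/4 = 32 sets.
    sets     = cache_blocks // 4
    set_bits = int(math.log2(sets))                      # 5
    tag_bits = address_bits - set_bits - word_bits       # 21
    print(f"4-way set associative: {tag_bits} - {set_bits} - {word_bits}")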
Techniques to reduce misses

1. Larger block size to reduce miss rate—The simplest way to reduce the miss rate is to take advantage of spatial locality and increase the block size. Larger blocks reduce compulsory misses, but they also increase the miss penalty. Because larger blocks lower the number of tags, they can slightly reduce static power. Larger block sizes can also increase capacity or conflict misses, especially in smaller caches. Choosing the right block size is a complex trade-off that depends on the size of the cache and the miss penalty.
2. Bigger caches to reduce miss rate—The obvious way to reduce capacity misses is to increase cache capacity. Drawbacks include potentially longer hit time of the larger cache memory and higher cost and power. Larger caches increase both static and dynamic power.
3. Higher associativity to reduce miss rate—Obviously, increasing associativity reduces conflict misses. Greater associativity can come at the cost of increased hit time. As we will see shortly, associativity also increases power consumption.
4. Multilevel caches to reduce miss penalty—A difficult decision is whether to make the cache hit time fast, to keep pace with the high clock rate of processors, or to make the cache large to reduce the gap between the processor accesses and main memory accesses. Adding another level of cache between the original cache and memory simplifies the decision. The first-level cache can be small enough to match a fast clock cycle time, yet the second-level (or third-level) cache can be large enough to capture many accesses that would go to main memory.
5. Giving priority to read misses over writes to reduce miss penalty—A write buffer is a good place to implement this optimization. Write buffers create hazards because they hold the updated value of a location needed on a read miss—that is, a read-after-write hazard through memory. One solution is to check the contents of the write buffer on a read miss. If there are no conflicts, and if the memory system is available, sending the read before the writes reduces the miss penalty. Most processors give reads priority over writes. This choice has little effect on power consumption.
6. Avoiding address translation during indexing of the cache to reduce hit time—Caches must cope with the translation of a virtual address from the processor to a physical address to access memory.

Virtual Memory

• In this scheme, whenever some pages need to be loaded into main memory for execution and enough memory is not available for them, then instead of stopping those pages from entering main memory, the OS searches for the areas of RAM that have been least recently used or are not referenced, and copies them into secondary memory to make space for the new pages in main memory.
• Since all of this happens automatically, the computer appears to have unlimited RAM.
• Demand paging is a popular method of virtual memory management. In demand paging, the pages of a process which are least used get stored in secondary memory.
• A page is copied into main memory when it is demanded, i.e., when a page fault occurs. There are various page replacement algorithms which are used to determine which pages will be replaced. We will discuss each of them later in detail.
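A minimal demand-paging sketch: a page is brought into memory only when it is referenced, and a page fault is counted when the referenced page is not resident. The frame count, reference string, and eviction choice below are assumptions for illustration.

    # Demand paging: bring a page in only when it is referenced (illustrative sketch).
    frames: set[int] = set()     # pages currently resident in main memory
    NUM_FRAMES = 3               # assumed number of physical frames
    page_faults = 0

    for page in [1, 2, 3, 1, 4]:              # assumed reference string
        if page not in frames:
            page_faults += 1                  # page fault: page must be fetched from disk
            if len(frames) >= NUM_FRAMES:
                frames.pop()                  # evict some resident page (policy chosen elsewhere)
            frames.add(page)

    print(f"Page faults: {page_faults}")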
Virtual Memory

Advantages of Virtual Memory
The degree of multiprogramming is increased.
Users can run large applications with less real RAM.
There is no need to buy more RAM.
Disadvantages of Virtual Memory
The system becomes slower since swapping takes time.
It takes more time to switch between applications.
The user has less hard disk space for their own use.

Memory Replacement Policies

First In First Out (FIFO):
The page to be replaced is the "oldest" page in memory, the one which was loaded before all the others.
Least Recently Used (LRU):
The page to be replaced is the one which has not been referenced since all the others have been referenced.
Last In First Out (LIFO):
The page to be replaced is the one most recently loaded into memory.
Least Frequently Used (LFU):
The page to be replaced is the one used least often of the pages currently in memory.
Optimal (OPT or MIN):
The page to be replaced is the one that will not be used for the longest period of time. This algorithm requires future knowledge of the reference string, which is not usually available; thus, this policy is used for comparison studies.
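As a concrete instance of one of these policies, here is a minimal LRU page-fault counter; the reference string and frame count are assumed for illustration.

    from collections import OrderedDict

    def lru_faults(refs, frames):
        """Least Recently Used: evict the page that has gone unreferenced the longest."""
        recency, faults = OrderedDict(), 0        # insertion order doubles as recency order
        for page in refs:
            if page in recency:
                recency.move_to_end(page)         # hit: refresh its recency
            else:
                faults += 1                       # miss: the page must be loaded
                if len(recency) == frames:
                    recency.popitem(last=False)   # least recently used page is at the front
                recency[page] = True
        return faults

    print(lru_faults([7, 0, 1, 2, 0, 3, 0, 4], frames=3))   # assumed reference string; prints 6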

Example on FIFO

Belady's anomaly shows that it is possible to have more page faults when increasing the number of page frames while using the First In First Out (FIFO) page replacement algorithm. For example, with the reference string 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4 and 3 frames, we get 9 total page faults, but if we increase the number of frames to 4, we get 10 page faults.
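The anomaly can be checked with a short FIFO fault counter; this sketch is not part of the original slide, and the expected counts simply reproduce the figures quoted above.

    from collections import deque

    def fifo_faults(refs, frames):
        """FIFO: evict the page that has been resident the longest."""
        queue, resident, faults = deque(), set(), 0
        for page in refs:
            if page not in resident:
                faults += 1
                if len(resident) == frames:
                    resident.discard(queue.popleft())   # evict the oldest resident page
                queue.append(page)
                resident.add(page)
        return faults

    refs = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
    for frames in (3, 4):
        print(f"{frames} frames -> {fifo_faults(refs, frames)} page faults")
    # Expected: 3 frames -> 9 faults, 4 frames -> 10 faults (Belady's anomaly).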
Optimal Page Replacement Policies

LRU Example
