
Memory Systems and Organization

Lecture – 4.2

Cache Mapping
CACHE MEMORY MAPPING
There are three commonly used methods to translate main memory addresses to cache memory addresses:
Direct-Mapped Cache
Associative-Mapped Cache
Set-Associative-Mapped Cache
The choice of cache mapping scheme affects cost and performance, and there is no single best method that is appropriate for all situations.
Direct Mapped Caching

 For each item of data at the lower level, there is exactly one location in the cache where it might be – so many items at the lower level must share locations in the upper level.

 Address mapping:
(block address) modulo (# of blocks in the cache)
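This mapping rule can be sketched in a few lines of Python; the 8-frame cache size below is a hypothetical example, not a value from these notes:

```python
# Sketch of direct-mapped placement, assuming a hypothetical 8-frame cache.
NUM_BLOCKS = 8  # number of block frames in the cache

def cache_index(block_address: int) -> int:
    # The one frame where this memory block may live.
    return block_address % NUM_BLOCKS

# Blocks 5, 13, and 21 all map to frame 5, so they contend for it.
print(cache_index(5), cache_index(13), cache_index(21))  # 5 5 5
```

Because the index is a pure modulo, any two blocks whose addresses differ by a multiple of NUM_BLOCKS evict each other.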
Example
Assume the CPU generates 32-bit addresses, and we have a 1K-word (4 KB) direct-mapped cache with a block size of 4 bytes (1 word). In other words, each block associated with a cache tag holds 4 bytes (Row 1).
With a block size of 4 bytes, the 2 least significant bits of the address are used as the byte select within the cache block.
Since the cache size is 1K words, the upper 32 − (10 + 2) = 20 bits of the address are stored as the cache tag.
The remaining 10 address bits in the middle, bits 2 through 11, are used as the cache index to select the proper cache entry.
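The field extraction for this cache can be sketched as bit operations; the sample address is arbitrary:

```python
def split_address(addr: int):
    # 32-bit address -> (tag, index, offset) for a direct-mapped cache
    # with 1K blocks of 4 bytes each.
    offset = addr & 0x3          # bits 1-0: byte select within the block
    index = (addr >> 2) & 0x3FF  # bits 11-2: 10-bit cache index
    tag = addr >> 12             # bits 31-12: 20-bit cache tag
    return tag, index, offset

tag, index, offset = split_address(0xABCD1234)
```

A lookup then compares `tag` against the tag stored at entry `index` (and checks the valid bit) to decide hit or miss.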

Direct Mapped Cache Example
 One word/block, cache size = 1K words (4 KB)
[Figure: a 1024-entry direct-mapped cache. Address bits 31–12 form the 20-bit tag, bits 11–2 the 10-bit index, and bits 1–0 the byte offset. Each of the 1024 entries (index 0–1023) holds a valid bit, a 20-bit tag, and a 32-bit data word; a hit is signaled when the indexed entry is valid and its stored tag matches the address tag.]
Cache Example - 2
These notes use an example of a cache to illustrate each of the mapping functions. The characteristics of the cache used are:
Size: 64 KB
Block size: 4 bytes – i.e. the cache has 16K (2^14) lines of 4 bytes
Address bus: 24-bit – i.e., 16 MB of main memory, divided into 16M/4 = 4M blocks of 4 words each
Direct Mapping Example - 2

Tag (s − r) | Line or slot (r) | Word (w)
     8      |        14        |    2

24-bit address
2-bit word identifier (4-byte block)
8-bit tag (= 22 − 14)
14-bit slot or line
No two blocks that map to the same line have the same tag
Check the contents of the cache by finding the line and comparing the tag
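The same bit-slicing idea applies to this 24-bit address; a minimal sketch:

```python
def split24(addr: int):
    # 24-bit address -> (tag, line, word) for the 64 KB example cache.
    word = addr & 0x3            # 2-bit word identifier within the block
    line = (addr >> 2) & 0x3FFF  # 14-bit line (slot) number
    tag = (addr >> 16) & 0xFF    # 8-bit tag
    return tag, line, word

tag, line, word = split24(0xFFFFFF)
```

Note that 2 + 14 + 8 = 24, so the three fields exactly cover the address.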
Direct Mapping Cache Organization
Associative Mapping Characteristics
A main memory block can load into any line of the cache
The memory address is interpreted as:
 Least significant w bits = word position within the block
 Most significant s bits = tag used to identify which block is stored in a particular line of the cache
Every line's tag must be examined for a match
Cache searching becomes more expensive and slower
Associative Mapping Address Structure Example

Tag – s bits (22 in example) | Word – w bits (2 in example)

 The 22-bit tag is stored with each 32-bit block of data
 Compare the address tag field with each tag entry in the cache to check for a hit
 The least significant 2 bits of the address identify which of the four 8-bit words is required from the 32-bit data block
Fully Associative Cache Organization
Associative Mapping Summary
Address length = (s + w) bits
Number of addressable units = 2^(s+w) words or bytes
Block size = line size = 2^w words or bytes
Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
Size of tag = s bits
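Plugging the example's values (s = 22, w = 2) into these formulas confirms they are consistent:

```python
s, w = 22, 2                        # tag bits and word bits from the example
address_length = s + w              # 24-bit addresses
addressable_units = 2 ** (s + w)    # 16M words or bytes
block_size = 2 ** w                 # 4 words per block
blocks_in_memory = addressable_units // block_size
assert blocks_in_memory == 2 ** s   # 4M blocks, one tag value per block
```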


Set Associative Mapping Characteristics

Address length is (s + w) bits

The cache is divided into a number of sets, v = 2^d

d – number of set bits

A cache with k lines per set is called a k-way set-associative mapping

Number of lines in the cache = v·k = k·2^d

Size of tag = (s − d) bits
Set Associative Mapping (continued)

 Hybrid of Direct and Associative mapping
 k = 1: this is basically direct mapping
 v = 1: this is fully associative mapping
 A given block maps to any line within its specified set – e.g. block B can be in any line of set i.
 2 lines per set is the most common organization.
 Called 2-way set-associative mapping
 A given block can be in one of 2 lines in only one specific set
 A significant improvement over direct mapping
K-Way Set Associative Cache Organization
Set Associative Example

 Answer the question based on a 2-way set-associative cache with 4K lines, each line containing 16 words, with a main memory of size 256M words (28-bit address):

 What are the tag, set, and word field lengths?

4K lines / 2 ways = 2K = 2^11 sets → 11 set bits; 16 words per line → 4 word bits; tag = 28 − 11 − 4 = 13 bits.

Tag (s − d) | Set (d) | Word (w)
    13      |   11    |    4
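The same arithmetic can be checked with a short script:

```python
import math

total_lines = 4 * 1024   # 4K lines
ways = 2                 # 2-way set associative
words_per_line = 16
address_bits = 28        # 256M-word address space

sets = total_lines // ways                      # 2048 sets
set_bits = int(math.log2(sets))                 # 11
word_bits = int(math.log2(words_per_line))      # 4
tag_bits = address_bits - set_bits - word_bits  # 13
```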
Replacement Algorithms
There must be a method for selecting which line in the cache is going to be replaced when there's no room for a new line

Direct mapping
 There is no need for a replacement algorithm with direct mapping
 Each block only maps to one line
 Replace that line
Associative & Set Associative Replacement Algorithms

 Least Recently Used (LRU)
 Replace the block that hasn't been touched in the longest period of time

 Least Frequently Used (LFU) – replace the block which has had the fewest hits

 Random – only slightly lower performance than use-based algorithms such as LRU, FIFO, and LFU
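LRU for one set can be sketched with an ordered dictionary; this toy model tracks only tags and recency, not real cache lines:

```python
from collections import OrderedDict

class LRUSet:
    """One k-way set with least-recently-used replacement (sketch)."""
    def __init__(self, k: int):
        self.k = k
        self.lines = OrderedDict()  # tag -> data, least recently used first

    def access(self, tag, data=None):
        if tag in self.lines:             # hit: mark as most recently used
            self.lines.move_to_end(tag)
            return True
        if len(self.lines) >= self.k:     # miss on a full set:
            self.lines.popitem(last=False)  # evict the least recently used
        self.lines[tag] = data
        return False

s = LRUSet(2)
s.access('A'); s.access('B'); s.access('A')
s.access('C')  # set full: evicts B, the least recently used tag
assert 'B' not in s.lines and 'A' in s.lines
```

Swapping `popitem(last=False)` for a frequency counter would give LFU, and removing the `move_to_end` call turns this into FIFO.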
Virtual Memory
Review: Memory Hierarchy
 By the principle of locality, the system pretends as if a large memory is available in the cheapest technology at the speed offered by the fastest technology.

[Hierarchy: Registers → Cache → Memory]

 Question: What if we want to support programs that require more memory than what's available in the system?
Memory Hierarchy

[Hierarchy: Registers → Cache → Memory → Virtual Memory]

 Answer: Pretend we had something bigger: Virtual Memory
Virtual Memory
Use main memory as a "cache" for secondary memory.
A program can "pretend" it has main memory of the size of the disk – which is bigger than the actual physical memory
 Allows efficient and safe sharing of memory among multiple programs
 Provides the ability to easily run programs larger than the size of physical memory
What makes it work? – again, the Principle of Locality
 A program is likely to access a relatively small portion of its address space during any period of time
Virtual Memory
Partition the HDD into pages (2 KB, 4 KB, 8 KB, …) and map a few to RAM, the others to disk.
Keep hot pages in RAM.
Assume:
 RAM = 1 GB (2^30 bytes)
 HDD = 4 GB (2^32 bytes)
 page size is 2^12 = 4 KB
Then the number of physical pages allowed in memory is 2^18, the physical address space is 1 GB, and the virtual address space is 4 GB.
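These sizing figures follow directly from the assumptions:

```python
page_size = 2 ** 12  # 4 KB pages
ram = 2 ** 30        # 1 GB of physical memory
hdd = 2 ** 32        # 4 GB of virtual (disk-backed) space

physical_pages = ram // page_size  # page frames that fit in RAM
virtual_pages = hdd // page_size   # pages in the virtual address space
assert physical_pages == 2 ** 18
assert virtual_pages == 2 ** 20
```

Only 2^18 of the 2^20 virtual pages can be resident at once, so the OS must track which pages are in RAM and which are on disk.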
HOW DO WE KNOW WHETHER A PAGE IS PRESENT IN RAM?
Address Translation
 A virtual address is translated to a physical address by a combination of hardware and software

Virtual Address (VA), bits 31 … 12 | 11 … 0:
Virtual page number | Page offset
        ↓ Translation
Physical page number | Page offset
Physical Address (PA), bits 29 … 12 | 11 … 0

 So each memory request first requires an address translation from the virtual space to the physical space
 A virtual memory miss (i.e., when the page is not in physical memory) is called a page fault
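The translation step can be sketched with a dictionary standing in for the page table; the page-table contents and addresses below are made-up illustrations:

```python
PAGE_OFFSET_BITS = 12  # 4 KB pages

# Hypothetical page table: virtual page number -> physical frame number,
# or None if the page is only on disk.
page_table = {0x12345: 0x00ABC, 0x12346: None}

def translate(vaddr: int) -> int:
    vpn = vaddr >> PAGE_OFFSET_BITS             # virtual page number
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)  # page offset, unchanged
    frame = page_table.get(vpn)
    if frame is None:
        raise RuntimeError("page fault")  # OS must bring the page into RAM
    return (frame << PAGE_OFFSET_BITS) | offset

pa = translate(0x12345678)  # frame 0x00ABC, offset 0x678
```

Note the offset passes through untouched; only the page-number bits are remapped.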
Address Translation

[Figure: the CPU issues a virtual address (p, d); the page table maps virtual page p to physical frame f, giving the physical address (f, d) sent to memory.]
Address Translation Mechanisms

[Figure: the virtual page number indexes the page table (held in main memory). Each entry holds a valid bit and a physical page base address; valid entries point to pages in main memory, invalid entries to pages on disk storage. The physical page number is concatenated with the offset to form the physical address.]
Address Translation Mechanisms
Thus it takes an extra memory access to translate a VA to a PA

[Figure: CPU → (VA) → Translation → (PA) → Cache → (on a miss) → Main Memory; a cache hit returns data directly.]
Virtual Addressing with a Cache

 The hardware fix:
 • Use a Translation Lookaside Buffer (TLB) – a small cache that keeps track of recently used address mappings to avoid having to do a page table lookup
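A toy model of a TLB in front of the page table; the dictionary-based structure and the simple oldest-entry eviction are illustrative assumptions, not how a real hardware TLB is built:

```python
tlb = {}  # small cache: virtual page number -> physical frame number
MAX_TLB_ENTRIES = 16

def lookup(vpn: int, page_table: dict) -> int:
    if vpn in tlb:                   # TLB hit: no page-table access needed
        return tlb[vpn]
    frame = page_table[vpn]          # TLB miss: walk the page table
    if len(tlb) >= MAX_TLB_ENTRIES:
        tlb.pop(next(iter(tlb)))     # evict the oldest entry (FIFO-style)
    tlb[vpn] = frame                 # install the translation for next time
    return frame

page_table = {0x12345: 0x00ABC}
lookup(0x12345, page_table)  # miss: fills the TLB from the page table
lookup(0x12345, page_table)  # hit: served from the TLB
```

The second lookup never touches `page_table`, which is the whole point: most translations are served from the small, fast TLB.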
Making Address Translation Fast

[Figure: the TLB holds a valid bit, a tag (part of the virtual page number), and a physical page base address for a few recently used translations. On a TLB miss, the full page table (in physical memory) is consulted; each page-table entry holds a valid bit and a physical page base address, with invalid entries pointing to disk storage.]
A TLB in the Memory Hierarchy

[Figure: CPU → TLB Lookup → (PA) → Cache → Main Memory, with timing annotations ¼ t (hit) and ¾ t (miss); on a TLB miss the translation goes through the page table before the cache access.]

 A TLB miss – is it a page fault or merely a TLB miss?
 If the page is loaded into main memory, then the TLB miss can be handled (in hardware or software) by loading the translation information from the page table into the TLB
- Takes 10's of cycles to find and load the translation info into the TLB
 If the page is not in main memory, then it's a true page fault
- Takes 1,000's of cycles to service a page fault
 TLB misses are much more frequent than true page faults
TLB Event Combinations

TLB  | Page Table | Cache    | Possible? Under what circumstances?
Hit  | Hit        | Hit      |
Hit  | Hit        | Miss     |
Miss | Hit        | Hit      |
Miss | Hit        | Miss     |
Miss | Miss       | Miss     |
Hit  | Miss       | Miss/Hit |
Miss | Miss       | Hit      |
TLB Event Combinations

TLB  | Page Table | Cache    | Possible? Under what circumstances?
Hit  | Hit        | Hit      | Yes – what we want!
Hit  | Hit        | Miss     | Yes – although the page table is not checked if the TLB hits
Miss | Hit        | Hit      | Yes – TLB miss, PA in page table
Miss | Hit        | Miss     | Yes – TLB miss, PA in page table, but data not in cache
Miss | Miss       | Miss     | Yes – page fault
Hit  | Miss       | Miss/Hit | Impossible – TLB translation not possible if page is not present in memory
Miss | Miss       | Hit      | Impossible – data not allowed in cache if page is not in memory
Address Translation via TLB Example
An address translation process converts a 32-bit virtual address to a 32-bit physical address. Memory is byte-addressable with 4 KB pages. A 16-entry, direct-mapped TLB is used. Specify the components of the virtual and physical addresses and the width of the various TLB fields.

Solution
 The 32-bit virtual address splits into a 20-bit virtual page number and a 12-bit byte offset.
 With a 16-entry direct-mapped TLB, the low 4 bits of the virtual page number index the TLB and the remaining 16 bits are stored as the TLB tag. A TLB hit requires that the tags match and the entry is valid.
 The 32-bit physical address consists of a 20-bit physical page number and the same 12-bit byte offset.
 TLB word width = 16-bit tag + 20-bit physical page number + 1 valid bit + other flags ≈ 37 bits.
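The field widths in this solution can be derived mechanically:

```python
va_bits, pa_bits, offset_bits, entries = 32, 32, 12, 16

index_bits = 4                         # log2(16 TLB entries), direct-mapped
vpn_bits = va_bits - offset_bits       # 20-bit virtual page number
tag_bits = vpn_bits - index_bits       # 16-bit TLB tag
ppn_bits = pa_bits - offset_bits       # 20-bit physical page number
entry_width = tag_bits + ppn_bits + 1  # + valid bit = 37 (plus other flags)
```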
Example - 3
• Assume the TLB is a direct-mapped structure with 64 entries. The page size is 4 KB, the virtual address is 48 bits wide, and the physical address is 44 bits wide.
• Draw the TLB structure in detail. Specify which parts of the virtual address are used for the virtual page number and page offset, how to index the TLB, and how to form the physical address.
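A sketch of how the field widths for this example would be computed, following the same method as the previous worked example:

```python
import math

va_bits, pa_bits = 48, 44
page_size = 4 * 1024
entries = 64

offset_bits = int(math.log2(page_size))  # 12-bit page offset
vpn_bits = va_bits - offset_bits         # 36-bit virtual page number
index_bits = int(math.log2(entries))     # low 6 VPN bits index the TLB
tag_bits = vpn_bits - index_bits         # 30-bit TLB tag
ppn_bits = pa_bits - offset_bits         # 32-bit physical page number
```

The physical address is then the 32-bit physical page number concatenated with the unchanged 12-bit page offset.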
Exercise
Consider a computer with a memory address of 20 bits. The computer uses virtual memory with pages of 1 KB. The main memory has a capacity of 256 KB. Calculate:
a) What is the virtual address format?
b) How many entries does the (one-level) page table have?
c) How many frames does the main memory have?
d) Which fields are included in the page table? What is the usage of these fields?
