#define ARRAY_SIZE 4
The arrays S, P and Q have 4 entries each, and hold integer values (4 bytes at each entry).
// Initial values:
// Rp = 0x1A20, Rq = 0x1C60, Rs = 0x2BA0
// Ri = 4
Page 1 of 14
Name ______ Solutions _______
This code produces the following 12 memory accesses to the cache hierarchy:
LD 0x1A20
LD 0x1C60
ST 0x2BA0
LD 0x1A24
LD 0x1C64
ST 0x2BA4
LD 0x1A28
LD 0x1C68
ST 0x2BA8
LD 0x1A2C
LD 0x1C6C
ST 0x2BAC
We will construct multiple memory hierarchies and analyze the miss rates for each.
Question A-1
Consider the following cache hierarchy with an L1 connected to an L2, which is connected to
DRAM.
L2 is a write-back cache with respect to DRAM, so dirty data in L2 must be written back
to DRAM when the line is evicted.
Update the state of the L1 and L2 for each of the accesses. For each entry you can write the {tag,
index} for simplicity and circle the dirty data. The writeback column specifies the cache line
whose data is being written back to DRAM.
12 / 12 = 100%
12 / 12 = 100%
Question A-2
We add a Victim Cache next to the L1 cache:
All data evicted from L1 goes to L2 as before, but a copy is also retained in the Victim Cache.
Upon an L1 miss, the Victim Cache is checked first, before going to L2.
The line is brought into L1, and the line evicted from L1 is added to the Victim Cache. Victim
Caches are exclusive with respect to L1; that is, a cache line resides in either L1 or the Victim
Cache, never both.
AMAT = Hit Time + Miss Rate × Miss Penalty = 1.9 + 0.1 × 50 = 6.9 cycles
3 / 12 = 25%
What structure do you think George has in mind? Briefly describe the function of this structure,
and a recipe to allocate it and access it.
Prefetcher
There are essentially three cache lines, 0x1A2, 0x1C6, and 0x2BA, that are accessed repeatedly,
one after the other.
In Part A-2.2 you would have noticed that, at any given time, the direct-mapped L1 cache holds
one of these lines and the Victim Cache holds the other two.
The Victim Cache removes the conflict misses, but there are still compulsory misses the first
time each of these lines is accessed.
The idea is that once 0x1A2 is accessed, a prefetcher can bring in 0x1C6 and 0x2BA and place
them in the prefetch buffer. Upon any access, the cache should look up both the Victim Cache
and the prefetch buffer. In this case there will be only one miss, and every other access will be
a hit.
The page size is 256 bytes, and the byte-addressed machine uses 16-bit virtual addresses and
16-bit physical addresses. Do not worry about aliasing problems for this question.
8 sets × 2 ways × 8 B per line = 128 B
Now, we test the cache by accessing the following virtual address. We provide the
corresponding binary number for the virtual address.
0x0151 (0000000101010001)
The TLB is Fully Associative with 4 ways and LRU replacement. The table below shows the
current TLB states and the LRU stack for the TLB.
The Cache also uses the LRU replacement policy for each set. The LRU way bit represents the
way that is least recently used. (Note that this bit should also be updated if necessary.)
Update the TLB LRU state and the Cache state after accessing 0x0151.
Only fill in table entries whose value changes from the previous table. Write tags as
hexadecimal numbers. If the memory access is a cache hit, write "Hit" in the appropriate
entry; otherwise update the cache state as necessary.
Note that the cache uses physical tags.
Suppose we label the 16 cache lines from A to P, as shown in the figure below for our 2-way set
associative cache. Recall that our line size is 8 bytes.
Cache Configuration
16 bytes: F, N
32 bytes: F, N
64 bytes: F, N
16 bytes: B, D, F, H, J, L, N, P
32 bytes: B, F, J, N
64 bytes: F, N
We have two possible organizations for our Page Tables: Linear and Hierarchical as shown
below:
For the Hierarchical Page Table, suppose we want the L1 page table to fit exactly within one
page. What will be the size of the Linear and Hierarchical L2 Page Tables?
Offset = 6 bits
Linear PT Index = 10 bits
Linear Page Table Size = 2^10 × 2B = 2KB
Offset = 6 bits
L1 PT Size = 64B => L1 PT Index = log2(64B / 2B) = 5 bits
L2 PT Size = 2^5 × 2B = 64B
Linear PT size (2KB) + one data page (64B) + one system page (64B) = 2176B