
TUTORIAL 12
Course Title: Computer Architecture

1. A CPU with a 32-bit main memory address has a 4-way set-associative cache of
size 64 KB. The block size is 8 words (each word is 4 bytes).
i) Show the address subdivision (tag/index/offset) of a main memory reference (show
stepwise calculations)
ii) Draw the structure of the cache organization
iii) Calculate the total size of the cache in bits

i) Show the address subdivision (tag/index/offset) of a main memory reference (show
stepwise calculations)

Block size (in bytes) = 8 words * 4 B/word = 32 B
No. of cache blocks = cache size / block size = 64 KB / 32 B = 2048 blocks
4-way set associative = 4 blocks/set
No. of sets = no. of cache blocks / blocks per set = 2048 / 4 = 512 sets
No. of bits for set index = log2(512) = 9 (to locate a set)
No. of bits for block offset = log2(32) = 5 (to locate a byte within a block)
No. of bits for tag = address bits - (set index bits + block offset bits) = 32 - (9 + 5) = 18
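
The stepwise calculation above can be sketched in Python as a quick check. The function name `cache_address_fields` and its parameter names are illustrative assumptions, not part of the tutorial, and the sketch assumes the block size and the number of sets are powers of two.

```python
def cache_address_fields(addr_bits, cache_bytes, ways, words_per_block, bytes_per_word):
    """Return (tag_bits, index_bits, offset_bits) for a set-associative cache.

    Hypothetical helper; assumes block size and set count are powers of two.
    """
    block_bytes = words_per_block * bytes_per_word   # 8 * 4 = 32 B
    num_blocks = cache_bytes // block_bytes          # 64 KB / 32 B = 2048
    num_sets = num_blocks // ways                    # 2048 / 4 = 512
    offset_bits = block_bytes.bit_length() - 1       # log2(32) = 5
    index_bits = num_sets.bit_length() - 1           # log2(512) = 9
    tag_bits = addr_bits - index_bits - offset_bits  # 32 - 9 - 5 = 18
    return tag_bits, index_bits, offset_bits


fields = cache_address_fields(32, 64 * 1024, 4, 8, 4)
print(fields)  # (18, 9, 5)
```

Using `int.bit_length() - 1` instead of a floating-point logarithm keeps the result exact for powers of two.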

Field    Tag    Set Index    Block Offset
Bits     18     9            5
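
To make the 18/9/5 subdivision concrete, the following sketch splits a 32-bit address into its three fields with shifts and masks. The function name `split_address` and the example address `0x12345678` are arbitrary assumptions chosen for illustration; they do not come from the tutorial.

```python
TAG_BITS, INDEX_BITS, OFFSET_BITS = 18, 9, 5  # field widths derived in part i)


def split_address(addr):
    """Split a 32-bit address into (tag, set_index, block_offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)                 # lowest 5 bits
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # next 9 bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                 # top 18 bits
    return tag, index, offset


tag, index, offset = split_address(0x12345678)  # arbitrary example address
print(hex(tag), index, offset)
```

Reassembling `(tag << 14) | (index << 5) | offset` gives back the original address, which is a convenient sanity check on the field widths.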

ii) Draw the structure of the cache organization


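         |     Way 0      |     Way 1      |     Way 2      |     Way 3      |
         | V | Tag | Data | V | Tag | Data | V | Tag | Data | V | Tag | Data |
Bits     | 1 | 18  | 256  | 1 | 18  | 256  | 1 | 18  | 256  | 1 | 18  | 256  |
Set 0    |
Set 1    |
...
Set 511  |

Data per block = 8 words * 4 B = 32 B = 32 * 8 = 256 bits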
iii) Calculate the total size of cache in bits
Bits for 1 block = valid + tag + data = (1 + 18 + 8*4*8) = 275 bits (8 words/block, 4 bytes/word, 8 bits/byte)
Total cache size (in bits) = 512 sets * 4 blocks/set * (1 + 18 + 8*4*8) bits/block
= 563200 bits
= 70400 bytes
= 68.75 KB (the 64 KB data cache carries 4.75 KB of overhead for valid bits and tags)
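
The total-size arithmetic above can be checked with a short sketch. The helper name `cache_total_bits` and the `valid_bits` default are assumptions made for illustration; only valid, tag and data bits are counted, as in the worked solution.

```python
def cache_total_bits(num_sets, ways, tag_bits, block_bytes, valid_bits=1):
    """Total storage in bits: every block holds valid + tag + data bits."""
    bits_per_block = valid_bits + tag_bits + block_bytes * 8  # 1 + 18 + 256 = 275
    return num_sets * ways * bits_per_block


total_bits = cache_total_bits(num_sets=512, ways=4, tag_bits=18, block_bytes=32)
total_bytes = total_bits // 8
overhead_kb = total_bytes / 1024 - 64  # storage beyond the 64 KB of data

print(total_bits, total_bytes, total_bytes / 1024, overhead_kb)
```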
2. A CPU has a base CPI of 2 without stalls, and I-cache and D-cache miss rates of 4% and 6%
respectively. The miss penalty of a memory reference is 100 clock cycles. Loads and stores
make up 30% of the instructions. Compute:
i) The CPI of the CPU with stalls, and how much faster the processor would run with a perfect
cache that never missed
ii) If the CPU is optimized so that the base CPI is 1, how much faster would a processor with a
perfect cache that never missed run?

i) Actual CPI of the CPU

Miss cycles/instr: I-cache = 0.04 * 100 = 4, D-cache = 0.30 * 0.06 * 100 = 1.80
Actual CPI = base CPI + I-cache miss cycles/instr + D-cache miss cycles/instr
= 2 + 4 + 1.80 = 7.80
Since there is no change in instruction count or clock rate, the ratio of the CPU
execution times equals the ratio of the CPIs:
= 7.80 / 2 = 3.9, i.e. a processor with a perfect cache that never missed would run 3.9
times faster than the processor with the real cache.

ii) If the CPU is optimized so that the base CPI is 1, how much faster would a processor with a
perfect cache that never missed run?

Optimized CPU actual CPI = 1 + 4 + 1.80 = 6.80
6.80 / 1 = 6.80, i.e. a processor with a perfect cache that never missed would run 6.8 times
faster than the processor with the real cache.
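
The CPI calculation for both parts can be sketched as a small function. The name `cpi_with_stalls` and its parameter names are illustrative assumptions; the model is the one used above, where every instruction is fetched through the I-cache and only loads and stores access the D-cache.

```python
def cpi_with_stalls(base_cpi, icache_miss_rate, dcache_miss_rate,
                    mem_instr_fraction, miss_penalty):
    """Effective CPI = base CPI + I-cache stall cycles + D-cache stall cycles."""
    icache_stalls = icache_miss_rate * miss_penalty
    dcache_stalls = mem_instr_fraction * dcache_miss_rate * miss_penalty
    return base_cpi + icache_stalls + dcache_stalls


for base in (2, 1):
    actual = cpi_with_stalls(base, 0.04, 0.06, 0.30, 100)
    # With equal instruction count and clock rate, speedup is the CPI ratio.
    print(f"base CPI {base}: actual CPI {actual:.2f}, "
          f"perfect-cache speedup {actual / base:.2f}x")
```

The speedup is simply the ratio of the two CPIs because instruction count and clock rate are unchanged.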

3. A CPU has a base CPI of 2 without stalls, and I-cache and D-cache miss rates of 2% and 4%
respectively. The miss penalty of a memory reference is 100 clock cycles. Loads and stores
make up 36% of the instructions. Compute:
i) The CPI of the CPU with stalls, and how much faster the processor would run with a perfect
cache that never missed
ii) If the CPU is optimized so that the base CPI is 1, how much faster would a processor with a
perfect cache that never missed run?

i) Actual CPI of the CPU

Miss cycles/instr: I-cache = 0.02 * 100 = 2, D-cache = 0.36 * 0.04 * 100 = 1.44
Actual CPI = base CPI + I-cache miss cycles/instr + D-cache miss cycles/instr
= 2 + 2 + 1.44 = 5.44
Since there is no change in instruction count or clock rate, the ratio of the CPU
execution times equals the ratio of the CPIs:
= 5.44 / 2 = 2.72, i.e. a processor with a perfect cache that never missed would run 2.72
times faster than the processor with the real cache.

ii) If the CPU is optimized so that the base CPI is 1, how much faster would a processor with a
perfect cache that never missed run?

Optimized CPU actual CPI = 1 + 2 + 1.44 = 4.44
4.44 / 1 = 4.44, i.e. a processor with a perfect cache that never missed would run 4.44 times
faster than the processor with the real cache.
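
One way to see why the perfect-cache speedup grows from 2.72 to 4.44 when the base CPI drops: the stall cycles per instruction (2 + 1.44 = 3.44) stay fixed, so they take up a larger share of every instruction's cycles. The sketch below computes that share; the function name `stall_fraction` is an illustrative assumption, and the share itself is a supplementary quantity that the tutorial does not ask for.

```python
def stall_fraction(base_cpi, icache_miss_rate, dcache_miss_rate,
                   mem_instr_fraction, miss_penalty):
    """Fraction of total cycles per instruction spent on memory stalls."""
    stall_cycles = (icache_miss_rate
                    + mem_instr_fraction * dcache_miss_rate) * miss_penalty
    return stall_cycles / (base_cpi + stall_cycles)


for base in (2, 1):
    share = stall_fraction(base, 0.02, 0.04, 0.36, 100)
    print(f"base CPI {base}: {share:.1%} of cycles are memory stalls")
```

With the numbers from question 3 this gives roughly 63% for a base CPI of 2 and roughly 77% for a base CPI of 1.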
