The Hong Kong Polytechnic University

Department of Electrical and Electronic Engineering

EIE3343 Computer Systems Principles

Laboratory Exercise 2: Cache system

Objectives: To study how a cache system works.

After completing this experiment, you should know the organization of different cache systems and how a
cache system operates with its cache management system.

Software: The CacheSim simulation program

Introduction:

A cache is a high-speed memory system between the microprocessor and the DRAM memory system. This
buffer allows a computer system to function efficiently with DRAM that has a slower access speed. It
improves the overall performance of a memory system when data are accessed more than once.

The structure of a cache system is as follows. The system memory space is partitioned into several blocks.
Blocks are then grouped into several sets, and the blocks in the same set compete to occupy a fixed number
of block buffers (= the number of ways) in the cache. The data movement between the cache and system
memory is block-oriented.

As an example, Figure 1 shows a 4-way set-associative cache system. A cache entry consists of a cache
directory entry and the corresponding cache memory entry. The cache directory entry records which block is
currently held in the cache, and the data itself is stored in a buffer as the cache memory entry. The 3-bit
LRU entry determines which buffer in the selected set should be replaced after a cache miss. In the CacheSim
program, a counter is kept for each block (slot) of the cache memory to record how long the block has gone
without being accessed.
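
The counter-based replacement described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the CacheSim code: the names `touch` and `victim` and the list-of-ages representation are assumptions made for this example.

```python
# Illustrative sketch of counter-based LRU for ONE set (not the CacheSim source).
# ages[i] holds how many accesses to this set have passed since way i was used.

def touch(ages, way):
    """Age every slot in the set by one, then reset the slot just accessed."""
    for i in range(len(ages)):
        ages[i] += 1
    ages[way] = 0


def victim(ages):
    """Return the way that has gone longest without an access (the LRU way)."""
    return ages.index(max(ages))


ages = [0, 0, 0, 0]                 # a 4-way set, all counters cleared
for way in (0, 1, 2, 3, 1, 0):      # hypothetical access order
    touch(ages, way)

print(ages)           # [0, 1, 3, 2]
print(victim(ages))   # 2, since way 2 was used least recently
```

On a miss, the block in the way returned by `victim` would be the one replaced.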

When a datum is accessed, a cache system operates as follows:

1. Determine the set address and the tag address of the datum.
2. Check the tag fields of all cache directory entries of the corresponding set against the tag address
simultaneously.
3. If there is a match in any one of the tag fields and the corresponding valid bit is active, there is a
cache hit, or else a cache miss.
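
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the address is a word address, each block holds four words, and the function names `split_address` and `is_hit` are hypothetical. Real hardware compares all tags of a set in parallel, whereas this sketch checks them one after another.

```python
# Illustrative sketch of the lookup steps (not the CacheSim source).
# Assumptions: word addresses, 4 words per block, 16 sets by default.

def split_address(addr, num_sets=16, words_per_block=4):
    """Step 1: split a word address into (tag, set index, word offset)."""
    offset = addr % words_per_block      # lowest bits select the word in the block
    block = addr // words_per_block      # block number in main memory
    set_index = block % num_sets         # next bits select the set
    tag = block // num_sets              # remaining high bits form the tag
    return tag, set_index, offset


def is_hit(cache_set, tag):
    """Steps 2 and 3: hit if any way of the set is valid and its tag matches.

    cache_set is a list of (valid, tag) pairs, one pair per way.
    """
    return any(valid and stored == tag for valid, stored in cache_set)


print(split_address(0x20, num_sets=8))    # (1, 0, 0)
print(split_address(0x20, num_sets=16))   # (0, 8, 0)

ways = [(True, 3), (False, 1), (True, 0), (False, 0)]
print(is_hit(ways, 3))   # True: valid entry with a matching tag
print(is_hit(ways, 1))   # False: the tag matches but the valid bit is clear
```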

Figure 1: A 4-way set-associative cache memory.

Figure 2 summarizes the read and write policies that a cache system follows for each access result (hit or miss).

Figure 2: Cache read and write policy

The performance of a cache memory is measured by the Hit Ratio (HR) and the Effective Access Time (EAT):

$$\mathrm{HR} = \frac{\text{No. of hits}}{\text{Total No. of memory accesses}}$$

$$\mathrm{EAT} = \frac{(\text{No. of hits}) \times T_h + (\text{No. of misses}) \times T_m}{\text{Total No. of memory accesses}}$$

where $T_h$ and $T_m$ are, respectively, the access times for a hit and a miss.

If the access times for read and write operations are different, EAT can be defined as

$$\mathrm{EAT} = \frac{(\text{No. of read hits}) \times T_{h-r} + (\text{No. of write hits}) \times T_{h-w} + (\text{No. of read misses}) \times T_{m-r} + (\text{No. of write misses}) \times T_{m-w}}{\text{Total No. of memory accesses}}$$

where
$T_{h-r}$: the access time for a read hit
$T_{h-w}$: the access time for a write hit
$T_{m-r}$: the access time for a read miss
$T_{m-w}$: the access time for a write miss
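
As a worked illustration of the two definitions, the following Python sketch evaluates HR and EAT from the four access counts. The function names and the counts in the example are hypothetical; the default cycle values are the ones this lab sheet gives for CacheSim (1, 4, 81, and 84 cycles).

```python
# Illustrative sketch: HR and EAT from hit/miss counts (hypothetical counts).

def hit_ratio(read_hits, write_hits, read_misses, write_misses):
    """HR = number of hits / total number of memory accesses."""
    total = read_hits + write_hits + read_misses + write_misses
    return (read_hits + write_hits) / total


def effective_access_time(read_hits, write_hits, read_misses, write_misses,
                          t_hr=1, t_hw=4, t_mr=81, t_mw=84):
    """EAT = total cycles spent / total number of memory accesses."""
    total = read_hits + write_hits + read_misses + write_misses
    cycles = (read_hits * t_hr + write_hits * t_hw
              + read_misses * t_mr + write_misses * t_mw)
    return cycles / total


# Hypothetical run: 6 read hits, 2 write hits, 1 read miss, 1 write miss.
print(hit_ratio(6, 2, 1, 1))               # 0.8
print(effective_access_time(6, 2, 1, 1))   # 17.9  (= (6 + 8 + 81 + 84) / 10)
```

Even with a hit ratio of 0.8, the two misses account for 165 of the 179 cycles in this example, which shows how strongly the miss penalty drives EAT.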

Method and details:

In this lab, you will study the behavior of cache memory systems and cache management systems with a
simulation program called CacheSim. The default cache system simulated in CacheSim is a 4-way, 16-set
set-associative cache. It consists of 64 slots, each containing four 16-bit data words and some bits for a
tag and replacement algorithm statistics. The cache uses a write-through write policy and a
least-recently-used (LRU) replacement algorithm. Memory is word-addressable, i.e., each main memory address
references one 16-bit word. Main memory consists of a maximum of 128K bytes (or 64K words). In other words,
a memory address requires 16 bits.

The simulator reads memory references from a default data file called 'pgmcode-small.txt'. Each line in the
file starts with a “W” to indicate a write or an “R” to indicate a read. Write lines are of the form:

W address word

where “address” is a 16-bit address in decimal and “word” is the 16-bit word in decimal to be written. Read
lines are of the form

R address

where “address” is the main memory address where the read needs to come from. The access time in the
cache system is given as follows.
1. If a referenced word is in the cache, it takes one cycle to read ($T_{h-r}$) and four cycles to write ($T_{h-w}$).
2. If a referenced word is not in the cache, it takes 81 cycles to read ($T_{m-r}$) and 84 cycles to write ($T_{m-w}$).

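
The trace format and the cycle costs above can be illustrated with a short Python sketch. It is not the CacheSim MATLAB code: the names `parse_line` and `access_cycles` are hypothetical, and the sketch assumes the fields on each line are separated by whitespace.

```python
# Illustrative sketch: reading one trace line and costing one access.
# Assumption: fields are whitespace-separated, e.g. "W 6 1234" or "R 32".

# Cycle cost indexed by (operation, hit?), taken from the two rules above.
CYCLES = {
    ("R", True): 1,     # read hit
    ("W", True): 4,     # write hit
    ("R", False): 81,   # read miss
    ("W", False): 84,   # write miss
}


def parse_line(line):
    """Return (op, address, word); word is None for a read line."""
    fields = line.split()
    op = fields[0].upper()
    address = int(fields[1])                      # decimal 16-bit address
    word = int(fields[2]) if op == "W" else None  # decimal 16-bit data word
    return op, address, word


def access_cycles(op, hit):
    """Look up the access time for a read or write, given hit or miss."""
    return CYCLES[(op, hit)]


print(parse_line("W 6 1234"))      # ('W', 6, 1234)
print(parse_line("R 32"))          # ('R', 32, None)
print(access_cycles("R", False))   # 81
```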
Figure 3 shows the information displayed by the simulator, most of which is self-explanatory. The “Tag”,
“Set”, and “Offset” fields are computed from the address of the current instruction in the data file.
“ICtr” stands for instruction counter. The cache memory entries (the stored data) are hidden because they
are not of concern in this simulation. The numbers in blue give the LRU information of the corresponding
cache entries. The color of the tag field in a cache entry indicates whether the data in the associated
cache memory entry is valid (red) or not (black).

Figure 3: A snapshot of what the simulator provides.

The Hong Kong Polytechnic University

Department of Electrical and Electronic Engineering

EIE3343 Computer Systems Principles

Laboratory Exercise 2: Cache system

Student Name: ___________________

Student No.: ___________________

Date: ___________________

1. Use the provided small data file ‘pgmcode-small.txt’ (50 read/write operations). Modify the filename in
the MATLAB program CacheSim.m if necessary (Line 145: fid = fopen('pgmcode-small.txt','r');). Trace
the simulation program to understand how the set-associative cache system operates with the default
parameter setting: a 4-way 16-set cache system. Complete Table 1.

Table 1: Tracing read/write operations (a 4-way 16-set cache system)


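ICtr | Address (HEX) | Tag Field (DEC) | Set Field (DEC) | Offset (Word Field, DEC) | Accumulated Number of Hits | Access Time (cycles)
0    | 0             | 0               | 0               | 0                        | 0                          | 0
5    | 6H            |                 |                 |                          |                            |
10   |               |                 |                 |                          |                            |
15   |               |                 |                 |                          |                            |
20   |               |                 |                 |                          |                            |
25   |               |                 |                 |                          |                            |
30   |               |                 |                 |                          |                            |
35   |               |                 |                 |                          |                            |
40   |               |                 |                 |                          |                            |
45   |               |                 |                 |                          |                            |
49   |               |                 |                 |                          |                            |
50   |               |                 |                 |                          |                            |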

Based on the results shown in Table 1, calculate HR and EAT:

Hit Ratio (HR) =


Effective Access Time (EAT) =

2. Run the simulation program with the small data file ‘pgmcode-small.txt’ and complete Tables 2-1 and 2-2.

Table 2-1

Cache Memory Organization        | ICtr | Address (HEX) | Tag Field (Binary) | Set Field (Binary) | Offset (Word Field, Binary)
4-way 8-set                      | 14   | 20            | 00000000001        | 000                | 00
2-way 16-set                     | 24   |               |                    |                    |
2-way 8-set                      | 34   |               |                    |                    |
Direct mapped 16 sets (one way)  | 39   |               |                    |                    |
Direct mapped 8 sets (one way)   | 44   |               |                    |                    |

Table 2-2
Cache system organization                    | 4-way 16-set | 4-way 8-set | 2-way 16-set | 2-way 8-set | Direct mapped 16 sets (one way) | Direct mapped 8 sets (one way)
Number of tag bits                           |              |             |              |             |                                 |
Number of set bits                           |              |             |              |             |                                 |
Number of offset (word) bits                 | 2            | 2           | 2            | 2           | 2                               | 2
Size of the cache memory (in bytes) for      |              |             |              |             |                                 |
  storing the data from the main memory      |              |             |              |             |                                 |
Hit Ratio (HR)                               |              |             |              |             |                                 |
Effective Access Time (EAT) (unit: clocks)   |              |             |              |             |                                 |

3. Run the simulation program with the large data file ‘pgmcode-large.txt’ (Line 415 of CacheSim.m) and
complete Table 3.

Table 3: Simulation results with the large data file ‘pgmcode-large.txt’ (about 50,000 read/write operations)

Case                                         | 1            | 2           | 3            | 4           | 5                               | 6
Cache system organization                    | 4-way 16-set | 4-way 8-set | 2-way 16-set | 2-way 8-set | Direct mapped 16 sets (one way) | Direct mapped 8 sets (one way)
Hit Ratio (HR)                               |              |             |              |             |                                 |
Effective Access Time (EAT) (unit: clocks)   |              |             |              |             |                                 |

Questions:

(1) Based on Table 3, comment on the effect of the size of cache memory on the HR when running a
relatively large program.

(2) Based on Table 3, comment on how the organization of cache memory affects the cache performance
(HR and EAT).

(3) What is the relationship between HR and EAT?

(4) Comment on how the size of a program affects the cache performance (HR and EAT) of a cache
system by comparing the results of the data file ‘pgmcode-small.txt’ with those of the data file
‘pgmcode-large.txt’. Why?

4. Assume that a w-way (w = 2, 4, or 8) set-associative cache memory (take Figure 1 as an example) is
used. Each set has three LRU bits, and each cache way (block) has one valid bit and one write-protect bit.
Derive the formulas for the size of the cache directory in bits ($S_d$), the size of the cache memory in
bits ($S_m$), and the total number of bits in the cache system ($S_c = S_d + S_m$) in terms of the number
of bits in the set field ($N_s$), the offset (byte) field ($N_o$), and the address ($N$). Assume that it is
a byte-addressable computer.

Hint:

The number of bits in the tag field = $N - N_s - N_o$.

The total number of bits for a set in the cache directory
= (The number of bits in the tag field + 1 valid bit + 1 write-protect bit) × The number of ways

There are three LRU bits for each set, not for each way. The LRU bits of a set identify which of
the blocks (ways) in the set is the LRU block. Thus, three LRU bits are enough to identify the LRU
block if w is not more than eight.

The size of the cache directory, $S_d$
= (The total number of bits for a set in the cache directory + 3 LRU bits) × The total number of sets

The size of the cache memory, $S_m$
= The total number of ways × The total number of sets in the cache × The total number of bits in a block
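
One way to check a derived formula is to evaluate the hint numerically. The Python sketch below does this under two assumptions that follow from the field widths on a byte-addressable machine: the number of sets is 2 to the power of the set-field width, and a block holds 2 to the power of the offset-field width bytes, i.e. eight times that many bits. The function name `cache_sizes` and the example parameters are hypothetical.

```python
# Illustrative sketch: evaluating the hint for given field widths.
# Assumptions: byte-addressable memory, 2**n_s sets, 2**n_o bytes per block.

def cache_sizes(n, n_s, n_o, ways):
    """Return (S_d, S_m, S_c) in bits for an N-bit address."""
    tag_bits = n - n_s - n_o
    sets = 2 ** n_s
    bits_per_set = (tag_bits + 1 + 1) * ways   # tag + valid + write-protect, per way
    s_d = (bits_per_set + 3) * sets            # plus 3 LRU bits per set
    block_bits = 8 * 2 ** n_o                  # 8 bits per byte in a block
    s_m = ways * sets * block_bits
    return s_d, s_m, s_d + s_m


# Hypothetical example: 16-bit address, 4 set bits, 3 offset bits, 4 ways.
print(cache_sizes(16, 4, 3, 4))   # (752, 4096, 4848)
```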

- End -
Lawrence Cheung
January 2024