
EG211: Computer architecture

Assignment 3: Cache: Marks 45


Deadline: Submit by July 3, 2022, 11:59 pm

Start early. No deadline extensions will be given.

General instructions on what to submit:

• Can be done in groups of 2 or 3 students

• Submit

◦ the code

◦ A report showing snapshots

• There will be a demo of the assignment

• Submit individual files and not compressed files

• All the files you submit should have the following format: <roll_numbers>_filename

• Only one person in the group will submit the assignment

• If you are getting an error and are unable to get the result, submit a snapshot of the error in the report

What should the report contain?

• Output graphs/tables

• Observations from experiments a, b, and c

Design:

This assignment should be done in Python using the nmigen library. Instructions for installing
nmigen and its syntax can be found here.

Sample code for adding two numbers can be found here.

For the caches assignment:

Your nmigen ports are:

Input signals: address.


Output signals: hit (hit = 1 if it is a cache hit else it will be 0).

Other than the address, if you want to provide custom inputs, such as the number of index bits,
the number of tag bits, etc., you can pass them when instantiating the class and receive them in
the constructor.

The cache can be implemented as an Array of Signals (refer to the nmigen documentation). Each
Signal in the Array can hold the tag bits and a valid bit. There is no need to store any data.

Addresses should be read from the given trace files and given as input from the
if __name__ == "__main__" block inside the process function (refer to the example code).

A similar design approach can also be used for the branch prediction question.

Question 1 – Caches [45 marks]


Problem statement:

a. Design a 4-way set-associative cache of size 512 kilobytes with a block size of 4 bytes.
Assume a 32-bit address. Figure out how many cache lines you need. (15 marks)

You need not implement a main memory, which means you do not need to store data in the
cache. (Implementing a 2^32-byte memory would be impossible!)

[ Hint: For reporting hit/miss, all you need to check is a tag match, and a valid bit, so there is no
need to have data in the cache ]
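The hint above can be cross-checked with a small pure-Python reference model that tracks only tags and validity. This is NOT the required nmigen design: the class name, method names, and the LRU replacement policy are assumptions (the assignment does not fix a replacement policy). It can serve to sanity-check the hit/miss counts your nmigen simulation reports.

```python
class RefCache:
    """Pure-Python N-way set-associative cache model: tags + validity only."""

    def __init__(self, size_bytes, block_bytes, ways):
        self.block_bits = block_bytes.bit_length() - 1      # log2(block size)
        num_sets = size_bytes // (block_bytes * ways)
        self.index_bits = num_sets.bit_length() - 1         # log2(number of sets)
        self.ways = ways
        # Each set is an ordered list of tags; front = most recently used.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, address):
        index = (address >> self.block_bits) & ((1 << self.index_bits) - 1)
        tag = address >> (self.block_bits + self.index_bits)
        lines = self.sets[index]
        if tag in lines:                 # hit: refresh LRU order
            lines.remove(tag)
            lines.insert(0, tag)
            return True
        lines.insert(0, tag)             # miss: fill, evict LRU line if full
        if len(lines) > self.ways:
            lines.pop()
        return False

# Geometry for part (a): 512 KB, 4-byte blocks, 4-way associative.
c = RefCache(512 * 1024, 4, 4)
print(c.index_bits)                      # 15 index bits -> 32768 sets
print(32 - c.index_bits - c.block_bits)  # 15 tag bits in a 32-bit address
```

With these parameters the 32-bit address splits into 15 tag bits, 15 index bits, and 2 block-offset bits, giving 131072 cache lines in total.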

Output: You need to report hit/miss rates of the cache for the input memory trace files (5 traces)
listed under "Inputs to your code" below.

b. Increase the cache size to 2048 kB and repeat the experiment. Note the change in hit/miss
rates. (10 marks)

Grading for a + b:

• The code carries 10 marks.


• The report carries 5 marks.
• The demo/viva carries 5 marks.
c. Keeping the cache size at 512 kB, vary the block size among 1 byte, 4 bytes, 8 bytes, and
16 bytes. Note that the number of cache lines reduces as the block size increases at a fixed
cache size. Repeat the experiment for all the trace files, and note the hit/miss rates.
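The line-count arithmetic for part (c) can be checked directly: at a fixed capacity, lines = cache size / block size, so doubling the block halves the line count. A quick sketch:

```python
# Number of cache lines at a fixed 512 KB capacity for each block size
# in part (c).
for block_bytes in (1, 4, 8, 16):
    lines = 512 * 1024 // block_bytes
    print(block_bytes, lines)   # 1 -> 524288, 4 -> 131072, 8 -> 65536, 16 -> 32768
```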

d. Vary the associativity from 1-way to 16-way for a fixed cache size of 512 kB, and plot the
variation of hit rates.

Grading for c+d:

• The code carries 15 marks.


• The report carries 5 marks.
• The demo/viva carries 5 marks.

Inputs to your code:

Use the memory trace files at this location

https://cseweb.ucsd.edu/classes/fa07/cse240a/proj1-traces.tar.gz as input.

The trace file will specify all the memory accesses/addresses that occur in a certain program.
Each line in the trace file will specify a new memory reference and has the following fields:

• Access Type: A single character indicating whether the access is a load ('l') or a store ('s').
You can ignore this field. For reporting hit/miss, it does not matter whether it is a Load/Store

• Address: A 32-bit integer (in unsigned hexadecimal format) specifying the memory address
that is being accessed. This is the only field you need.

• Instructions since last memory access: Indicates the number of instructions of any type that
executed since the last memory access (i.e. the one on the previous line in the trace).
For example, if the 5th and 10th instructions in the program's execution are loads, and there are
no memory operations between them, then the trace line for the second load has "4" in
this field. You can safely ignore this field.
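Since only the address field matters, each trace line can be parsed with a few lines of Python. A hedged sketch (the function name is my own):

```python
def read_addresses(path):
    """Return the 32-bit addresses from one trace file, ignoring the
    access-type and instruction-gap fields."""
    addresses = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 2:
                addresses.append(int(fields[1], 16))  # hex address field
    return addresses
```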

Output from the code:

• Hit rates or miss rates for all 5 traces for the 3 different experiments can be reported in
an Excel sheet or in the form of a table/graph in the report.

• Submit the graphs/table/excel with your observations in the report


Question 2: Branch Predictors [45 marks]
Introduction

In this assignment, you will explore the effectiveness of branch prediction. A binary
instrumentation tool is used to generate a trace of branches and their outcomes. Your task will
be to use this representative trace to evaluate the effectiveness of a few branch prediction
schemes seen in the class. To do this, you'll write a program that reads in the trace and
simulates different branch prediction schemes and different predictor sizes.

Trace and Trace Format

The trace given to you is a subset of 16 million conditional branches. As unconditional branches
are always taken, they are excluded from the trace. Each line of the trace file has two fields.
Below are the first four lines of the trace file:

3086629576 T

3086629604 T

3086629599 N

3086629604 T

The first field is the address of the branch instruction (in decimal). You will need to convert these
addresses to a 32-bit binary number to proceed with this assignment. The second field is the
character "T" or "N" for branch taken or not taken. The trace file can be downloaded from here.

Question 2a [15 marks]

We are going to look at the simple static branch prediction policies of "always predict taken" and
"always predict not taken". Write a program to read in the trace and calculate the mis-prediction
rate (that is, the percentage of conditional branches that were mis-predicted) for these two
simple schemes.
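Since each static policy mis-predicts exactly the branches whose outcome disagrees with it, the computation reduces to counting 'T' outcomes. A minimal sketch (the function name is an assumption):

```python
def static_mispredict_rates(outcomes):
    """outcomes: list of 'T'/'N' characters from the trace."""
    taken = sum(1 for o in outcomes if o == 'T')
    total = len(outcomes)
    # "Always taken" mis-predicts every N; "always not taken" every T.
    return (100.0 * (total - taken) / total,   # always-taken rate
            100.0 * taken / total)             # always-not-taken rate

# The four sample trace lines shown above (T T N T):
print(static_mispredict_rates(['T', 'T', 'N', 'T']))  # (25.0, 75.0)
```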

The following are needed for the report:

• Results and output screenshots


• Which of these two policies is more accurate?

Grading:

• The code carries 5 marks.


• The report carries 5 marks.
• The demo/viva carries 5 marks.
Question 2b [30 marks]

The simplest dynamic branch direction predictor is an array of 2^n two-bit counters. It is advised
to follow the notation discussed in class: strongly taken (00), weakly taken (01), weakly not taken
(10), and strongly not taken (11).

Prediction: To make a prediction, the predictor selects a counter from the table using the lower-
order n bits of the instruction's address (its program counter value). The direction prediction is
made based on the value of the counter.

Training: After each branch (correctly predicted or not), the hardware increments or decrements
the corresponding counter to bias the counter toward the actual branch outcome (the outcome
given in the trace file).

Initialization: Although initialization doesn't affect the results in any significant way, your code
should initialize the predictor to "strongly taken".

Your task is to analyze the impact of predictor size on prediction accuracy. Write a program to
simulate the two-bit predictor. Use your program to simulate varying sizes of the predictor.
Generate data for predictors with 2^2, 2^3, 2^4, 2^5, ..., 2^20 counters (the address is a 32-bit
binary number). These sizes correspond to predictor index sizes of 2 bits, 3 bits, 4 bits, 5 bits, ...,
20 bits. Generate a line plot of the data using MS Excel or some other graphing program. On the y-
axis, plot "percentage of branches mis-predicted" (a metric in which smaller is better). On the x-
axis, plot the log of the predictor size (that is, the number of index bits).
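The prediction, training, and initialization rules above, with the class encoding (00 = strongly taken ... 11 = strongly not taken), can be sketched as follows; the function name and the (address, outcome) input format are assumptions:

```python
def two_bit_mispredict_rate(branches, n):
    """branches: list of (address, 'T' or 'N'); n: number of index bits."""
    counters = [0] * (1 << n)            # 0 = strongly taken (initialization)
    mispredicts = 0
    for address, outcome in branches:
        i = address & ((1 << n) - 1)     # low-order n bits of the PC
        predict_taken = counters[i] < 2  # 00/01 predict taken; 10/11 not taken
        if predict_taken != (outcome == 'T'):
            mispredicts += 1
        # Train: saturate the counter toward the actual outcome.
        if outcome == 'T':
            counters[i] = max(counters[i] - 1, 0)
        else:
            counters[i] = min(counters[i] + 1, 3)
    return 100.0 * mispredicts / len(branches)
```

Sweeping n from 2 to 20 over the full trace then gives the data points for the plot.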

The following are needed for the report:

• Results and output screenshots


• The line plot described above
• What is the best mis-prediction rate obtained in the analysis carried out?
• How large must the predictor be to reduce the number of mis-predictions by
approximately half as compared to the better of "always taken" and "always not taken"?
Give the predictor size both in terms of number of counters as well as bits.
• At what point does the performance of the predictor max out? That is, how large does
the predictor need to be before it captures almost all of the benefit of a much larger
predictor?

Grading:

• The code carries 10 marks.


• The report carries 10 marks; each item listed above carries 2 marks.
• The demo/viva carries 10 marks.
