
 

School of Engineering
Department of Computer and Communication Engineering
Fall 2015 – 2016

Course: CENG 400, Computer Organization and Design

Instructors: Dr. Zaher Merhi, Dr. Ali Ghouwayel, Dr. Ayman Khalil, Dr. Ali Bazzi, Dr. Abdelmehsen Ahmad, Dr. Ousama Tahan, Dr. Reda Shbib

Date: Thursday, February 4, 2016 Time: 09:00 to 11:00

Duration: 120 minutes

Student Name: _________________________ Section________ Campus __________

There are three questions in this booklet, each with several parts. Please answer all parts of the three questions to the best of your ability.
Marking Scheme:

Question      Weight        Mark
Question 1    20 points
Question 2    40 points
Question 3    40 points
Total         100 points
 
1. This booklet contains 10 pages, including this one, plus a 2-page MIPS Reference Sheet (not numbered). Make sure you have all of these pages.
2. Closed-book examination. Calculators are allowed.
3. The penalty for cheating is an “F” in the exam.
Good Luck
 

Page 1 of 10 
 
Question 1: Single Cycle Data Path (20 points)

Consider the data path below for a single-cycle 32-bit MIPS processor. Assume that we are executing the following instruction:

- add $t0, $t1, $t2

- Note that the PC and the contents of registers $t0, $t1, and $t2 are given in the bottom left corner of the figure below.

[Figure: single-cycle MIPS data path, with the lines of interest labeled by number]

Values before executing add:

$t0 = 0x00018FCA
$t1 = 0x000076FC
$t2 = 0x0000A5B0
PC = 0x000016F0

 
a) Fill in the table below with the contents of the lines indicated by the numbers on the figure (12 points). Write down all bits in binary or hex.

Line    Content
1
2
3
4
5
6
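
The figure itself is not reproduced here, so which value each numbered line carries cannot be stated. As a hedged sketch only, the Python snippet below computes three values that usually appear on a single-cycle data path for this instruction: the 32-bit instruction word, PC + 4, and the ALU result. The register numbers ($t0 = 8, $t1 = 9, $t2 = 10) and the R-type field layout follow the standard MIPS convention. The helper name `encode_r_type` is hypothetical.

```python
# Standard MIPS register numbers for the registers used in the question.
REG_NUM = {"$t0": 8, "$t1": 9, "$t2": 10}


def encode_r_type(rs, rt, rd, shamt=0, funct=0x20, opcode=0):
    """Pack the R-type fields (opcode|rs|rt|rd|shamt|funct) into 32 bits.

    funct 0x20 is the function code of `add`.
    """
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# Values given in the question (before executing add).
pc = 0x000016F0
t1 = 0x000076FC
t2 = 0x0000A5B0

# add $t0, $t1, $t2  ->  rd = $t0, rs = $t1, rt = $t2
instruction = encode_r_type(rs=REG_NUM["$t1"], rt=REG_NUM["$t2"], rd=REG_NUM["$t0"])
pc_plus_4 = (pc + 4) & 0xFFFFFFFF       # output of the PC adder
alu_result = (t1 + t2) & 0xFFFFFFFF     # value written back to $t0

print(f"instruction = 0x{instruction:08X}")   # 0x012A4020
print(f"PC + 4      = 0x{pc_plus_4:08X}")     # 0x000016F4
print(f"ALU result  = 0x{alu_result:08X}")    # 0x00011CAC
```

Matching these values to the numbered lines still requires the original figure.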

b) For each of the following instructions, fill in the corresponding table (8 points).

add   

RegDst Branch
MemRead MemtoReg
ALUOp MemWrite
ALUSrc RegWrite
 
sw  

RegDst Branch
MemRead MemtoReg
ALUOp MemWrite
ALUSrc RegWrite

slt 

RegDst Branch
MemRead MemtoReg
ALUOp MemWrite
ALUSrc RegWrite

beq  

RegDst Branch
MemRead MemtoReg
ALUOp MemWrite
ALUSrc RegWrite
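
As a reference sketch, the table below encodes the control signals in Python, assuming the textbook single-cycle control unit with a 2-bit ALUOp (00 for address calculation, 01 for branch comparison, 10 for R-type). The control unit in the booklet's figure may differ. "X" marks a don't-care value, and the name `CONTROL` is hypothetical.

```python
# Control signals per instruction, assuming the textbook single-cycle design.
# "X" means the signal is a don't-care because no register is written.
R_TYPE = {
    "RegDst": 1, "ALUSrc": 0, "MemtoReg": 0, "RegWrite": 1,
    "MemRead": 0, "MemWrite": 0, "Branch": 0, "ALUOp": "10",
}

CONTROL = {
    "add": dict(R_TYPE),   # R-type: the funct field selects the ALU operation
    "slt": dict(R_TYPE),   # same main-control outputs as add
    "sw": {
        "RegDst": "X", "ALUSrc": 1, "MemtoReg": "X", "RegWrite": 0,
        "MemRead": 0, "MemWrite": 1, "Branch": 0, "ALUOp": "00",
    },
    "beq": {
        "RegDst": "X", "ALUSrc": 0, "MemtoReg": "X", "RegWrite": 0,
        "MemRead": 0, "MemWrite": 0, "Branch": 1, "ALUOp": "01",
    },
}

for name, signals in CONTROL.items():
    print(name, signals)
```

Under this assumption add and slt share the same main-control outputs, because the ALU control unit tells them apart using the funct field.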

 
Question 2: MIPS Pipeline (40 points)

a) Consider the following MIPS program, which runs on a 5-stage pipelined MIPS processor:
lw $t0, 0($s2)
sw $t0, 0($s1)
add $t0, $t0, $t1
sw $t0, 4($s1)
addi $s1, $s1, -4

i. Assume that no forwarding is employed and fill in the table below. Indicate stalls by writing ST in the clock cycle (C) where they occur. (10 points)

Clock cycle
Instruction C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 C16 C17 C18
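
The no-forwarding case can be checked with a small scheduling sketch in Python. It rests on two assumptions not stated in the question: the register file is written in the first half of a cycle and read in the second half, so a consumer may be in ID during the same cycle its producer is in WB, and instructions never pass one another. The names `PROGRAM` and `schedule_no_forwarding` are hypothetical.

```python
# Each entry: (instruction text, destination register or None, source registers)
PROGRAM = [
    ("lw   $t0, 0($s2)",   "$t0", ["$s2"]),
    ("sw   $t0, 0($s1)",   None,  ["$t0", "$s1"]),
    ("add  $t0, $t0, $t1", "$t0", ["$t0", "$t1"]),
    ("sw   $t0, 4($s1)",   None,  ["$t0", "$s1"]),
    ("addi $s1, $s1, -4",  "$s1", ["$s1"]),
]


def schedule_no_forwarding(program):
    """Return (ID cycle of each instruction, total clock cycles)."""
    last_write = {}   # register -> WB cycle of its most recent producer
    id_cycles = []
    prev_id = 1       # makes the first instruction decode in cycle 2
    for _, dest, sources in program:
        # In-order issue: decode no earlier than one cycle after the previous ID.
        earliest = prev_id + 1
        # No forwarding: wait until every source register has been written back.
        for reg in sources:
            earliest = max(earliest, last_write.get(reg, 0))
        id_cycles.append(earliest)
        if dest is not None:
            last_write[dest] = earliest + 3   # ID -> EX -> MEM -> WB
        prev_id = earliest
    total_cycles = id_cycles[-1] + 3          # last instruction reaches WB
    return id_cycles, total_cycles


id_cycles, total = schedule_no_forwarding(PROGRAM)
stalls = total - (len(PROGRAM) + 4)           # ideal pipeline needs n + 4 cycles
print(id_cycles, total, stalls)               # [2, 5, 6, 9, 10] 13 4
```

Under these assumptions each sw waits two cycles for its producer, giving 13 cycles in total. A register file without split-cycle access would add one more stall per dependency.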

 
ii. Assume now that forwarding is employed and fill in the table below. Indicate stalls by writing ST in the clock cycle (C) where they occur, and forwarding by an arrow (→). (12 points)
 

Clock cycle
Instruction C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15 C16 C17 C18

 
b) Consider the following MIPS program:

Loop: lw $t0, 0($s2)
      sw $t0, 0($s1)
      add $t0, $t0, $t1
      sw $t0, 4($s1)
      addi $s1, $s1, -4
      addi $s2, $s2, -4
      bne $s1, $0, Loop

i. Assume that the above program runs on a single-issue 5-stage pipelined processor where forwarding and stalls are employed. Calculate the CPI (clock cycles per instruction). Detail your answer. (6 points)

ii. Assume that the above program runs on a two-issue (VLIW) pipelined processor where forwarding and stalls are employed but NO loop unrolling is allowed. Calculate the CPI. Detail your answer. Also assume that, within the same 2-issue packet, a data transfer instruction can be combined only with an ALU or branch instruction. (6 points)
Note: the number of rows in the table is unrelated to the correct solution.

ALU or Branch Data Transfer Clock Cycles

 
iii. Assume that the above program runs on a two-issue (VLIW) pipelined processor where forwarding and stalls are employed and loop unrolling is now allowed; however, you are only allowed to unroll the loop twice. Calculate the CPI. Detail your answer. Also assume that, within the same 2-issue packet, a data transfer instruction can be combined only with an ALU or branch instruction. (6 points)
Note: the number of rows in the table is unrelated to the correct solution.

ALU or Branch Data Transfer Clock Cycles

 
Question 3: Memory Organization and Design (40 points)

a) Consider a direct-mapped cache with 32-bit, word-addressable memory addresses. Assume a 2-word block and a cache size of 8 locations.
i. What are the numbers of tag bits and index bits? (2 points)

ii. Starting with an empty cache, assume the following memory address references (word addresses), in the order given:
 
Address 3 180 43 2 191 88 190 14 181 44 186 253
 

1. Fill in the table below. (6 points)

Address       3 180 43 2 191 88 190 14 181 44 186 253
Block number
Hit/Miss

2. List the final state of the cache. List only valid entries (those containing data) by filling in the following table, where the data field is given as MEM[address], for example MEM[16]. (4 points)
 

Valid Bit Index Tag Data


0
1
2
3
4
5
6
7
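
The address mapping for this cache can be sketched in Python as follows. The sketch assumes that "8 locations" means 8 cache lines of 2 words each (consistent with indices 0 to 7 in the table above), so that block number = address // 2, index = block number mod 8, and tag = block number // 8. The function name `simulate_direct_mapped` is hypothetical.

```python
def simulate_direct_mapped(addresses, words_per_block=2, num_lines=8):
    """Simulate a direct-mapped cache on a sequence of word addresses.

    Returns (trace, lines): trace holds one tuple
    (address, block number, index, tag, "Hit"/"Miss") per reference,
    and lines maps each valid index to the tag it holds at the end.
    """
    lines = {}   # index -> tag; an absent index means the line is invalid
    trace = []
    for addr in addresses:
        block = addr // words_per_block
        index = block % num_lines
        tag = block // num_lines
        hit = lines.get(index) == tag
        lines[index] = tag            # on a miss the block replaces the old one
        trace.append((addr, block, index, tag, "Hit" if hit else "Miss"))
    return trace, lines


# Field widths for a 32-bit word address under the same assumptions.
OFFSET_BITS = 1                              # 2 words per block
INDEX_BITS = 3                               # 8 lines
TAG_BITS = 32 - INDEX_BITS - OFFSET_BITS     # 28

REFERENCES = [3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253]
trace, final_lines = simulate_direct_mapped(REFERENCES)
for row in trace:
    print(row)
print(sorted(final_lines.items()))
```

With these assumptions the references to addresses 2, 190, and 181 hit and the other nine miss. If "8 locations" were read as 8 words (4 lines) instead, the index and tag values would change.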
 

 
b) A processor runs at 2 GHz and has a CPI of 1.2, not including the stall cycles due to cache misses. Load and store instructions account for 30% of all instructions. The processor has an I-cache and a D-cache. The hit time is 1 clock cycle. The I-cache has a 2% miss rate. The D-cache has a 5% miss rate on load and store instructions. The miss penalty is 50 ns, which is the time to access and transfer a cache block between main memory and the processor. (18 points)
i) What is the average memory access time for instruction access in clock cycles?
(4 points)

ii) What is the average memory access time for data access in clock cycles? (2
points)

iii) Given that the CPI is 1.2 with no stalls (i.e., a perfect cache), calculate the new CPI, taking into account the stalls caused by I-cache and D-cache misses. (4 points)

iv) You are considering replacing the 2 GHz CPU with one that runs at 4 GHz, but is
otherwise identical. How much faster does the new processor run? (8 points)

Assume that hit time in the I-cache and the D-cache is 1 clock cycle in the new
processor, and the time to access and transfer a cache block between main memory
and the processor is still 50 ns.
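
The arithmetic behind parts i) to iv) can be sketched in Python using the usual relations AMAT = hit time + miss rate × miss penalty and CPI = base CPI + instruction-miss stalls + data-miss stalls per instruction. The function names are hypothetical. The sketch assumes one instruction fetch per instruction and a miss penalty in cycles equal to 50 ns times the clock rate in GHz.

```python
def amat_cycles(hit_time, miss_rate, penalty_cycles):
    """Average memory access time in clock cycles."""
    return hit_time + miss_rate * penalty_cycles


def effective_cpi(clock_ghz, base_cpi=1.2, penalty_ns=50,
                  i_miss=0.02, d_miss=0.05, mem_fraction=0.30):
    """CPI including stall cycles from I-cache and D-cache misses."""
    penalty = penalty_ns * clock_ghz                 # ns * cycles/ns = cycles
    instr_stalls = i_miss * penalty                  # every instruction is fetched
    data_stalls = mem_fraction * d_miss * penalty    # only loads and stores
    return base_cpi + instr_stalls + data_stalls


penalty_2ghz = 50 * 2                                # 100 cycles at 2 GHz
print(amat_cycles(1, 0.02, penalty_2ghz))            # instruction access
print(amat_cycles(1, 0.05, penalty_2ghz))            # data access

cpi_2ghz = effective_cpi(2)
cpi_4ghz = effective_cpi(4)

# Time per instruction in ns = CPI / clock rate in GHz.
speedup = (cpi_2ghz / 2) / (cpi_4ghz / 4)
print(cpi_2ghz, cpi_4ghz, round(speedup, 3))
```

Under these assumptions the miss penalty doubles from 100 to 200 cycles when the clock doubles, so the CPI rises from about 4.7 to about 8.2 and the 4 GHz processor is only about 1.15 times faster.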

 
c) Assume a cache with 16K blocks, 4 words per block, and a 32-bit byte address. Find the total number of sets and the total number of tag bits needed for the entire cache if the cache is: (6 points)
i) 2-way set associative

ii) 4-way set associative

iii) Fully associative
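
The set and tag calculation can be sketched in Python as follows, assuming 4-byte words (so a 4-word block has a 4-bit byte offset) and counting one tag per block. The function name `tag_storage` is hypothetical.

```python
def tag_storage(num_blocks, ways, words_per_block=4,
                bytes_per_word=4, address_bits=32):
    """Return (number of sets, tag bits per block, total tag bits).

    Assumes all sizes are powers of two and a byte-addressed memory.
    """
    num_sets = num_blocks // ways
    # log2 of a power of two via bit_length.
    offset_bits = (words_per_block * bytes_per_word).bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    return num_sets, tag_bits, tag_bits * num_blocks


BLOCKS = 16 * 1024
for label, ways in [("2-way", 2), ("4-way", 4), ("fully associative", BLOCKS)]:
    sets, tag_bits, total = tag_storage(BLOCKS, ways)
    print(f"{label}: sets={sets}, tag bits per block={tag_bits}, total tag bits={total}")
```

Each doubling of associativity halves the number of sets, which removes one index bit and adds one bit to every tag. A fully associative cache is the limiting case with a single set and no index bits.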


 

 
