Midterm Test
This question paper has 9 pages including this cover page and attachments
This study source was downloaded by 100000851289071 from CourseHero.com on 10-13-2022 20:16:55 GMT -05:00
https://www.coursehero.com/file/25689555/EE6304-exam-2017-midterm-spdf/
Student Name: ________________________________ Student ID: ____________________
Question 1
(a) Enumerate and describe the 3 main steps that every microprocessor executes.
(2 marks)
Solution:
Fetch → Decode → Execute.
The processor fetches an instruction from memory, decodes the fetched instruction, and then executes it.
(b) A program spends 75% of its time doing multiply instructions. If the multiplier is sped up by
3x, how much faster does the application run?
Solution:
The program is 2x faster. (75% of the time is 3x faster, so that 75% now takes up 1/3 of that
time, or only 25% of the original execution time. We have effectively eliminated 50% of the
total execution time so the program is 2x faster.)
Tnew = Told x (0.25 + 0.75/3) = 0.5 Told → Speedup = Told/Tnew = 2
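The same calculation can be written as a small function. This is a minimal sketch of Amdahl's law; the function name `amdahl_speedup` and its parameter names are illustrative and not part of the exam.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time is sped up by `factor`."""
    # The untouched part keeps its share; the improved part shrinks by `factor`.
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# 75% of the time is spent in multiplies, and the multiplier becomes 3x faster.
speedup = amdahl_speedup(0.75, 3)
print(speedup)  # 2.0
```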
(c) How many memory accesses does the following code require? Explain them. (INC=increment
instruction)
(2 marks)
INC (r1)
Solution:
This instruction involves two memory accesses: the first reads the data stored in the data segment, and
the second stores the incremented data back to the same address in the data segment.
(d) What do VLIW, superscalar, and array processing concepts have in common?
(2 marks)
Solution:
All three execute multiple operations per cycle.
(e) A microprocessor manufacturer decides to advertise its newest chip based only on the
metric IPC (Instructions per cycle). Is this a good metric? Why or why not? (Use less than
20 words)
(2 marks)
Solution:
No, because the metric does not take into account frequency or number of executed
instructions, both of which affect execution time.
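To make the point concrete, here is a minimal sketch comparing two hypothetical chips. The instruction count, IPC values, and clock rates are invented for illustration and do not come from the exam.

```python
def exec_time(instr_count, ipc, clock_hz):
    """Execution time in seconds = instructions / (instructions per cycle * cycles per second)."""
    return instr_count / (ipc * clock_hz)


# Hypothetical chips running the same 1-billion-instruction program.
chip_x = exec_time(1e9, ipc=4.0, clock_hz=0.5e9)  # higher IPC, slow clock
chip_y = exec_time(1e9, ipc=2.0, clock_hz=2.0e9)  # lower IPC, fast clock

print(chip_x, chip_y)
```

Under these assumed numbers the chip with the higher IPC takes twice as long, which is why IPC alone says little about execution time.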
(f) If you were the chief architect for another company and were asked to design a chip to
compete based solely on this metric, what important design decision would you make (in less
than 20 words)?
(2 marks)
Solution:
Make the cycle time as long as possible and process many instructions per cycle.
(g) Assuming that the stack starts out empty, write a stack-based program that computes
((10x8)+(4-7))^2
(4 marks)
Solution:
Because the processor does not provide an instruction to compute the square of a value, you need
to compute (10x8)+(4-7) twice.
PUSH 10
PUSH 8
MUL
PUSH 4
PUSH 7
SUB
ADD (at this point, the stack contains the first result)
PUSH 10
PUSH 8
MUL
PUSH 4
PUSH 7
SUB
ADD (at this point, the stack contains the second result)
MUL
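The program can be checked with a tiny stack-machine interpreter. This is a sketch under one assumption that the exam does not spell out: a binary operation pops the top value as its right operand and the next value as its left operand, so PUSH 4, PUSH 7, SUB yields 4 - 7. The function name `run` is illustrative.

```python
def run(program):
    """Execute a list of stack instructions and return the final stack."""
    stack = []
    for line in program:
        op, *args = line.split()
        if op == "PUSH":
            stack.append(int(args[0]))
            continue
        right = stack.pop()  # top of stack (assumed right operand)
        left = stack.pop()   # next value (assumed left operand)
        if op == "ADD":
            stack.append(left + right)
        elif op == "SUB":
            stack.append(left - right)
        elif op == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


# (10x8)+(4-7), computed twice, then multiplied together.
half = ["PUSH 10", "PUSH 8", "MUL", "PUSH 4", "PUSH 7", "SUB", "ADD"]
result = run(half + half + ["MUL"])
print(result)  # [5929]
```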
Question 2.
(a) The performance of two computers needs to be benchmarked. For this purpose, a set of different
benchmark programs is used. Table I shows the characteristics of the benchmarks in terms of
number of instructions. Compute the SPEC rating and the speed-up factor of the fastest machine
over the other. Machine A runs at 1.0 GHz and has an average Cycles Per Instruction (CPI) of 2.5.
Machine B runs at 1.3 GHz and its CPI is 3.
Machine B:
Bench 1 Exec time = (10*3)/1300 = 0.0231 (s)
Bench 2 Exec time = (15*3)/1300 = 0.0346 (s)
Bench 3 Exec time = (35*3)/1300 = 0.0808 (s)
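The execution times for both machines can be tabulated with a short script. This is a sketch under an assumption: the instruction counts (10, 15, and 35 million) are inferred from the Machine B lines above, since Table I is not reproduced here. The SPEC rating is left out because it needs reference times that are also not available. All names are illustrative.

```python
# Instruction counts in millions; inferred from the Machine B lines (assumption).
instr_millions = {"Bench 1": 10, "Bench 2": 15, "Bench 3": 35}

# Clock rate in MHz and average CPI, as given in the question.
machines = {"A": {"mhz": 1000, "cpi": 2.5}, "B": {"mhz": 1300, "cpi": 3.0}}


def exec_time_s(instr_m, cpi, mhz):
    """(millions of instructions * CPI) / MHz gives seconds."""
    return instr_m * cpi / mhz


times = {
    name: {bench: exec_time_s(n, m["cpi"], m["mhz"]) for bench, n in instr_millions.items()}
    for name, m in machines.items()
}

# Speed-up of Machine B over Machine A on the total run time.
speedup_b_over_a = sum(times["A"].values()) / sum(times["B"].values())

for name, per_bench in times.items():
    print(name, {bench: round(t, 4) for bench, t in per_bench.items()})
print(round(speedup_b_over_a, 4))
```

Because both machines execute the same instruction counts, the speed-up reduces to (2.5/1000)/(3/1300), roughly 1.08 in favour of Machine B.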
(b) Given a processor with the following Instructions, encoded using 16-bits.
Instruction | Opcode | 16-bit encoding | Function
MOV r1, d | 0000 | Opcode (4 bits), Destination register (4 bits), Address (8 bits) | R1 ← d
MOV d, r1 | 0001 | Opcode (4 bits), Source register (4 bits), Address (8 bits) | d ← R1
ADD r1,r2,r3 | 0010 | Opcode (4 bits), Destination register (4 bits), Source register (4 bits), Source register (4 bits) | R1 ← r2+r3
MOV r1,#c | 0011 | Opcode (4 bits), Destination register (4 bits), Constant (8 bits) | R1 ← c
SUB r1,r2,r3 | 0100 | Opcode (4 bits), Destination register (4 bits), Source register (4 bits), Source register (4 bits) | R1 ← r2-r3
JMP r1,X | 1010 | Opcode (4 bits), Source register (4 bits), Offset (8 bits) | if (r1==0) PC ← PC+offset
Assume that you want to augment this ISA to support 20 additional and unique instructions (e.g.,
MUL, AND, OR, etc.), while still keeping the instruction encoding as 16 bits. How will the
execution and encoding of the ADD instruction be affected? (Other instructions could be affected
too, but you just need to comment on how the ADD instruction will be impacted.)
(6 marks)
Solution:
The ISA would then have 26 instructions, more than the 16 that a 4-bit opcode can encode, so the
number of bits dedicated to the opcode will need to increase from 4 to 5. As such, there will
be one fewer bit to encode the number of the destination register or the number of one of the source
registers. Given this change:
• An extra bit will be needed to specify the opcode
• The result of the ADD must be written to registers 0 through 7 OR
• One of the source registers will only be allowed to be register 0 through 7.
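The bit budget behind this answer can be checked with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative.

```python
WORD_BITS = 16
existing_instructions = 6   # rows in the instruction table
added_instructions = 20

total = existing_instructions + added_instructions  # 26 distinct opcodes

# Smallest number of bits that can encode `total` distinct values.
opcode_bits = (total - 1).bit_length()

# ADD has three register fields sharing whatever the opcode leaves over.
register_bits = WORD_BITS - opcode_bits
# Two fields keep 4 bits; the third gets the remainder.
short_field_bits = register_bits - 2 * 4
reachable_registers = 2 ** short_field_bits

print(opcode_bits, register_bits, short_field_bits, reachable_registers)  # 5 11 3 8
```

The shortened field can address only 8 registers, which matches the "registers 0 through 7" restriction in the solution.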
Question 3
(a) Assume that to spell check a large file, 820,000,000 instructions are needed. The instructions
in the program are broken down into 4 different classes, and each class requires N clock cycles
to execute. Specific information is given in the table below.
Instruction Class | Clock cycles per instruction | Number of Instructions
Branch | 3 | 150,000,000
Store | 4 | 185,000,000
Load | 5 | 260,000,000
ALU | 4 | 225,000,000
If the total execution time for this program is found to be 1.57 seconds, what is the clock cycle
time of the computer on which it was run? Show your calculations.
(6 marks)
Solution:
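Applying the CPU time formula:
CPU time = time/program = (instructions/program) x (cycles/instruction) x (time/cycle) = 1.57 s
\[
820{,}000{,}000 \times \left( 3 \cdot \frac{150{,}000{,}000}{820{,}000{,}000} + 4 \cdot \frac{185{,}000{,}000}{820{,}000{,}000} + 5 \cdot \frac{260{,}000{,}000}{820{,}000{,}000} + 4 \cdot \frac{225{,}000{,}000}{820{,}000{,}000} \right) \times N = 1.57\,\text{s}
\]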
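Solving this equation for N, which here stands for the clock cycle time, is a matter of dividing the execution time by the total cycle count. A minimal sketch using the figures from the table; the variable names are illustrative.

```python
# (cycles per instruction, number of instructions) for each class, from the table.
classes = {
    "Branch": (3, 150_000_000),
    "Store": (4, 185_000_000),
    "Load": (5, 260_000_000),
    "ALU": (4, 225_000_000),
}

total_instructions = sum(count for _, count in classes.values())
total_cycles = sum(cpi * count for cpi, count in classes.values())

cpu_time_s = 1.57
cycle_time_s = cpu_time_s / total_cycles  # N in the equation

print(total_instructions)            # 820000000
print(total_cycles)                  # 3390000000
print(round(cycle_time_s * 1e9, 3))  # cycle time in nanoseconds
```

With 3.39 billion cycles in 1.57 s, the cycle time comes out to roughly 0.463 ns.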
(b) Assume that as part of the 820,000,000 instruction spell check, 25% of all load instructions are
immediately followed by an ALU-type instruction that uses the data that was just loaded. To
speed this program up, you are thinking about adding a new type of instruction: an ALU
instruction where one of the source operands is a value from memory. In particular:
• This new instruction will replace the previous 2 instruction sequence
• It will take 7 clock cycles
Will this change offer any speedup over the original design? If so, how much?
You may assume that the clock rate does not change and your answer to this question does not
depend on your answer to question 3(a)
(10 marks)
Solution:
We need to apply the CPU time formula again, but first need to calculate the new number of load,
ALU and "new type" instructions:
The # of branches remains constant | 150,000,000
The # of stores remains constant | 185,000,000
The new # of loads is (260,000,000 x 0.75) | 195,000,000
The new # of ALU is (225,000,000 - 65,000,000) | 160,000,000
The number of new instructions is (260,000,000 x 0.25) | 65,000,000
Total | 755,000,000
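These counts can be plugged into the CPU time formula to compare total cycle counts. This is a minimal sketch that uses the per-class cycle counts from the table in 3(a) and 7 cycles for the new instruction. Since the clock rate is unchanged, the speedup is the ratio of cycle counts. The variable names are illustrative.

```python
# (cycles per instruction, number of instructions) before the change.
original = {
    "Branch": (3, 150_000_000),
    "Store": (4, 185_000_000),
    "Load": (5, 260_000_000),
    "ALU": (4, 225_000_000),
}

fused = 260_000_000 // 4  # 25% of loads are followed by a dependent ALU instruction

# Each fused pair removes one load and one ALU instruction and adds one 7-cycle instruction.
modified = {
    "Branch": (3, 150_000_000),
    "Store": (4, 185_000_000),
    "Load": (5, 260_000_000 - fused),
    "ALU": (4, 225_000_000 - fused),
    "Load+ALU": (7, fused),
}


def total_cycles(mix):
    return sum(cpi * count for cpi, count in mix.values())


old_cycles = total_cycles(original)
new_cycles = total_cycles(modified)
speedup = old_cycles / new_cycles

print(old_cycles, new_cycles, round(speedup, 4))
```

With these numbers the new design needs 3.26 billion cycles against 3.39 billion for the original, a speedup of about 1.04.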
Question 4
(a) Given an unpipelined processor with a 10 ns cycle time, and pipeline latches with 0.5 ns latency,
what are the cycle times of pipelined versions of the processor with 2, 4, 8 and 16 stages if the
datapath logic is evenly distributed among the pipeline stages? Also, what is the latency of
each of the pipelined versions of the processor?
(10 marks)
Solution:
\[
\text{Cycle Time}_{\text{pipelined}} = \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Number of Pipeline Stages}} + \text{Pipeline Latch Latency}
\]
Applying this formula gives cycle times of 5.5, 3, 1.75, and 1.125 ns, showing the diminishing returns
of pipelining as the pipeline latch latency becomes a significant part of the overall cycle time.
To compute the latency of each processor, simply multiply the cycle time by the number of
pipeline stages, giving latencies of 11, 12, 14, and 18ns.
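Both sets of numbers follow from the cycle time formula and can be reproduced with a short loop. A minimal sketch; the function and variable names are illustrative.

```python
def pipelined_cycle_time(unpipelined_ns, stages, latch_ns):
    """Logic delay split evenly across the stages, plus one latch delay per stage."""
    return unpipelined_ns / stages + latch_ns


results = {}
for stages in (2, 4, 8, 16):
    cycle_ns = pipelined_cycle_time(10.0, stages, 0.5)
    latency_ns = cycle_ns * stages  # an instruction passes through every stage
    results[stages] = (cycle_ns, latency_ns)
    print(stages, cycle_ns, latency_ns)
```

Going from 2 to 16 stages cuts the cycle time from 5.5 ns to 1.125 ns, but the latency of a single instruction grows from 11 ns to 18 ns because each extra stage adds another latch delay.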
(b) How long would the given code sequence and the renamed sequence take to issue on an out-of-
order superscalar processor with 4 execution units, each of which can execute any operation?
Assume all instructions have latencies of 1 cycle, use the greedy scheduling assumption, and
assume that the processor’s instruction window is large enough to cover the entire code
sequence.
(10 marks)
Solution:
Without register renaming, the sequence takes 5 cycles to issue, because instructions with a WAR
dependency can issue in the same cycle, but not out of order:
With register renaming, the sequence can be issued in 2 cycles, because we can issue instructions
that originally had WAR dependencies out of order:
Cycle 1: LD hw1, (hw2); SUB hw16, hw5, hw6; ASH hw17, hw9, hw10; DIV hw18, hw13, hw14
Cycle 2: ADD hw3, hw4, hw1; MUL hw7, hw16, hw8; SUB hw11, hw17, hw12; ST (hw15), hw18
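The two-cycle schedule can be reproduced with a small greedy issue simulator. This is a sketch under stated assumptions: the program order below (each producer immediately followed by its consumer) is inferred from the schedule and is not given in the text, only RAW dependencies are modelled because renaming removes the others, and every result becomes available one cycle after issue. The function name `greedy_issue` is illustrative.

```python
# (mnemonic, destination register or None, source registers), in assumed program order.
renamed = [
    ("LD", "hw1", ["hw2"]),
    ("ADD", "hw3", ["hw4", "hw1"]),
    ("SUB", "hw16", ["hw5", "hw6"]),
    ("MUL", "hw7", ["hw16", "hw8"]),
    ("ASH", "hw17", ["hw9", "hw10"]),
    ("SUB", "hw11", ["hw17", "hw12"]),
    ("DIV", "hw18", ["hw13", "hw14"]),
    ("ST", None, ["hw15", "hw18"]),
]


def greedy_issue(instrs, units=4):
    """Return the issue cycle of each instruction, honouring RAW dependencies only."""
    issue = [None] * len(instrs)
    cycle = 0
    while any(c is None for c in issue):
        cycle += 1
        used = 0
        for i, (_, _, sources) in enumerate(instrs):
            if issue[i] is not None or used == units:
                continue
            # Latest earlier instruction writing each source register, if any.
            producers = [
                max((j for j in range(i) if instrs[j][1] == src), default=None)
                for src in sources
            ]
            # Ready when every producer issued in an earlier cycle (1-cycle latency).
            if all(p is None or (issue[p] is not None and issue[p] < cycle) for p in producers):
                issue[i] = cycle
                used += 1
    return issue


issue_cycles = greedy_issue(renamed)
print(issue_cycles)       # [1, 2, 1, 2, 1, 2, 1, 2]
print(max(issue_cycles))  # 2
```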