CPU Performance Evaluation

Friday, July 09, 2021 1:35 PM

CPI : Cycles Per Instruction

Clock : Every computer runs with a clock that runs at a constant clock rate or clock frequency (f).

A 1Hz clock rate means 1 cycle is completed per second.

1kHz = 10^3 Hz
1MHz = 10^6 Hz
1GHz = 10^9 Hz
1THz = 10^12 Hz

A machine (ISA) instruction consists of a number of elementary operations (micro-operations), which vary in number and complexity depending on the instruction and the exact CPU organization (design).

CPI tells how many cycles are required to complete an instruction.

Effective CPI or Average CPI

For a computer we are interested in the effective CPI, because different types of instructions take different numbers of cycles to complete.

Computer Performance measure :

Program Execution Time

Depends upon the Average CPI of the program.


Average CPI X clock cycle time = Time to complete one instruction

Machine A : 1 Hz clock rate
Machine B : 2 Hz clock rate

Program contains n instructions (instruction count, IC).

Program Execution Time = Average CPI X clock cycle time X number of instructions
                       = IC X CPI X clock cycle time

Example :
A 2Hz machine runs a program that consists of 100 instructions, and each instruction takes 5 cycles. Find the program execution time.

Program Execution Time = 100 X 5 X 1/f
                       = 100 X 5 X 1/2 sec
                       = 250 sec

The performance of a computer depends on the execution time: the lower the execution time, the better the performance.

In other words,

Performance of a machine = 1 / Execution Time of the machine
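
The execution time formula and the 2Hz example can be checked with a minimal Python sketch. The function and parameter names here are illustrative and do not come from the notes.

```python
def execution_time(instruction_count, cpi, clock_rate_hz):
    """Return program execution time in seconds: IC x CPI x clock cycle time."""
    clock_cycle_time = 1.0 / clock_rate_hz  # seconds per cycle (C = 1/f)
    return instruction_count * cpi * clock_cycle_time


def performance(exec_time_seconds):
    """Performance as defined in the notes: 1 / execution time."""
    return 1.0 / exec_time_seconds


# Values from the example: 100 instructions, 5 cycles each, 2 Hz clock.
t = execution_time(instruction_count=100, cpi=5, clock_rate_hz=2)
print(t)               # 250.0
print(performance(t))  # 0.004
```

A lower execution time gives a higher performance value, which matches the statement above.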
Effective CPI :

Mixed instruction set

Arithmetic instructions ------- x cycles/instruction
Data transfer instructions ---- y cycles/instruction
Control flow instructions ----- z cycles/instruction

Example :
A 5Hz machine runs a program that consists of 1000 instructions. It is found that 50% of the instructions are data transfer instructions, 30% are arithmetic instructions, and the remaining 20% are control instructions. If the CPIs of the data transfer, arithmetic and control instructions are 5, 6 and 7 cycles per instruction respectively, find the effective CPI and the CPU execution time.

In the above problem, the instruction set is of mixed type.

Effective CPI or Average CPI

Type of inst     Fraction of inst   CPI   CPI X fraction
Data transfer    0.5                5     2.5
Arithmetic       0.3                6     1.8
Control          0.2                7     1.4

Effective or Average CPI of the program
= 2.5 + 1.8 + 1.4
= 5.7

Frequency or clock rate (f) = 5Hz
Clock cycle time C = 1/f = 1/5 sec

Execution Time = IC X Effective CPI X clock cycle time
               = 1000 X 5.7 X 1/5 sec
               = 1140 sec

Alternative calculation of the effective or Average CPI

The total number of instructions is 1000.

Data transfer instructions are 50%, so there are 500 of them. The CPI of data transfer is 5, so 500 instructions require 500 X 5 cycles.

Arithmetic instructions are 30%, so there are 300 of them. The CPI of arithmetic is 6, so 300 instructions require 300 X 6 cycles.

The remaining 20%, or 200 instructions, are control instructions. The CPI of control is 7, so 200 instructions require 200 X 7 cycles.

Effective CPI = Total number of clock cycles / Total number of instructions
              = (500 X 5 + 300 X 6 + 200 X 7) / 1000
              = 5700 / 1000
              = 5.7
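
Both ways of computing the effective CPI can be compared in a short Python sketch. The dictionary layout and variable names are assumptions made for illustration; the numbers are those of the 5Hz example.

```python
# (fraction of instructions, CPI) for each instruction class in the example.
mix = {
    "data transfer": (0.5, 5),
    "arithmetic": (0.3, 6),
    "control": (0.2, 7),
}
instruction_count = 1000
clock_rate_hz = 5

# Method 1: weighted average of the per-class CPIs.
effective_cpi = sum(fraction * cpi for fraction, cpi in mix.values())

# Method 2: total clock cycles divided by total instructions.
total_cycles = sum(
    round(fraction * instruction_count) * cpi for fraction, cpi in mix.values()
)
effective_cpi_alt = total_cycles / instruction_count

# Execution time = total cycles x clock cycle time = total cycles / f.
execution_time = total_cycles / clock_rate_hz

print(round(effective_cpi, 6), effective_cpi_alt, execution_time)  # 5.7 5.7 1140.0
```

The two methods agree because dividing the total cycle count by IC is the same weighted sum with the fractions written as counts.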

The performance of a computer depends on the execution time: the lower the execution time, the better the performance.

In other words,

Performance of a machine = 1 / Execution Time of the machine


Speed Up factor is used to compare the performance of two machines.
Ex : we have 2 machines : Machine A and Machine B

Speed Up = Performance of Machine A / Performance of Machine B
         = Execution time of B / Execution time of A

If Machine A is 5 times faster than Machine B, then

Speed Up = 5 = Performance of Machine A / Performance of Machine B

Performance of A = 5 X Performance of Machine B

Example : Machine A has the specification : 5Hz machine, effective CPI is 5.2. Machine B has the specification : 8Hz machine, effective CPI is 7.2. Find the Speed Up factor of Machine A relative to Machine B, assuming the instruction count (IC) is the same for both machines.

Speed Up = Performance of Machine A / Performance of Machine B
         = Execution time of B / Execution time of A
         = (IC X CPI_B X C_B) / (IC X CPI_A X C_A)
         = (7.2 X 1/8) / (5.2 X 1/5)
         = 0.9 / 1.04
         = 0.865 (approx.)

Since the Speed Up is less than 1, Machine A is slower than Machine B for this program.
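
The comparison of Machine A and Machine B can be reproduced with a small Python sketch. The instruction count of 1000 is a hypothetical placeholder: the example only says IC is the same on both machines, and any common value cancels out of the ratio.

```python
def execution_time(instruction_count, cpi, clock_rate_hz):
    """Execution time in seconds: IC x CPI x (1 / f)."""
    return instruction_count * cpi / clock_rate_hz


def speedup_a_over_b(time_a, time_b):
    """Speed Up of A relative to B = execution time of B / execution time of A."""
    return time_b / time_a


ic = 1000  # hypothetical; the same IC is used for both machines
time_a = execution_time(ic, cpi=5.2, clock_rate_hz=5)  # Machine A
time_b = execution_time(ic, cpi=7.2, clock_rate_hz=8)  # Machine B

s = speedup_a_over_b(time_a, time_b)
print(f"{s:.3f}")  # 0.865
```

A value below 1 means Machine A takes longer than Machine B on this program, even though its CPI is lower, because its clock is slower.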

Factors impacting the Speed Up :
-----> Compiler
-----> ISA
-----> VLSI

Example : Assume we have a machine (A) with a 1MHz clock rate that runs a benchmark program consisting of 1000 instructions with a CPI of 2. We want to improve the performance of this machine by changing some hardware, which raises the clock rate to 2MHz and changes the CPI to 1.5. We also optimize the compiler of the machine so that the instruction count of the benchmark program drops to 900. Find the Speed Up in performance of this machine.

Speed Up = Performance of new machine / Performance of old machine
         = Execution time of old machine / Execution time of new machine
         = (1000 X 2 X 1/(1 X 10^6)) / (900 X 1.5 X 1/(2 X 10^6))
         = (2 X 10^-3 sec) / (0.675 X 10^-3 sec)
         = 2.96 (approx.)
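
One way to read this result is to split the Speed Up into three ratios, one for each quantity in IC X CPI X C. The sketch below is an illustration under that reading; the dictionary keys and variable names are assumptions, and the values are those of the example.

```python
old = {"ic": 1000, "cpi": 2.0, "clock_hz": 1e6}
new = {"ic": 900, "cpi": 1.5, "clock_hz": 2e6}

# Each ratio is greater than 1 when the change helps.
ic_factor = old["ic"] / new["ic"]                 # fewer instructions (compiler)
cpi_factor = old["cpi"] / new["cpi"]              # fewer cycles per instruction
clock_factor = new["clock_hz"] / old["clock_hz"]  # shorter clock cycle

speedup = ic_factor * cpi_factor * clock_factor
print(
    f"IC {ic_factor:.3f} x CPI {cpi_factor:.3f} "
    f"x clock {clock_factor:.3f} = {speedup:.3f}"
)
# IC 1.111 x CPI 1.333 x clock 2.000 = 2.963
```

The product equals the ratio of the old execution time to the new one, so the three changes multiply rather than add.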

MIPS rating : Millions of instructions per second

MIPS rating = Instruction count (IC) / (Execution Time X 10^6)
            = IC / (IC X Avg. CPI X C X 10^6)
            = 1 / (Avg. CPI X C X 10^6)
            = f / (Avg. CPI X 10^6)

Suppose a benchmark program is run at a 100MHz clock rate. The executed program consists of 10,000 instructions with an effective CPI of 1.5. Find the MIPS rating for the program.

MIPS rating = (100 X 10^6) / (1.5 X 10^6) = 100 / 1.5 = 66.67 MIPS
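
The two forms of the MIPS formula (from IC and execution time, and from f and average CPI) can be checked against each other in Python. This is a minimal sketch with illustrative function names.

```python
def mips_from_clock(clock_rate_hz, avg_cpi):
    """MIPS rating = f / (Avg. CPI x 10^6)."""
    return clock_rate_hz / (avg_cpi * 1e6)


def mips_from_counts(instruction_count, exec_time_seconds):
    """MIPS rating = IC / (Execution Time x 10^6)."""
    return instruction_count / (exec_time_seconds * 1e6)


# Values from the example: 10,000 instructions, CPI 1.5, 100 MHz clock.
ic, cpi, f = 10_000, 1.5, 100e6
exec_time = ic * cpi / f  # IC x CPI x C, in seconds

print(round(mips_from_clock(f, cpi), 2))          # 66.67
print(round(mips_from_counts(ic, exec_time), 2))  # 66.67
```

The instruction count cancels in the second form, which is why the MIPS rating here depends only on the clock rate and the average CPI.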

Amdahl's law : Used to find the performance gain obtained from improving a portion of a computer (or task).

Amdahl's law is an expression used to find the maximum expected improvement to an overall
system when only part of the system is improved. It is often used in parallel computing to
predict the theoretical maximum speedup using multiple processors.

[Figure: a task divided into a serialized portion and a parallelized portion]

Overall Speed Up = Performance of the task with enhancement / Performance of the task without enhancement
                 = Execution time of the task without enhancement / Execution time of the task with enhancement   --------(1)

Execution time with enhancement = ((1 - f) + f/S) X Execution time without enhancement

Substituting into (1):

Execution time without enhancement / Execution time with enhancement = 1 / ((1 - f) + f/S)

Overall Speed Up = 1 / ((1 - f) + f/S)

f = fraction (portion) of the task that is enhanced
(1 - f) = portion of the task that is not enhanced
S = Speed Up factor of the enhanced portion
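
Amdahl's law as written above translates directly into a small Python function. The sketch below is illustrative only: the function name, the input checks and the sample values (f = 0.5, S = 2) are assumptions, not part of the notes.

```python
import math


def amdahl_speedup(f, s):
    """Overall Speed Up = 1 / ((1 - f) + f / s).

    f: fraction of the task that is enhanced (0 to 1)
    s: Speed Up factor of the enhanced portion (> 0, may be math.inf)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - f) + f / s)


# Hypothetical values: half of the task is made twice as fast.
print(round(amdahl_speedup(0.5, 2), 4))  # 1.3333

# With an infinitely fast enhancement only the unenhanced half remains.
print(amdahl_speedup(0.5, math.inf))     # 2.0
```

Passing `math.inf` for `s` makes the f/S term zero, which gives the upper bound 1 / (1 - f).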

Example: A person covers the distance from source to destination in 10 hrs by walking. Suppose the narrow road can only be covered by walking, and the remaining (wide) road can be covered by walking, cycling or riding a bike, as shown in the figure.

[Figure: three versions of the trip. (1) Without enhancement: travelling only by walking. (2) Enhancement with walking + cycling, with an enhanced Speed Up of 3 times. (3) Enhancement with walking + riding a bike, with an enhanced Speed Up of 9 times.]

Note: The narrow road is only for walking. The wide road can be used for walking, cycling, riding a bike, etc.

Next we want to find the improvement in overall performance.

Assume cycling gives a Speed Up of 3 times and the bike gives a Speed Up of 9 times compared to walking.

First we have to find the fraction of the trip that can be enhanced. In the calculations below, the narrow road accounts for 0.1 of the walking time and the wide road for the remaining f = 0.9.

Execution time using cycling
= (0.1 + 0.9/S) X Execution time without enhancement
= (0.1 + 0.9/3) X 10 = 1 + 3 = 4 hr

Execution time using bike
= (0.1 + 0.9/S) X Execution time without enhancement
= (0.1 + 0.9/9) X 10 = 1 + 1 = 2 hr

Overall Speed Up with cycling = 1 / (0.1 + 0.9/3) = 1 / 0.4 = 2.5
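
The trip example can be reproduced with a short Python sketch. The variable names are illustrative, and the enhanceable fraction of 0.9 is taken from the calculation rather than stated as a separate fact. The overall Speed Up for the bike is not worked out in the notes; the sketch derives it from the same formula.

```python
walking_time_hr = 10.0      # time for the whole trip on foot
enhanceable_fraction = 0.9  # share of the walking time spent on the wide road


def enhanced_time(base_time, f, s):
    """Time with enhancement = ((1 - f) + f / s) x time without enhancement."""
    return ((1.0 - f) + f / s) * base_time


for mode, s in [("cycling", 3), ("bike", 9)]:
    t = enhanced_time(walking_time_hr, enhanceable_fraction, s)
    print(f"{mode}: {t:.1f} hr, overall speedup {walking_time_hr / t:.2f}")
# cycling: 4.0 hr, overall speedup 2.50
# bike: 2.0 hr, overall speedup 5.00
```

Tripling the speed on the wide road from 3 times to 9 times only doubles the overall Speed Up, because the 1 hr on the narrow road is unchanged.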

Example :
Let a program have 40 percent of its code enhanced to run 2.3 times faster. What is the overall system Speed Up? What is the maximum Speed Up of the machine?

Overall Speed Up = 1 / ((1 - f) + f/S)

f = portion of the task that is enhanced = 40% = 0.4
(1 - f) = 0.6
S = 2.3

Overall Speed Up = 1 / (0.6 + 0.4/2.3) = 1.292

The maximum Speed Up is reached when the enhanced portion becomes infinitely fast (S tends to infinity):

Maximum Speed Up = 1 / (0.6 + 0.4/infinity) = 1 / 0.6 = 1.667
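
The maximum Speed Up can be seen numerically by letting S grow. In the Python sketch below, the sample values of S other than 2.3 are arbitrary choices for illustration, and the function names are assumptions.

```python
def overall_speedup(f, s):
    """Amdahl's law: 1 / ((1 - f) + f / s)."""
    return 1.0 / ((1.0 - f) + f / s)


def max_speedup(f):
    """Limit of the overall Speed Up as s tends to infinity: 1 / (1 - f)."""
    return 1.0 / (1.0 - f)


f = 0.4  # 40% of the code is enhanced
for s in (2.3, 10, 100, 1_000_000):
    print(s, round(overall_speedup(f, s), 4))

print("limit", round(max_speedup(f), 4))
```

However large S becomes, the result stays below 1 / 0.6, because the unenhanced 60% of the code always takes the same time.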

A computer with a single core executes a task such that 25% of the task is serialized and the remaining 75% can be parallelized. If we increase the number of cores, what is the overall Speed Up?

Relative execution time :
One core   = 25% + 75% = 0.25 + 0.75
Two cores  = 25% + 75% X (1/2)
Four cores = 25% + 75% X (1/4)

Speed Up :
Two cores   = 1 / (0.25 + 0.75/2) = 1.6
Four cores  = 1 / (0.25 + 0.75/4) = 2.29
Eight cores = 1 / (0.25 + 0.75/8) = 2.91

Maximum Speed Up = 1 / (0.25 + 0.75/infinity) = 1 / 0.25 = 4

What is the maximum number of cores you can use so that a Speed Up gain of up to 8 can be achieved?
Answer : this is impractical, as the maximum Speed Up gain is 4.
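
The multi-core numbers, and the reason a Speed Up of 8 is out of reach, can be checked with the Python sketch below. The helper `cores_needed` is a hypothetical function obtained by solving the Speed Up formula for the number of cores. It is not part of the notes, and it can return a fractional core count.

```python
import math


def speedup_with_cores(serial_fraction, cores):
    """Speed Up = 1 / (serial + parallel / cores), with parallel = 1 - serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def cores_needed(serial_fraction, target_speedup):
    """Cores needed to reach target_speedup, or None if it cannot be reached."""
    limit = math.inf if serial_fraction == 0 else 1.0 / serial_fraction
    if target_speedup >= limit:
        return None  # the serial part alone already takes too long
    # Solve target = 1 / (serial + (1 - serial) / n) for n.
    return (1.0 - serial_fraction) / (1.0 / target_speedup - serial_fraction)


for n in (1, 2, 4, 8):
    print(n, round(speedup_with_cores(0.25, n), 2))

print(cores_needed(0.25, 2))  # 3.0
print(cores_needed(0.25, 8))  # None
```

With 25% of the task serialized, the Speed Up approaches 4 as cores are added but never reaches it, so no number of cores gives a Speed Up of 8.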
