ASSESSING AND EVALUATING

PERFORMANCE

By
Sateesh D
154553

PERFORMANCE

Measure, Report, and Summarize


Make intelligent choices
See through the marketing hype
Which factors of system performance
are hardware related and which are
software related?
How does the machine's instruction set
affect performance?

Which has the best performance?

Which is faster?
Which is bigger?
Which plane moves the most
passengers in the least
time?

Performance Metrics

Purchasing perspective
given a collection of machines, which has the
best performance?
least cost?
best cost/performance?

Design perspective
faced with design options, which has the
best performance improvement?
least cost?
best cost/performance?

Both require
basis for comparison
metric for evaluation

Our goal is to understand what factors in the
architecture contribute to overall system
performance and the relative importance (and cost)
of these factors

Response Time and Throughput


Response time
How long it takes to do a task
Throughput
Total work done per unit time
e.g., tasks or transactions per hour
How are response time and throughput
affected by
Replacing the processor with a faster
version?
Adding more processors?

Measuring Execution Time


Elapsed time
Total response time, including all aspects
Processing, I/O, OS overhead, idle time

Determines system performance

CPU time
Time spent processing a given job
Discounts I/O time and other jobs' shares

Can be broken up into user CPU time
and system CPU time
Our focus: user CPU time, the time spent
executing the program itself
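
The difference between elapsed time and CPU time can be observed directly. The sketch below is only an illustration: `busy_work` and `measure` are hypothetical names, and the loop size and sleep length are arbitrary. It uses Python's standard `time.perf_counter()` for elapsed time and `os.times()` for the process's user and system CPU time.

```python
import os
import time


def busy_work(n: int) -> int:
    """CPU-bound loop: this shows up as user CPU time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure(n: int = 200_000, pause_s: float = 0.2) -> dict:
    """Return elapsed, user CPU, and system CPU seconds for one run."""
    wall_start = time.perf_counter()  # elapsed (wall-clock) time
    cpu_start = os.times()            # user/system CPU time of this process

    busy_work(n)         # counted as CPU time
    time.sleep(pause_s)  # idle: adds to elapsed time, not to CPU time

    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "elapsed": wall_end - wall_start,
        "user": cpu_end.user - cpu_start.user,
        "system": cpu_end.system - cpu_start.system,
    }


if __name__ == "__main__":
    t = measure()
    print(f"elapsed {t['elapsed']:.3f}s  user {t['user']:.3f}s  system {t['system']:.3f}s")
```

Elapsed time should come out larger than user plus system time here, because the sleep counts only toward elapsed time.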

Relative Performance
Performance = 1/Execution Time
"X is n times faster than Y" means
Performance X / Performance Y = Execution Time Y / Execution Time X = n
Example: time taken to run a program
10s on A, 15s on B
Execution Time B / Execution Time A
= 15s / 10s = 1.5
So A is 1.5 times faster than B
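
The same calculation as a small sketch. The function name `relative_performance` is an assumption made for illustration; the times are the ones from the example above.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """How many times faster X is than Y: perf_X / perf_Y = time_Y / time_X."""
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Example from the slide: the program takes 10 s on A and 15 s on B.
n = relative_performance(time_x=10.0, time_y=15.0)
print(f"A is {n} times faster than B")  # prints "A is 1.5 times faster than B"
```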

CPU Time
$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$

Performance improved by
Reducing number of clock cycles
Increasing clock rate
Hardware designer must often trade off
clock rate against cycle count
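
A sketch of the trade-off between clock rate and cycle count. All numbers are hypothetical and not taken from the slides: a program runs in 10 s on a 2 GHz machine, and a redesign that needs 20% more cycles must finish in 6 s. The helper names are also assumptions.

```python
def cpu_time(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


def required_clock_rate(clock_cycles: float, target_time_s: float) -> float:
    """Clock rate in Hz needed to finish the given cycles in target_time_s."""
    return clock_cycles / target_time_s


# Hypothetical numbers, not from the slides.
cycles_a = 10.0 * 2e9      # 10 s at 2 GHz -> 20e9 clock cycles
cycles_b = 1.2 * cycles_a  # the redesign costs 20% more cycles
rate_b = required_clock_rate(cycles_b, target_time_s=6.0)
print(f"B needs about {rate_b / 1e9:.1f} GHz")  # prints "B needs about 4.0 GHz"
```

Under these assumptions the clock rate has to double to cut the time from 10 s to 6 s, because part of the gain is spent on the extra cycles.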

CPI Example

Computer A: Cycle Time = 250ps, CPI = 2.0


Computer B: Cycle Time = 500ps, CPI = 1.2
Same ISA
Which is faster, and by how much?

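$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$

$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$

$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$

So A is 1.2 times faster than B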
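
The CPI example can be checked with a few lines of code. This is a minimal sketch: `cpu_time_ps` is a hypothetical helper, and the instruction count is an arbitrary value, since it cancels out when both machines run the same program on the same ISA.

```python
def cpu_time_ps(instruction_count: int, cpi: float, cycle_time_ps: float) -> float:
    """CPU time in picoseconds = instruction count x CPI x cycle time."""
    return instruction_count * cpi * cycle_time_ps


instruction_count = 1_000_000  # arbitrary; cancels in the ratio
time_a = cpu_time_ps(instruction_count, cpi=2.0, cycle_time_ps=250.0)  # I x 500 ps
time_b = cpu_time_ps(instruction_count, cpi=1.2, cycle_time_ps=500.0)  # I x 600 ps

ratio = time_b / time_a
print(f"A is {ratio:.1f} times faster than B")  # prints "A is 1.2 times faster than B"
```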

Pitfall:
Expecting the improvement of
one aspect of a
computer to increase
performance by an amount proportional to
the size of the improvement
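
The pitfall can be made concrete with the usual relation "improved time = affected time / improvement factor + unaffected time", commonly known as Amdahl's Law. The sketch below uses hypothetical numbers and a hypothetical helper name; it is not taken from the slides.

```python
def improved_time(total_s: float, affected_s: float, factor: float) -> float:
    """Execution time after speeding up only the affected part by `factor`."""
    if not 0 <= affected_s <= total_s or factor <= 0:
        raise ValueError("need 0 <= affected_s <= total_s and factor > 0")
    unaffected_s = total_s - affected_s
    return affected_s / factor + unaffected_s


# Hypothetical program: 100 s in total, 80 s of it in the part being improved.
before = 100.0
after = improved_time(before, affected_s=80.0, factor=4.0)
print(f"4x faster component -> {before / after:.2f}x faster overall")
# prints "4x faster component -> 2.50x faster overall"
```

Even with a 4x improvement to the dominant part, the untouched 20 s limits the overall gain to 2.5x.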

Effective CPI
Computing the overall effective CPI is done by
looking at the different types of instructions and
their individual cycle counts and averaging
$$\text{Overall effective CPI} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{IC}_i \right)$$

Where ICi is the count (percentage) of the number of
instructions of class i executed
CPIi is the (average) number of clock cycles per
instruction for that instruction class
n is the number of instruction classes

The overall effective CPI varies by instruction mix,
a measure of the dynamic frequency of
instructions across one or many programs
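
A minimal sketch of the weighted sum, assuming ICi is given as a fraction of all executed instructions. The instruction mix below is hypothetical and chosen only for illustration; the function names are assumptions as well.

```python
def effective_cpi(mix: dict[str, tuple[float, float]]) -> float:
    """Sum of CPI_i x IC_i, with IC_i given as a fraction of instructions."""
    total_fraction = sum(fraction for fraction, _ in mix.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(fraction * cpi for fraction, cpi in mix.values())


def time_share(mix: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Fraction of execution time spent in each instruction class."""
    overall = effective_cpi(mix)
    return {name: fraction * cpi / overall for name, (fraction, cpi) in mix.items()}


# Hypothetical mix: class -> (fraction of instructions, CPI for that class)
mix = {"arith": (0.40, 1.0), "memory": (0.35, 3.0), "branch": (0.25, 2.0)}

print(f"effective CPI = {effective_cpi(mix):.2f}")
for name, share in time_share(mix).items():
    print(f"{name}: {share:.1%} of execution time")
```

A class that is rare but expensive can still dominate execution time, which is why the time shares differ from the instruction frequencies.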

Determinants of CPU Performance

$$\text{CPU time} = \text{Instruction\_count} \times \text{CPI} \times \text{clock\_cycle}$$

[Table: which of Instruction_count, CPI, and clock_cycle (marked X) is affected by each of]
Algorithm
Programming language
Compiler
ISA
Processor organization
Technology

A Simple Example

Op        Freq    CPIi    Freq x CPIi
ALU       50%             .5
Load      20%             .4
Store     10%             .4
Branch    20%             .2

Overall effective CPI = 1.5

What percentage of time do we spend on
different instructions?
ALU = 27%
Branch = 33%
Data transfer = 40%

Performance evaluation

Programs to test performance:
Performance is best determined by
running real applications
Use programs typical of the expected
workload,
or typical of the expected class of
applications
Computer benchmarks:
Benchmark: program(s) used to
evaluate computer performance

A Look at DSP Benchmarks

Agenda
Why we need benchmarks
What makes up a good benchmark
What are some commonly used BAD
benchmarks
Choosing the right benchmark
Benchmarks used by the industry

Why Benchmarks
Want to see quickly which processor is
better. But better in what respect?
Benchmarks can be a blend of
anything:
Raw speed
Power consumption
Memory usage
Cost

What should a benchmark be?

Repeatable
Relevant
Fair
Have comparable results
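
One way to check that a timing benchmark is repeatable is to run it several times and look at the spread. The sketch below uses Python's standard `timeit.repeat`; the `workload` kernel and the `benchmark` helper are hypothetical stand-ins, not part of any benchmark suite named in these slides.

```python
import statistics
import timeit


def workload() -> int:
    """Stand-in kernel; replace with code typical of the real workload."""
    return sum(i * i for i in range(10_000))


def benchmark(func, repeats: int = 5, loops: int = 20) -> dict:
    """Time `func` several times and summarize the seconds per call."""
    totals = timeit.repeat(func, repeat=repeats, number=loops)
    per_call = [t / loops for t in totals]
    return {
        "min": min(per_call),
        "median": statistics.median(per_call),
        "spread": max(per_call) - min(per_call),
    }


if __name__ == "__main__":
    result = benchmark(workload)
    print(f"min {result['min']:.6f}s  median {result['median']:.6f}s  "
          f"spread {result['spread']:.6f}s")
```

A spread that is large relative to the median suggests the measurement is being disturbed by other activity and is not yet repeatable.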

MIPS/MOPS VERY BAD


Millions of Instructions/Operations per
second
120 MIPS can be slower than 100
MIPS
Some instructions do more work than
others.
Ex. Loading a 32-bit constant on the 68k
takes 2 instructions; just one on the
SHARC
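
A sketch of why a higher MIPS rating can still lose. The two processors and their instruction counts are hypothetical: the 120 MIPS part is assumed to need 1.5 times as many instructions for the same task because each of its instructions does less work.

```python
def execution_time_s(instruction_count: float, mips: float) -> float:
    """Seconds to run a program = instructions / (MIPS x 10^6)."""
    return instruction_count / (mips * 1e6)


# Hypothetical processors and instruction counts for the same task.
time_p = execution_time_s(instruction_count=1.5e9, mips=120)  # more, simpler instructions
time_q = execution_time_s(instruction_count=1.0e9, mips=100)  # fewer, richer instructions

print(f"120 MIPS part: {time_p:.1f} s, 100 MIPS part: {time_q:.1f} s")
# prints "120 MIPS part: 12.5 s, 100 MIPS part: 10.0 s"
```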

Applications Not Great


Commonly used for non-DSP oriented
processors. Ex. SPEC, BYTEmark
VERY compiler dependent. May or
may not use available fancy DSP
features
Even if implemented in assembly, it's
still a test of the programmer's skill

Applications (Contd.)
May measure the evaluation kit that
came with the DSP processor; not
just the DSP processor.

Which benchmark to
choose?
Architecture independence
Should perfectly reflect what the DSP
chip will be used for
Should blend in factors such as cost,
power usage, in the proportions you
care about
***It probably doesn't exist***

Trust the professionals


Famous last words: Let's trust the industry.
What's available:
Vendor benchmarks
EEMBC (Embedded Microprocessor Benchmark
Consortium)
BDTI

TI and Analog Devices quote benchmarks

Why do they use BDTImark?

What does BDTImark test?

Benchmark results:

Application profiling:

Conclusions
Benchmarks should be repeatable,
relevant, fair, and readily comparable
Benchmarks test an application that
probably isn't the same as yours
Take benchmarks with a grain of salt

Questions?