# Measuring Performance

Chris Clack B261 Systems Architecture


## Outline

• Why measure performance?
• Characterising performance
• Performance and speed
• Basic terminology for measuring
• Performance and Execution Time

## Why Measure Performance?

• SO THAT YOU KNOW WHAT YOU'RE SUPPOSED TO BE GETTING!
• So that you can compare systems!
• So that you can know if the machine can do its tasks.

## Characterising Performance

• Not always obvious how to characterise performance:
  • motor cars
  • football teams
  • tennis players
  • computers?
• Can lead to serious errors
  • improve the processor to improve speed?

## Performance and Speed

• Performance for a program on a particular machine:

$$\text{Performance}(X) = \frac{1}{\text{Execution}(X)}$$

$$\frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{\text{Execution}(Y)}{\text{Execution}(X)} = n$$

• X is n times faster than Y
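The ratio above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the two execution times (10 s and 15 s) are assumptions made up for the example, not figures from the lecture.

```python
def performance(execution_time_s: float) -> float:
    """Performance is the reciprocal of execution time."""
    if execution_time_s <= 0:
        raise ValueError("execution time must be positive")
    return 1.0 / execution_time_s


def times_faster(exec_time_x_s: float, exec_time_y_s: float) -> float:
    """Return n such that X is n times faster than Y."""
    return performance(exec_time_x_s) / performance(exec_time_y_s)


# Hypothetical timings: X runs the program in 10 s, Y in 15 s.
n = times_faster(10.0, 15.0)
print(f"X is {n:.2f} times faster than Y")
```

Note that the ratio of performances is the ratio of execution times turned upside down, so a shorter execution time always gives n > 1.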

## Measuring Time

• Execution time is the amount of time it takes the program to execute, in seconds.
• Time (computers do several tasks!)
  • Elapsed time is based on a normal clock.
  • CPU time is time spent executing this program
    • excluding waiting for I/O or other programs

## Execution Time

• Elapsed time (real time) is made up of:
  • CPU time for this program
    • user (direct)
    • system (time in OS)
  • I/O waiting & other programs
• Example (UNIX): `11.099u 8.659s 10:43.67 3.0%`
  • 11.099u: user CPU time (seconds)
  • 8.659s: system CPU time (seconds)
  • 10:43.67: elapsed time (min:secs)
  • 3.0%: CPU time as a share of elapsed time
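The difference between elapsed time and CPU time can be observed directly. The sketch below uses Python's standard `time.perf_counter` (a wall clock) and `time.process_time` (user plus system CPU time of the current process); the helper names `measure` and `cpu_share` are assumptions for illustration. It also recomputes the CPU share for the UNIX figures quoted above (11.099 s user, 8.659 s system, 10:43.67 elapsed).

```python
import time


def measure(task):
    """Run task() and return (elapsed seconds, CPU seconds)."""
    start_wall = time.perf_counter()
    start_cpu = time.process_time()
    task()
    cpu_s = time.process_time() - start_cpu
    elapsed_s = time.perf_counter() - start_wall
    return elapsed_s, cpu_s


def cpu_share(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of elapsed time spent as CPU time (user + system)."""
    return (user_s + system_s) / elapsed_s


# Sleeping stands in for "waiting for I/O": elapsed time grows, CPU time barely moves.
elapsed_s, cpu_s = measure(lambda: time.sleep(0.2))
print(f"elapsed {elapsed_s:.3f} s, CPU {cpu_s:.3f} s")

# The UNIX example: 10 min 43.67 s elapsed, of which roughly 3% is CPU time.
slide_share = cpu_share(11.099, 8.659, 10 * 60 + 43.67)
print(f"CPU share of elapsed time: {slide_share:.3f}")
```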

## Measuring Amounts

• 1 bit
• 8 bits = 1 byte
• 1024 bytes = 1 kilobyte = 1KByte = 1K = $2^{10}$ bytes
• 1024 KBytes = 1 megabyte = 1MB = $2^{20}$ bytes
• 1024 MB = 1 gigabyte = 1GB = $2^{30}$ bytes
• 1024 GB = 1 terabyte = $2^{40}$ bytes, and on to infinity
• Aside: 1 decade ago my home computer had 32K memory. Today it has 32MB, and needs to be upgraded to at least 64MB!

## Measuring Times

• Duration
  • 1 second
  • 1/1000 second = 1 millisec = 1ms = $10^{-3}$ s
  • 1/1,000,000 s = 1 microsec = $10^{-6}$ s
  • 1/1,000,000,000 s = 1 nanosec = $10^{-9}$ s
• Frequency
  • 1 Hertz = 1 cycle per second
  • 1 MHz = 1,000,000 cycles per sec
  • 100MHz = 100,000,000 cycles per sec

## Computer Clock Times

• Computers run according to a clock that runs at a steady rate.
• The time interval is called a clock cycle (e.g. 10ns).
• The clock rate is the reciprocal of the clock cycle: a frequency, how many cycles per sec (e.g. 100MHz).
  • 10 ns = 1/100,000,000 s (clock cycle), same as:
  • 1/10ns = 100,000,000 cycles per sec = 100MHz (clock rate).
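The reciprocal relationship between clock cycle and clock rate can be checked with a small sketch. The function names are assumptions chosen for this example; the 10 ns and 100 MHz figures are the ones used on the slide.

```python
def clock_rate_hz(cycle_time_s: float) -> float:
    """Clock rate (cycles per second) from the clock cycle time in seconds."""
    return 1.0 / cycle_time_s


def cycle_time_s(rate_hz: float) -> float:
    """Clock cycle time in seconds from the clock rate in Hz."""
    return 1.0 / rate_hz


rate = clock_rate_hz(10e-9)      # a 10 ns clock cycle
print(f"{rate / 1e6:.0f} MHz")   # clock rate in MHz

cycle = cycle_time_s(100e6)      # a 100 MHz clock
print(f"{cycle * 1e9:.0f} ns")   # cycle time in nanoseconds
```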

## Purchasing Decision

• Computer A has a 100MHz processor
• Computer B has a 300MHz processor
• So, B is faster, right?
  • WRONG!
• Now, let's get it right…

## Measuring Performance

• The only important question: "HOW FAST WILL MY PROGRAM RUN?"
• CPU execution time for a program
  • = CPU clock cycles * cycle time
  • (= CPU clock cycles / clock rate)
• In computer design, trade-off between:
  • clock cycle time, and
  • number of cycles required for a program

## Cycles Per Instruction

• The execution time of a program clearly must depend on the number of instructions
  • but different instructions take different times
• An expression that includes this is:
  • CPU clock cycles = N * CPI
  • N = number of instructions
  • CPI = average clock cycles per instruction
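Because different instructions take different numbers of cycles, CPI is a weighted average over the instructions a program actually executes. A minimal sketch follows; the instruction classes, counts, and per-class cycle costs are entirely hypothetical and serve only to show the arithmetic.

```python
def average_cpi(mix: dict) -> float:
    """Average cycles per instruction.

    mix maps an instruction class name to (instruction count, cycles per instruction).
    """
    total_instructions = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    return total_cycles / total_instructions


# Hypothetical instruction mix for one run of a program.
mix = {
    "alu": (500, 1),         # 500 instructions at 1 cycle each
    "load_store": (300, 2),  # 300 instructions at 2 cycles each
    "branch": (200, 3),      # 200 instructions at 3 cycles each
}

n = sum(count for count, _ in mix.values())  # N = number of instructions
cpi = average_cpi(mix)
print(f"N = {n}, CPI = {cpi:.2f}, CPU clock cycles = {n * cpi:.0f}")
```

The same program on the same machine can show a different CPI if its instruction mix changes, which is why CPI is quoted per program.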

## Example

• Machine A
  • clock cycle time 10ns/cycle
  • CPI = 2.0 for prog X
• Machine B
  • clock cycle time 30ns/cycle
  • CPI = 0.5 for prog X
• Let I = number of instructions in the program.
  • CPU clock cycles (A) = I * 2.0
  • CPU time (A) = CPU clock cycles * clock cycle time = I * 2.0 * 10 = I * 20 ns
  • CPU clock cycles (B) = I * 0.5
  • CPU time (B) = CPU clock cycles * clock cycle time = I * 0.5 * 30 = I * 15 ns

$$\frac{\text{Performance}(A)}{\text{Performance}(B)} = \frac{\text{Execution}(B)}{\text{Execution}(A)} = \frac{15}{20} = 0.75$$
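The example's arithmetic can be reproduced with a short sketch, using the slide's figures (10 ns/cycle and CPI 2.0 for Machine A, 30 ns/cycle and CPI 0.5 for Machine B). The instruction count of one million is an assumption; it cancels out of the ratio, so any positive value gives the same result.

```python
def cpu_time_ns(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """CPU time = instruction count * cycles per instruction * clock cycle time."""
    return instructions * cpi * cycle_time_ns


I = 1_000_000  # hypothetical instruction count for prog X

time_a = cpu_time_ns(I, cpi=2.0, cycle_time_ns=10.0)  # I * 20 ns
time_b = cpu_time_ns(I, cpi=0.5, cycle_time_ns=30.0)  # I * 15 ns

# Performance(A) / Performance(B) = Execution(B) / Execution(A)
ratio = time_b / time_a
print(f"A: {time_a / 1e6:.0f} ms, B: {time_b / 1e6:.0f} ms, ratio = {ratio}")
```

A ratio below 1 means Machine A delivers less performance than Machine B on this program, even though A has the shorter clock cycle.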

## Basic Performance Equation

• CPU Time = I * CPI * T
  • I = number of instructions in program
  • CPI = average cycles per instruction
  • T = clock cycle time
• CPU Time = I * CPI / R
  • R = 1/T, the clock rate
• T or R are usually published as performance measures for a processor
• I requires special profiling software
• CPI depends on many factors (including memory).

## Other "tricks of the trade"

• MIPS
  • Million Instructions Per Second
• MFLOPS
  • Million Floating Point Operations Per Second
• Benchmarks: SPECs
  • Average Performance over a set of example programs
• Are any of these accurate? or even useful?

## Marketing Metrics (Patterson)

$$\text{MIPS} = \frac{\text{Instruction Count}}{\text{Time} \times 10^6} = \frac{\text{Clock Rate}}{\text{CPI} \times 10^6}$$

• machines with different instruction sets?
• programs with different instruction mixes?
  • dynamic frequency of instructions
• uncorrelated with performance

$$\text{MFLOP/S} = \frac{\text{FP Operations}}{\text{Time} \times 10^6}$$

• machine dependent
• often not where time is spent
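A small sketch shows both forms of the MIPS formula and why the metric can mislead when machines have different instruction sets. The two machines P and Q, their instruction counts, and their run times are hypothetical, invented only to illustrate the point.

```python
def mips_from_count(instruction_count: float, time_s: float) -> float:
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (time_s * 1e6)


def mips_from_clock(clock_rate_hz: float, cpi: float) -> float:
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


# Both forms agree: a 100 MHz machine with CPI 2.0 runs 2,000,000 instructions in 0.04 s.
print(mips_from_clock(100e6, 2.0), mips_from_count(2_000_000, 0.04))

# Hypothetical machines running the same task with different instruction sets:
# P needs 2,000,000 simple instructions and takes 0.020 s,
# Q needs 1,000,000 richer instructions and takes 0.015 s.
mips_p = mips_from_count(2_000_000, 0.020)
mips_q = mips_from_count(1_000_000, 0.015)
print(f"P: {mips_p:.1f} MIPS, Q: {mips_q:.1f} MIPS")
```

In this made-up case P has the higher MIPS rating, yet Q finishes the task sooner, which is the sense in which MIPS can be uncorrelated with performance.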

## Amdahl's Law

Speedup due to enhancement E:

$$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$$

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected. Then:

$$\text{ExTime(with } E) = \left((1 - F) + \frac{F}{S}\right) \times \text{ExTime(without } E)$$

$$\text{Speedup(with } E) = \frac{\text{ExTime(without } E)}{\left((1 - F) + \frac{F}{S}\right) \times \text{ExTime(without } E)} = \frac{1}{(1 - F) + \frac{F}{S}}$$
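Amdahl's Law is easy to explore numerically. The sketch below implements the speedup formula with F as the accelerated fraction and S as the acceleration factor; the function name and the sample values of F and S are assumptions chosen for illustration.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the task is accelerated by `factor`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical enhancements.
print(f"{amdahl_speedup(0.5, 2.0):.3f}")   # half the task made twice as fast
print(f"{amdahl_speedup(0.9, 10.0):.3f}")  # 90% of the task made ten times as fast
print(f"{amdahl_speedup(0.5, 1e9):.3f}")   # half the task made almost free
```

The last case shows the practical consequence: however large S becomes, the overall speedup cannot exceed 1 / (1 - F), so the unaffected part of the task sets the limit.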

## Summary

• Dealt with issues of measuring performance
• Considered terminology for measuring performance
  • bits
  • bytes
  • megabytes
  • nanoseconds
  • MHz
• Seen the basic formulae for execution time.