
Morgan Kaufmann Publishers August 20, 2021

Defining Performance
• Which airplane has the best performance?

[Figure: four bar charts comparing the Boeing 777, Boeing 747, BAC/Sud Concorde, and Douglas DC-8-50 on Passenger Capacity, Cruising Range (miles), Cruising Speed (mph), and Passengers x mph]

Basic Metrics

Chapter 1 — Computer Abstractions and Technology 1



Response Time and Throughput

• throughput: work per unit time
– = (1 / latency) when there is NO OVERLAP
– > (1 / latency) when there is overlap
• in real processors there is always overlap
– good metric for a fixed amount of time (maximize work)

[Figure labels: "10 time units"; "Finish each time unit"]

Throughput vs. Latency


• What is better?
– A machine that always takes 1 ns to do “task X” 1 time
– A machine that takes 15 ns to do “task X” 30 times...
• ...but 5 ns to do “task X” 1 time

– Machine 1:
• a lower latency for a single operation...
– Machine 2:
• better throughput for multiple operations

– What’s better?
• depends on what kind of computation you need to do
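
The comparison above can be checked with a little arithmetic. What follows is a minimal sketch using the two machines' numbers from this slide; the helper name `throughput` is hypothetical and only serves the illustration.

```python
# Latency: time to finish one task. Throughput: tasks finished per unit time.

def throughput(tasks: int, total_time_ns: float) -> float:
    """Tasks completed per nanosecond."""
    return tasks / total_time_ns

# Machine 1: always 1 ns per task, with no overlap between tasks.
m1_latency_ns = 1.0
m1_throughput = throughput(1, 1.0)    # equals 1 / latency

# Machine 2: 5 ns for a single task, but 30 overlapped tasks finish in 15 ns.
m2_latency_ns = 5.0
m2_throughput = throughput(30, 15.0)  # greater than 1 / latency

print(f"Machine 1: latency {m1_latency_ns} ns, throughput {m1_throughput} tasks/ns")
print(f"Machine 2: latency {m2_latency_ns} ns, throughput {m2_throughput} tasks/ns")
```

Machine 1 wins on latency (1 ns against 5 ns), while Machine 2 wins on throughput (2 tasks/ns against 1 task/ns), which is why the answer depends on the workload.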


Where latency matters

Ratios of Measurements


Relative Performance
• Define Performance = 1/Execution Time
• “X is n times faster than Y”

\[
\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n
\]

• Example: time taken to run a program
– 10s on A, 15s on B
– Execution Time_B / Execution Time_A = 15s / 10s = 1.5
– So A is 1.5 times faster than B
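
The ratio in this example can be reproduced in a few lines. This is a small sketch only; the function name `times_faster` is an assumption made for the illustration.

```python
def times_faster(exec_time_x_s: float, exec_time_y_s: float) -> float:
    """Return n such that 'X is n times faster than Y'.

    Performance is 1 / execution time, so the performance ratio
    Performance_X / Performance_Y equals Time_Y / Time_X.
    """
    return exec_time_y_s / exec_time_x_s

time_a_s = 10.0  # program runs in 10 s on A
time_b_s = 15.0  # program runs in 15 s on B

n = times_faster(time_a_s, time_b_s)
print(f"A is {n} times faster than B")
```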

Example: Speedup


Speedup and % Increase and Decrease

Mean (Average) Performance Numbers


For Example…


Average execution time

                      Computer X (s)   Computer Y (s)   Speedup (X to Y)
App A                 9                18               2
App B                 10               7                0.7
App C                 5                11               2.2
Ave. execution time   8                12               1.5
Geometric mean        7.66             11.15            1.456

Arithmetic mean of the three per-application speedups: 1.63
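
The summary rows of this table can be recomputed from the per-application times. The sketch below assumes the unlabeled bottom row holds geometric means, which is what the arithmetic suggests; the helper names are hypothetical.

```python
import math

# Execution times in seconds, taken from the table.
times_x = {"App A": 9.0, "App B": 10.0, "App C": 5.0}
times_y = {"App A": 18.0, "App B": 7.0, "App C": 11.0}

def arithmetic_mean(values):
    values = list(values)
    return sum(values) / len(values)

def geometric_mean(values):
    values = list(values)
    return math.prod(values) ** (1.0 / len(values))

# Per-application speedup of X over Y: time on Y divided by time on X.
speedups = [times_y[app] / times_x[app] for app in times_x]

am_x = arithmetic_mean(times_x.values())
am_y = arithmetic_mean(times_y.values())
gm_x = geometric_mean(times_x.values())
gm_y = geometric_mean(times_y.values())

print(f"arithmetic means: X={am_x:.2f}  Y={am_y:.2f}  ratio={am_y / am_x:.3f}")
print(f"geometric means:  X={gm_x:.2f}  Y={gm_y:.2f}  ratio={gm_y / gm_x:.3f}")
print(f"arithmetic mean of speedups: {arithmetic_mean(speedups):.2f}")
print(f"geometric mean of speedups:  {geometric_mean(speedups):.3f}")
```

With arithmetic means, the ratio of the averages (1.5) differs from the average of the ratios (about 1.63). With geometric means the two agree, at about 1.455; the table's 1.456 appears to come from dividing the rounded means 11.15 and 7.66.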


Mean (Average) Performance Numbers


Derived metrics
• Often we care about multiple metrics at once.
• Examples (bigger is better)
– Bandwidth per dollar (e.g., in networking, (GB/s)/$)
– BW/Watt (e.g., in memory systems, (GB/s)/W)
– Work/Joule (e.g., instructions/joule)
– In general: multiply by bigger-is-better metrics, divide by smaller-is-better metrics
• Examples (smaller is better)
– Cycles/Instruction (i.e., time per work)
– Latency × Energy ("Energy-Delay Product")
– In general: multiply by smaller-is-better metrics, divide by bigger-is-better metrics
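
As an illustration of a smaller-is-better derived metric, the sketch below computes the energy-delay product for two designs. The designs and their latency and energy figures are hypothetical, invented only to show the calculation.

```python
# Hypothetical designs running the same task (numbers are made up).
designs = {
    "design_1": {"latency_s": 2.0, "energy_j": 10.0},  # slower, frugal
    "design_2": {"latency_s": 1.0, "energy_j": 30.0},  # faster, power-hungry
}

def energy_delay_product(latency_s: float, energy_j: float) -> float:
    """Product of two smaller-is-better metrics, so smaller is better."""
    return latency_s * energy_j

edp = {
    name: energy_delay_product(d["latency_s"], d["energy_j"])
    for name, d in designs.items()
}
best = min(edp, key=edp.get)

for name, value in edp.items():
    print(f"{name}: EDP = {value} J*s")
print(f"best by EDP: {best}")
```

In this made-up case the faster design loses on the combined metric, because its energy cost grows more than its latency shrinks.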

Example: Energy-Delay


Example: Energy-Delay


What’s the Right Metric?


Measuring Execution Time

• Elapsed time
– Total response time, including all aspects: processing, I/O, OS overhead, idle time
– Determines system performance
• CPU time
– Time spent processing a given job
– Discounts I/O time, other jobs’ shares
– Comprises user CPU time and system CPU time
– Different programs are affected differently by CPU and system performance
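
The difference between elapsed time and CPU time can be observed directly. The sketch below uses Python's standard `time.perf_counter` (wall-clock time) and `time.process_time` (user plus system CPU time of the current process); the job itself is a hypothetical stand-in in which a sleep plays the role of I/O or idle time.

```python
import time

def measure(job):
    """Return (elapsed seconds, CPU seconds) for one call of job()."""
    start_wall = time.perf_counter()
    start_cpu = time.process_time()
    job()
    cpu_s = time.process_time() - start_cpu
    elapsed_s = time.perf_counter() - start_wall
    return elapsed_s, cpu_s

def job_with_idle_time():
    sum(i * i for i in range(100_000))  # processing: counts as CPU time
    time.sleep(0.05)                    # waiting: counts only as elapsed time

elapsed_s, cpu_s = measure(job_with_idle_time)
print(f"elapsed time: {elapsed_s:.4f} s")
print(f"CPU time:     {cpu_s:.4f} s")
```

The exact figures vary from run to run and machine to machine, but the elapsed time includes the wait while the CPU time does not.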


CPU Clocking
• Operation of digital hardware governed by a constant-rate clock

[Figure: clock timing diagram showing the clock period, clock cycles, data transfer and computation, and state update]

• Clock period: duration of a clock cycle
– e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s
• Clock frequency (rate): cycles per second
– e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz
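
Clock period and clock rate are reciprocals, so the two examples on this slide describe the same clock. A minimal sketch of the conversion, with hypothetical function names:

```python
def clock_rate_hz(period_s: float) -> float:
    """Clock rate is the reciprocal of the clock period."""
    return 1.0 / period_s

def clock_period_s(rate_hz: float) -> float:
    """Clock period is the reciprocal of the clock rate."""
    return 1.0 / rate_hz

period_s = 250e-12               # 250 ps
rate_hz = clock_rate_hz(period_s)

print(f"{period_s * 1e12:.0f} ps period -> {rate_hz / 1e9:.1f} GHz")
print(f"{rate_hz / 1e9:.1f} GHz -> {clock_period_s(rate_hz) * 1e12:.0f} ps period")
```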


CPU Time

\[
\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}
\]

• Performance improved by
– Reducing number of clock cycles (cycle count)
– Increasing clock rate
• Hardware designer must often trade off clock rate against cycle count

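CPU Time Example

• Computer A: 2GHz clock, 10s CPU time
• Designing Computer B
– Aim for 6s CPU time
– Can do faster clock, but causes 1.2 × clock cycles
• How fast must Computer B clock be?

\[
\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}
\]
\[
\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}
\]
\[
\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}
\]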
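
The same calculation, written out step by step as a short sketch. The variable names are assumptions; the numbers are those of the example.

```python
# Computer A: known clock rate and CPU time.
clock_rate_a_hz = 2e9   # 2 GHz
cpu_time_a_s = 10.0

# Clock Cycles = CPU Time x Clock Rate
clock_cycles_a = cpu_time_a_s * clock_rate_a_hz

# Computer B: needs 1.2x as many cycles and must finish in 6 s.
clock_cycles_b = 1.2 * clock_cycles_a
cpu_time_b_s = 6.0

# Clock Rate = Clock Cycles / CPU Time
clock_rate_b_hz = clock_cycles_b / cpu_time_b_s

print(f"Clock cycles on A: {clock_cycles_a:.3g}")
print(f"Required clock rate for B: {clock_rate_b_hz / 1e9:.1f} GHz")
```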


Instruction Count and CPI

\[
\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}
\]
\[
\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}
\]

• Instruction Count for a program
– Determined by program, ISA and compiler
• Average cycles per instruction (CPI)
– Determined by CPU hardware (e.g., multi-issue)
– If different instructions have different CPI, the average CPI is affected by the instruction mix

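CPI Example
• Computer A: Cycle Time = 250ps, CPI = 2.0
• Computer B: Cycle Time = 500ps, CPI = 1.2
• Same ISA
• Which is faster, and by how much?

\[
\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps} \quad \text{(A is faster\ldots)}
\]
\[
\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}
\]
\[
\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2 \quad \text{(\ldots by this much)}
\]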
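
Because both computers run the same ISA, the instruction count I is identical and cancels in the ratio, so it is enough to compare time per instruction. A brief sketch, with a hypothetical helper name:

```python
def time_per_instruction_ps(cpi: float, cycle_time_ps: float) -> float:
    """CPU time divided by instruction count: CPI x cycle time."""
    return cpi * cycle_time_ps

time_a_ps = time_per_instruction_ps(cpi=2.0, cycle_time_ps=250.0)
time_b_ps = time_per_instruction_ps(cpi=1.2, cycle_time_ps=500.0)

# Same instruction count on both machines, so it cancels in the ratio.
ratio = time_b_ps / time_a_ps

print(f"A: {time_a_ps:.0f} ps per instruction")
print(f"B: {time_b_ps:.0f} ps per instruction")
print(f"A is {ratio:.1f} times faster than B")
```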


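CPI in More Detail

• If different instruction classes take different numbers of cycles

\[
\text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right)
\]

• Weighted average CPI

\[
\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)
\]

where the ratio Instruction Count_i / Instruction Count is the relative frequency of class i.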

CPI Example
• Alternative compiled code sequences using instructions in classes A, B, C

Class              A   B   C
CPI for class      1   2   3
IC in sequence 1   2   1   2
IC in sequence 2   4   1   1

• Sequence 1: IC = 5
– Clock Cycles = 2×1 + 1×2 + 2×3 = 10
– Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
– Clock Cycles = 4×1 + 1×2 + 1×3 = 9
– Avg. CPI = 9/6 = 1.5
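
The weighted-average calculation for the two code sequences can be sketched as follows. The data comes from the table; the function and variable names are hypothetical.

```python
# Cycles per instruction for each instruction class.
cpi_by_class = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class for the two compiled sequences.
sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

def clock_cycles(ic_by_class):
    """Sum over classes of CPI_i x Instruction Count_i."""
    return sum(cpi_by_class[cls] * count for cls, count in ic_by_class.items())

def average_cpi(ic_by_class):
    """Total clock cycles divided by total instruction count."""
    return clock_cycles(ic_by_class) / sum(ic_by_class.values())

for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    print(f"{name}: IC={sum(seq.values())}, "
          f"cycles={clock_cycles(seq)}, avg CPI={average_cpi(seq)}")
```

Sequence 2 executes more instructions but needs fewer clock cycles, so instruction count alone does not decide which sequence is faster.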


CPI Example


Measuring CPI


The Iron Law of Performance

The Iron Law of Performance


Reducing Cycle Time


Example 1: Reduce the IC


Other Impacts on Instruction Count


Cycles Per Instruction


Example: Reducing CPI


Example: Reducing CPI


Performance Summary
• Performance depends on
– Algorithm: affects IC, possibly CPI
  – More multiplications, higher CPI
– Programming language: affects IC, CPI
  – Statements in the language are translated into processor instructions
– Compiler: affects IC, CPI
  – Compiler selects the instructions
  – Use registers to eliminate loads and stores
– Instruction set architecture: affects IC, CPI, clock rate

Pitfall: Amdahl’s Law


• Qualifies performance gain

• Amdahl’s Law defined…


– The performance improvement to be gained from using
some faster mode of execution is limited by the amount of
time the enhancement is actually used.

• Amdahl’s Law defines speedup:

\[
\text{Speedup} = \frac{\text{Execution time for entire task without enhancement}}{\text{Execution time for entire task using enhancement when possible}}
\]
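
The speedup definition above can be turned into a small calculation. The sketch below assumes the task splits into a part the enhancement affects and a part it does not; the function name and the 100 s / 80 s figures are hypothetical, chosen only for illustration.

```python
def amdahl_speedup(total_time_s: float, affected_time_s: float,
                   improvement_factor: float) -> float:
    """Speedup = time without enhancement / time with enhancement.

    Only the affected part of the task gets faster; the rest is unchanged.
    """
    unaffected_time_s = total_time_s - affected_time_s
    enhanced_time_s = unaffected_time_s + affected_time_s / improvement_factor
    return total_time_s / enhanced_time_s

# Hypothetical task: 100 s in total, of which 80 s can use the enhancement.
for factor in (2, 5, 10, 100):
    print(f"{factor:>3}x faster on the affected part -> "
          f"overall speedup {amdahl_speedup(100.0, 80.0, factor):.2f}")
```

However large the improvement factor becomes, the overall speedup in this example stays below 100 / 20 = 5, because the 20 s that the enhancement never touches remain.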


Amdahl’s Law in Action


Amdahl’s Law in Action


Amdahl’s Law




Amdahl’s Law


Amdahl’s Law


Amdahl’s Law


Amdahl’s Corollary
