Defining Performance: Which Airplane Has The Best Performance?
Defining Performance
Which airplane has the best performance?
[Figure: two bar charts comparing the BAC/Sud Concorde and the Douglas DC-8-50; one axis runs from 0 to 500, the other from 0 to 10000; axis labels not recoverable]
Basic Metrics
– Machine 1:
• a lower latency for a single operation...
– Machine 2:
• better throughput for multiple operations
– What’s better?
• depends on what kind of computation you need to do
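A minimal sketch of why the answer depends on the workload. The two machines and all of their numbers are hypothetical and do not come from the slides: machine 1 finishes a single operation quickly, while machine 2 is slower per operation but works on several at once. The batch model is deliberately simple.

```python
import math


def time_to_finish(num_ops, latency_s, ops_in_flight):
    """Seconds to complete num_ops when ops_in_flight operations run
    concurrently and each takes latency_s seconds (simple batch model)."""
    batches = math.ceil(num_ops / ops_in_flight)
    return batches * latency_s


# Hypothetical machines (illustrative numbers only).
machine1 = {"latency_s": 1.0, "ops_in_flight": 1}  # lower latency
machine2 = {"latency_s": 2.0, "ops_in_flight": 8}  # better throughput

for n in (1, 100):
    t1 = time_to_finish(n, **machine1)
    t2 = time_to_finish(n, **machine2)
    winner = "Machine 1" if t1 < t2 else "Machine 2"
    print(f"{n} operation(s): machine 1 = {t1} s, machine 2 = {t2} s -> {winner}")
```

Under these assumptions machine 1 wins for a single operation and machine 2 wins for a batch of 100.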
Ratios of Measurements
Relative Performance
Define Performance = 1/Execution Time
“X is n times faster than Y” means:
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
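The definition of relative performance can be sketched in a few lines. The function name `speedup` and the two execution times are illustrative assumptions, not values from the slides.

```python
def speedup(time_x, time_y):
    """Return n such that X is n times faster than Y.

    With Performance = 1 / Execution Time, the ratio
    Performance_X / Performance_Y simplifies to time_y / time_x.
    """
    return time_y / time_x


# Hypothetical measurements: the same program takes 10 s on X and 15 s on Y.
n = speedup(time_x=10.0, time_y=15.0)
print(f"X is {n} times faster than Y")
```

Note that the slower machine's time goes in the numerator, so a value above 1 means X is the faster machine.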
Example: Speedup
For Example…
Derived metrics
Often we care about multiple metrics at once.
Examples (Bigger is better)
Bandwidth per dollar (e.g., in networking (GB/s)/$)
BW/Watt (e.g., in memory systems (GB/s)/W)
Work/Joule (e.g., instructions/joule)
In general: Multiply by big-is-better metrics, divide by
smaller-is-better
Examples (Smaller is better)
Cycles/Instruction (i.e., Time per work)
Latency × Energy (the “Energy-Delay Product”)
In general: Multiply by smaller-is-better metrics, divide
by bigger-is-better
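A short sketch of how derived metrics are built from basic ones. The two designs and their latency and energy figures are hypothetical, chosen only to show that a derived metric can rank designs differently than either basic metric alone.

```python
def energy_delay_product(latency_s, energy_j):
    """Smaller is better: the product of two smaller-is-better metrics."""
    return latency_s * energy_j


def work_per_joule(instructions, energy_j):
    """Bigger is better: a bigger-is-better metric divided by a
    smaller-is-better one."""
    return instructions / energy_j


# Hypothetical designs running the same program (illustrative numbers only).
design_a = {"latency_s": 2.0, "energy_j": 10.0}  # faster, but power hungry
design_b = {"latency_s": 4.0, "energy_j": 4.0}   # slower, but frugal

edp_a = energy_delay_product(**design_a)
edp_b = energy_delay_product(**design_b)
print(f"EDP A = {edp_a} J*s, EDP B = {edp_b} J*s")
```

With these numbers design A has the lower latency, yet design B has the lower energy-delay product.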
Example: Energy-Delay
CPU Clocking
Operation of digital hardware governed by a
constant-rate clock
[Figure: clock timing diagram showing the clock period, clock cycles, data transfer and computation within each cycle, and the state update at the end of each cycle]
CPU Time
$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$
Performance improved by
Reducing number of clock cycles (cycle count)
Increasing clock rate
Hardware designer must often trade off clock
rate against cycle count
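The trade-off between clock rate and cycle count can be sketched as follows. The function name and every number here are hypothetical: a baseline design, and a redesign that raises the clock rate but needs more cycles for the same program.

```python
def cpu_time(clock_cycles, clock_rate_hz):
    """CPU Time = CPU Clock Cycles / Clock Rate, in seconds."""
    return clock_cycles / clock_rate_hz


# Hypothetical baseline: 20e9 cycles at 2 GHz.
baseline = cpu_time(clock_cycles=20e9, clock_rate_hz=2e9)

# Hypothetical redesign: a 4 GHz clock, but 1.2x as many cycles.
redesign = cpu_time(clock_cycles=1.2 * 20e9, clock_rate_hz=4e9)

print(f"baseline: {baseline} s, redesign: {redesign} s")
```

In this sketch the faster clock still wins because the clock rate doubled while the cycle count grew by only 20%; a larger cycle penalty would reverse the result.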
CPI Example
Computer A: Cycle Time = 250ps, CPI = 2.0
Computer B: Cycle Time = 500ps, CPI = 1.2
Same ISA
Which is faster, and by how much?
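$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$
A is faster…
$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$
…by this much:
$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$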
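The comparison of Computer A and Computer B can be checked with a small sketch. The cycle times and CPI values are the ones given in the example; the function name is an illustrative choice. Because both computers implement the same ISA, the instruction count I is the same for both and cancels in the ratio, so the code works in time per instruction.

```python
def time_per_instruction_ps(cpi, cycle_time_ps):
    """CPU time divided by instruction count: CPI x Cycle Time, in ps."""
    return cpi * cycle_time_ps


time_a = time_per_instruction_ps(cpi=2.0, cycle_time_ps=250)  # Computer A
time_b = time_per_instruction_ps(cpi=1.2, cycle_time_ps=500)  # Computer B

# Ratio of CPU times; the shared instruction count I cancels out.
ratio = time_b / time_a
print(f"A: {time_a:.0f} ps/instr, B: {time_b:.0f} ps/instr, B/A = {ratio:.1f}")
```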
CPI Example
Alternative compiled code sequences using
instructions in classes A, B, C
Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1

Sequence 1: IC = 5
Clock Cycles = 2×1 + 1×2 + 2×3 = 10
Avg. CPI = 10/5 = 2.0

Sequence 2: IC = 6
Clock Cycles = 4×1 + 1×2 + 1×3 = 9
Avg. CPI = 9/6 = 1.5
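The same calculation as a sketch in code, using the per-class CPI values and instruction counts from the example. The dictionary layout and function names are illustrative choices.

```python
# CPI for each instruction class, from the example.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(ic_by_class):
    """Total cycles: sum over classes of (CPI for class x instruction count)."""
    return sum(CPI_BY_CLASS[cls] * count for cls, count in ic_by_class.items())


def average_cpi(ic_by_class):
    """Average CPI: total clock cycles divided by total instruction count."""
    return clock_cycles(ic_by_class) / sum(ic_by_class.values())


sequence1 = {"A": 2, "B": 1, "C": 2}
sequence2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("Sequence 1", sequence1), ("Sequence 2", sequence2)):
    print(name, "cycles =", clock_cycles(seq), "avg CPI =", average_cpi(seq))
```

Sequence 2 executes more instructions but needs fewer clock cycles, so instruction count alone does not determine which sequence is faster.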
CPI Example
Measuring CPI
Performance Summary
Performance depends on
Algorithm: affects IC, possibly CPI
More multiplications, higher CPI
Programming language: affects IC, CPI
Statements in the language are translated into
processor instructions
Compiler: affects IC, CPI
Compiler selects the instructions
Use registers to eliminate loads and stores
Instruction set architecture: affects IC, CPI,
clock rate
Amdahl’s Law
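As a reminder, the standard formulation of Amdahl's Law says that if a fraction f of the original execution time is sped up by a factor s, the overall speedup is 1 / ((1 - f) + f / s). The sketch below uses that standard form; the parameter names and the example values are assumptions made here for illustration.

```python
def amdahl_speedup(fraction_enhanced, enhancement_speedup):
    """Overall speedup when fraction_enhanced of the original execution
    time is accelerated by enhancement_speedup (standard Amdahl's Law)."""
    unaffected = 1.0 - fraction_enhanced
    return 1.0 / (unaffected + fraction_enhanced / enhancement_speedup)


# Hypothetical case: half of the run time is made 2x faster.
print(amdahl_speedup(fraction_enhanced=0.5, enhancement_speedup=2.0))

# Even an enormous speedup of that half cannot push the total past 1 / (1 - 0.5).
print(amdahl_speedup(fraction_enhanced=0.5, enhancement_speedup=1e12))
```

The second call illustrates the usual consequence: the part of the program that is not improved bounds the achievable overall speedup.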
Amdahl’s Corollary