Chapter 2
Performance Concepts
components
• Architectural examples include:
– Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory
• Power
– Power density increases with density of logic and clock speed
– Dissipating heat
• RC delay
– Speed at which electrons flow between transistors is limited by the resistance and capacitance of the metal wires connecting them
– Delay increases as the RC product increases
– As components on the chip decrease in size, the wire interconnects become
thinner, increasing resistance
– Also, the wires are closer together, increasing capacitance
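The RC delay bullets above can be illustrated with a rough numerical sketch. It assumes a uniform rectangular wire (resistance = resistivity × length / cross-section) and a parallel-plate approximation for the capacitance between two neighbouring wires. The function names, material constants, and dimensions are illustrative assumptions, not values from the slides.

```python
# Rough sketch of how the RC product changes when wires get thinner and
# closer together. All constants and dimensions are illustrative assumptions.

RHO_COPPER = 1.68e-8   # resistivity of copper, ohm * metre
EPS_0 = 8.854e-12      # vacuum permittivity, farad / metre
K_DIELECTRIC = 3.9     # assumed relative permittivity of the insulator


def wire_resistance(length, width, thickness):
    """Resistance of a uniform rectangular wire: rho * L / (w * t)."""
    return RHO_COPPER * length / (width * thickness)


def coupling_capacitance(length, thickness, spacing):
    """Parallel-plate estimate of capacitance between two adjacent wires."""
    return K_DIELECTRIC * EPS_0 * length * thickness / spacing


def rc_product(length, width, thickness, spacing):
    """RC product for one wire next to one neighbour, in seconds."""
    return (wire_resistance(length, width, thickness)
            * coupling_capacitance(length, thickness, spacing))


# Same 1 mm wire, before and after halving both the width and the spacing.
old = rc_product(length=1e-3, width=100e-9, thickness=200e-9, spacing=100e-9)
new = rc_product(length=1e-3, width=50e-9, thickness=200e-9, spacing=50e-9)

print(f"RC product before: {old:.3e} s")
print(f"RC product after:  {new:.3e} s")
print(f"growth factor:     {new / old:.1f}")
```

Under these simplifying assumptions, halving the width doubles the resistance and halving the spacing doubles the capacitance, so the RC product grows by a factor of four for a wire of the same length.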
Multicore
The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
Ic p m k
Compiler technology X X X
Processor implementation X X
The use of benchmarks to compare systems involves calculating the mean value of a set of data points related to execution time
The three common formulas used for calculating a mean are:
• Arithmetic
• Geometric
• Harmonic
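As a minimal sketch, the three means can be written using their standard textbook definitions. The function names and the sample values are illustrative and do not come from the slides.

```python
import math


def arithmetic_mean(xs):
    """Sum of the values divided by their count."""
    return sum(xs) / len(xs)


def geometric_mean(xs):
    """n-th root of the product, computed via logarithms (values must be > 0)."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    """Count divided by the sum of reciprocals (values must be non-zero)."""
    return len(xs) / sum(1.0 / x for x in xs)


sample = [1.0, 2.0, 4.0]  # arbitrary positive data points
am = arithmetic_mean(sample)
gm = geometric_mean(sample)
hm = harmonic_mean(sample)

print(f"AM = {am:.3f}, GM = {gm:.3f}, HM = {hm:.3f}")
```

For any set of positive values the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean. The three are equal only when all values are equal.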
Copyright © 2022 Pearson Education, Ltd. All Rights Reserved
Figure 2.6
The AM used for a time-based variable, such as program execution time, has the
important property that it is directly proportional to the total time
If the total time doubles, the mean value doubles
Table 2.2
A Comparison of Arithmetic and Harmonic Means for Rates
                                          Computer A  Computer B  Computer C  Computer A  Computer B  Computer C
                                          time        time        time        rate        rate        rate
                                          (secs)      (secs)      (secs)      (MFLOPS)    (MFLOPS)    (MFLOPS)
Program 1 (10^8 FP ops)                   2.0         1.0         0.75        50          100         133.33
Program 2 (10^8 FP ops)                   0.75        2.0         4.0         133.33      50          25
Total execution time                      2.75        3.0         4.75        –           –           –
Arithmetic mean of times                  1.38        1.5         2.38        –           –           –
Inverse of total execution time (1/sec)   0.36        0.33        0.21        –           –           –
Arithmetic mean of rates                  –           –           –           91.67       75.00       79.17
Harmonic mean of rates                    –           –           –           72.72       66.67       42.11
• SPEC
– An industry consortium
– Defines and maintains the best known collection of benchmark suites
aimed at evaluating computer systems
– Performance measurements are widely used for comparison and
research purposes
Kloc = line count (including comments/whitespace) for source files used in a build/1000 (Table can be found on page 61 in the textbook.)
Rate Speed Language Kloc Application Area
SPEC CPU 2017 Integer Benchmarks for HP Integrity Superdome X
(a) Rate Result (768 copies)
                     Base            Peak
541.leela_r          892    1410     896    1420
548.exchange2_r      833    2420     770    2610
(b) Speed Result (384 threads)
                     Base            Peak
602.gcc_s            546    7.29     535    7.45
605.mcf_s            866    5.45     700    6.75
620.omnetpp_s        276    5.90     247    6.61
623.xalancbmk_s      188    7.52     179    7.91
625.x264_s           283    6.23     271    6.51
631.deepsjeng_s      407    3.52     343    4.18
641.leela_s          469    3.63     439    3.88
648.exchange2_s      329    8.93     299    9.82
Table 2.7
SPECspeed 2017_int_base Benchmark Results for Reference Machine (1 thread)
605.mcf_s            4721   5150   1090   1120
620.omnetpp_s        1630   1770   1090   1090
623.xalancbmk_s      1417   1540   1090   1090
625.x264_s           1764   1920   1090   1100
631.deepsjeng_s      1432   1560   1090   1130
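The numbers in the two result listings above appear to be related by the usual SPECspeed ratio, which is the reference machine's run time divided by the measured run time. The sketch below rests on several assumptions that this excerpt does not state: that the first value in each reference-machine row is a run time in seconds, that the first two Base values in each speed row are seconds and ratio, and that rows pair with benchmark names in order of appearance. The dictionary names are illustrative.

```python
# Hedged sketch: recompute speed ratios as reference time / measured time.
# The column meanings and the row pairing are assumptions (see lead-in).

reference_seconds = {        # first value of each reference-machine row
    "605.mcf_s": 4721,
    "620.omnetpp_s": 1630,
    "623.xalancbmk_s": 1417,
    "625.x264_s": 1764,
    "631.deepsjeng_s": 1432,
}

base_seconds = {             # first Base value of each speed-result row
    "605.mcf_s": 866,
    "620.omnetpp_s": 276,
    "623.xalancbmk_s": 188,
    "625.x264_s": 283,
    "631.deepsjeng_s": 407,
}

listed_ratio = {             # second Base value of each speed-result row
    "605.mcf_s": 5.45,
    "620.omnetpp_s": 5.90,
    "623.xalancbmk_s": 7.52,
    "625.x264_s": 6.23,
    "631.deepsjeng_s": 3.52,
}

computed_ratio = {
    name: reference_seconds[name] / base_seconds[name]
    for name in reference_seconds
}

for name, ratio in computed_ratio.items():
    print(f"{name:18s} computed {ratio:.2f}   listed {listed_ratio[name]:.2f}")
```

Under these assumptions each recomputed ratio agrees with the listed value to within about 0.02. Differences of that size are expected if the listed times were rounded to whole seconds.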