Performance Evaluation in Database Research
Manegold (CWI)
Performance evaluation
Disclaimer
There is no single way to do it right.
There are many ways to do it wrong.
This is not a mandatory script.
This is rather a collection of anecdotes or fairy tales: not
always to be taken literally, but all provide some general
rules or guidelines on what (not) to do.
Presentation
Repeatability
Summary
How to compare?
CSI: How to find out what is going on?
Micro-benchmarks
Standard benchmarks
Real-life applications
Micro-benchmarks
Definition
Specialized, stand-alone piece of software
Isolating one particular piece of a larger system
E.g., single DB operator (select, join, aggregation, etc.)
Micro-benchmarks
Pros
Focused on problem at hand
Controllable workload and data characteristics
Data sets (synthetic & real)
Data size / volume (scalability)
Value ranges and distribution
Correlation
Queries
Workload size (scalability)
Micro-benchmarks
Cons
Neglect larger picture
Neglect contribution of local costs to global/total costs
Neglect impact of micro-benchmark on real-life applications
Neglect embedding in context/system at large
Generalization of result difficult
Application of insights in full systems / real-life applications not obvious
Metrics not standardized
Comparison?
Standard benchmarks
Examples
RDBMS, OODBMS, ORDBMS:
TPC-{A,B,C,H,R,DS}, OO7, ...
XML, XPath, XQuery, XUF, SQL/XML:
MBench, XBench, XMach-1, XMark, XOO7, TPoX, ...
Stream Processing:
Linear Road, ...
General Computing:
SPEC, ...
...
Standard benchmarks
Pros
Mimic real-life scenarios
Publicly available
Well defined (in theory ...)
Scalable data sets and workloads (if well designed ...)
Metrics well defined (if well designed ...)
Easily comparable (?)
Standard benchmarks
Cons
Often outdated (standardization takes (too?) long)
Often compromises
Often very large and complicated to run
Limited dataset variation
Limited workload variation
Systems are often optimized for the benchmark(s), only!
Real-life applications
Pros
There are so many of them
Existing problems and challenges
Real-life applications
Cons
There are so many of them
Proprietary datasets and workloads
Analysis: CSI
Investigate (all?) details
Analyze and understand behavior and characteristics
Find out where the time goes and why!
Publication
Sell your story
Describe picture at large
Highlight (some) important / interesting details
Compare to others
Basic
Throughput: queries per time
Evaluation time
wall-clock (real)
CPU (user)
I/O (system)
Server-side vs. client-side
Comparison
Scale-up
Speed-up
Analysis
System events & interrupts
Hardware events
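The basic and comparison metrics listed above reduce to simple ratios. The following is a minimal sketch, assuming the usual definitions (speed-up: same workload on a larger system; scale-up: N-times larger workload on an N-times larger system). The class name, method names and the numbers in main() are illustrative assumptions, not measurements from these slides.

```java
// Minimal sketch of the basic comparison metrics.
// All names and the sample numbers in main() are illustrative assumptions.
class Metrics {
    /** Throughput: completed queries per second of elapsed time. */
    static double throughput(long queries, double elapsedSeconds) {
        return queries / elapsedSeconds;
    }

    /** Speed-up: same workload, time on the baseline system over time on the larger system. */
    static double speedUp(double baselineSeconds, double largerSystemSeconds) {
        return baselineSeconds / largerSystemSeconds;
    }

    /**
     * Scale-up: time for the small problem on the small system over time for
     * the N-times larger problem on the N-times larger system (1.0 is ideal).
     */
    static double scaleUp(double smallOnSmallSeconds, double largeOnLargeSeconds) {
        return smallOnSmallSeconds / largeOnLargeSeconds;
    }

    public static void main(String[] args) {
        System.out.println("throughput = " + throughput(1000, 20.0) + " queries/s");
        System.out.println("speed-up   = " + speedUp(120.0, 40.0));
        System.out.println("scale-up   = " + scaleUp(100.0, 125.0));
    }
}
```

Whichever ratio is reported, it helps to state which of the evaluation times (real, user, system; server-side or client-side) went into it.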
Q    server user   server real   client real   client real   result size
     (3rd run)     (3rd run)     (3rd run)     (4th run)
1    2830          3533          3534          3575          1.3 KB
16   550           618           707           1468          1.2 MB
output went to ...
     file          file          file          terminal
... time (milliseconds)
gettimeofday()
System function; using it requires access to the source code
Reports timestamp (microseconds)
QueryPerformanceCounter() /
QueryPerformanceFrequency()
System functions; using them requires access to the source code
Report a tick count and the counter frequency (ticks per second)
Resolution can be as fine as 1 microsecond
cf., http://support.microsoft.com/kb/172338
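The same three quantities (real, user, system) can also be collected from inside a Java program. This is a minimal sketch, assuming the code under test runs on the calling thread. The class name Stopwatch is made up for this example; System.nanoTime() and ThreadMXBean are standard JDK facilities.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class Stopwatch {
    /**
     * Runs the task once on the calling thread and returns
     * {real, user, system} times in nanoseconds.
     * user and system are -1 if the JVM cannot report thread CPU time.
     */
    static long[] time(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = bean.isCurrentThreadCpuTimeSupported();

        long cpu0 = cpuSupported ? bean.getCurrentThreadCpuTime() : 0;
        long user0 = cpuSupported ? bean.getCurrentThreadUserTime() : 0;
        long real0 = System.nanoTime();

        task.run();

        long real1 = System.nanoTime();
        long user1 = cpuSupported ? bean.getCurrentThreadUserTime() : 0;
        long cpu1 = cpuSupported ? bean.getCurrentThreadCpuTime() : 0;

        long real = real1 - real0;
        if (!cpuSupported) return new long[] {real, -1, -1};
        long user = user1 - user0;
        long system = (cpu1 - cpu0) - user; // total CPU time minus user-mode time
        return new long[] {real, user, system};
    }

    public static void main(String[] args) {
        long[] sink = new long[1]; // keeps the loop from being optimized away
        long[] t = time(() -> {
            for (int i = 0; i < 50_000_000; i++) sink[0] += i % 7;
        });
        System.out.printf("real %d ms, user %d ms, system %d ms (checksum %d)%n",
                t[0] / 1_000_000, t[1] / 1_000_000, t[2] / 1_000_000, sink[0]);
    }
}
```

Only the calling thread's CPU time is counted; work done by other threads or by a separate server process shows up in the real time only.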
Microsoft SQL Server
GUI and system variables
PostgreSQL
postgresql.conf
log_statement_stats = on
log_min_duration_statement = 0
log_duration = on
MonetDB/XQuery & MonetDB/SQL
mclient -lxquery -t
mclient -lsql -t
(PROFILE|TRACE) select ...
Q    cold user   cold real   hot user   hot real
1    2930        13243       2830       3534
time (milliseconds)
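One way to keep first and later runs apart is to time every run separately and report them individually. This is a minimal sketch under strong assumptions: it only repeats a task inside one process, so the first run is at best an approximation of a cold run. A truly cold run needs a freshly started system with emptied caches, which this code does not do. The class name and the stand-in workload are hypothetical.

```java
class RepeatedRuns {
    /** Runs the task 'runs' times back to back; returns the wall-clock time of each run in ms. */
    static double[] measure(Runnable task, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1e6;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption): replace with the query under test.
        long[] sink = new long[1];
        Runnable workload = () -> {
            for (int i = 0; i < 20_000_000; i++) sink[0] += i % 3;
        };
        double[] t = measure(workload, 4);
        for (int i = 0; i < t.length; i++)
            System.out.printf("run %d: %.1f ms%n", i + 1, t[i]);
        System.out.println("checksum " + sink[0]);
    }
}
```

Reporting the runs one by one, rather than a single average, keeps the difference between the first run and the later ones visible.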
[Figure: plot over TPC-H queries (x-axis ticks 10 to 22)]
[Figure: bars with Memory and CPU components, one bar per system; y-axis ticks 50 to 200]
year   system       CPU type       CPU speed
1992   Sun LX       Sparc          50 MHz
1996   Sun Ultra    UltraSparc     200 MHz
1997   SunUltra     UltraSparcII   296 MHz
1998   DEC Alpha    Alpha          500 MHz
2000   Origin2000   R12000         300 MHz
Microsoft SQL Server
GUI and system variables
MySQL, PostgreSQL
EXPLAIN select ...
MonetDB/SQL
(PLAN|EXPLAIN|TRACE) select ...
System monitors
ps, top, iostat, ...
MonetDB/MIL trace
Presentation
Guidelines
Mistakes
Repeatability
Summary
[Figure: three panels of response time against number of users, with curves labelled A, B, C]
[Figure: two panels, Availability and Unavailability, each plotted against day of the week]
Reading material
[Figure: one chart with three y-axes (response time, utilization, throughput) plotted against number of users]
Huh?
[Figure: response time against arrival rate (1 job/sec, 2 jobs/sec, 3 jobs/sec); curves labelled =1, =2, =3]
Pictorial games
[Figure: the same MINE vs. YOURS comparison drawn in two panels; axis labels 2600 and 2610 in one panel, 2600 and 5200 in the other]
A-ha
Plot random quantities without confidence intervals
[Figure: two panels comparing MINE and YOURS]
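A confidence interval for a measured mean takes only a few lines to compute. This is a minimal sketch, assuming independent measurements and a normal approximation (factor 1.96 for roughly 95%); for a handful of runs a Student-t quantile would give a wider interval. The class name and the sample values are hypothetical.

```java
class ConfidenceInterval {
    /** Returns {mean, halfWidth} of an approximate 95% confidence interval for the mean. */
    static double[] meanWithHalfWidth(double[] samples) {
        int n = samples.length;
        if (n < 2) throw new IllegalArgumentException("need at least two measurements");
        double sum = 0;
        for (double x : samples) sum += x;
        double mean = sum / n;
        double squares = 0;
        for (double x : samples) squares += (x - mean) * (x - mean);
        double stdDev = Math.sqrt(squares / (n - 1)); // sample standard deviation
        // 1.96 is the normal quantile; an assumption that is optimistic for small n.
        double halfWidth = 1.96 * stdDev / Math.sqrt(n);
        return new double[] {mean, halfWidth};
    }

    /** True if the two intervals overlap. */
    static boolean overlap(double[] a, double[] b) {
        return Math.abs(a[0] - b[0]) <= a[1] + b[1];
    }

    public static void main(String[] args) {
        // Hypothetical response times, for illustration only.
        double[] mine = meanWithHalfWidth(new double[] {10, 12, 14});
        double[] yours = meanWithHalfWidth(new double[] {11, 13, 15});
        System.out.printf("MINE  %.2f +/- %.2f%n", mine[0], mine[1]);
        System.out.printf("YOURS %.2f +/- %.2f%n", yours[0], yours[1]);
        System.out.println("intervals overlap: " + overlap(mine, yours));
    }
}
```

When the intervals overlap, a plot of the two means alone does not show that one system is faster than the other.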
[Figure: two histograms of frequency against response time; one with cells [0,2), [2,4), [4,6), [6,8), [8,10), [10,12) and frequency ticks 2 to 12, the other with cells [0,6), [6,12) and frequency ticks 3 to 18]
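How much the cell size changes the picture is easy to try out. This is a minimal sketch that counts the same values into cells of width 2 and of width 6, matching the cell boundaries in the figure; the response times themselves are hypothetical, as is the class name.

```java
import java.util.Arrays;

class Histogram {
    /** Counts values into cells [lo, lo+width), [lo+width, lo+2*width), ... below hi. */
    static int[] cells(double[] values, double lo, double hi, double width) {
        int n = (int) Math.ceil((hi - lo) / width);
        int[] counts = new int[n];
        for (double v : values) {
            if (v < lo || v >= hi) continue; // ignore values outside the plotted range
            counts[(int) ((v - lo) / width)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical response times, for illustration only.
        double[] responseTimes = {1, 1, 3, 5, 7, 7, 7, 9, 11};
        System.out.println(Arrays.toString(cells(responseTimes, 0, 12, 2))); // [2, 1, 1, 3, 1, 1]
        System.out.println(Arrays.toString(cells(responseTimes, 0, 12, 6))); // [4, 5]
    }
}
```

With cells of width 2 the values peak in [6,8); with cells of width 6 the same values look almost evenly spread.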
[Figure: relative execution time DBG/OPT (y-axis, 1 to 2.2) per TPC-H query (x-axis, ticks 10 to 22), shown in two panels]
better:
set size ratio 0 0.5,0.5
3400x
cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 13
model name      : Intel(R) Pentium(R) M processor 1.50GHz    !
stepping        : 6
cpu MHz         : 600.000    = throttled down by speed stepping!
cache size      : 2048 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 mtrr pge mca cmov pat clflush
                  dts acpi mmx fxsr sse sse2 ss tm pbe up bts est tm2
bogomips        : 1196.56
clflush size    : 64
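The mismatch between the nominal clock in the model name and the reported cpu MHz can be checked automatically before each run. This is a minimal sketch, assuming a Linux /proc/cpuinfo in the layout shown above and a model name that ends in a "...GHz" figure (not all CPUs report one). The class name and the 90% threshold are assumptions made for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CpuInfo {
    /** Parses "key : value" lines; keeps the first occurrence of each key (first processor). */
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            String[] kv = line.split(":", 2);
            if (kv.length == 2) fields.putIfAbsent(kv[0].trim(), kv[1].trim());
        }
        return fields;
    }

    /** Nominal clock in MHz taken from a "... 1.50GHz" model name, or -1 if none is given. */
    static double nominalMHz(String modelName) {
        Matcher m = Pattern.compile("(\\d+(?:\\.\\d+)?)\\s*GHz").matcher(modelName);
        return m.find() ? Double.parseDouble(m.group(1)) * 1000 : -1;
    }

    /** True if the current clock is clearly below the nominal clock (the 90% threshold is an assumption). */
    static boolean looksThrottled(Map<String, String> fields) {
        double nominal = nominalMHz(fields.getOrDefault("model name", ""));
        String current = fields.get("cpu MHz");
        if (nominal < 0 || current == null) return false; // cannot tell
        return Double.parseDouble(current) < 0.9 * nominal;
    }

    public static void main(String[] args) throws Exception {
        Path p = Paths.get("/proc/cpuinfo"); // Linux only; falls back to the values from the slide
        String text = Files.exists(p)
                ? new String(Files.readAllBytes(p))
                : "model name : Intel(R) Pentium(R) M processor 1.50GHz\ncpu MHz : 600.000\n";
        Map<String, String> f = parse(text);
        System.out.println(f.get("model name") + " @ " + f.get("cpu MHz") + " MHz"
                + (looksThrottled(f) ? "  (throttled?)" : ""));
    }
}
```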
/sbin/lspci -v | wc
151 lines
861 words
6663 characters
Over-specified!
Presentation
Repeatability
Portable parameterizable experiments
Test suite
Documenting your experiment suite
Summary
Writing instructions
Abstract (Take 1)
We provide a new algorithm that consistently outperforms the state
of the art.
Abstract (Take 2)
We provide a new algorithm that on a Debian Linux machine with
4 GHz CPU, 60 GB disk, DMA, 2 GB main memory and our own
brand of system libraries consistently outperforms the state of the
art.
There are obvious, undisputed exceptions
This is
huge
Purpose: have a very simple means of obtaining a test for the values
f1 = v1, f2 = v2, ..., fk = vk
Many tricks. Very simple ones:
argc / argv: specific to each class's main
Configuration files
Java Properties pattern
+ command-line arguments
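The last combination (Properties plus command-line arguments) might look as follows. This is a minimal sketch, assuming arguments of the form name=value; the class name is made up, and the parameter names dataDir and doStore are used purely as examples.

```java
import java.util.Properties;

class ExperimentParameters {
    /** Built-in defaults, overridden by "name=value" command-line arguments. */
    static Properties fromArgs(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("dataDir", "./data");
        defaults.setProperty("doStore", "true");

        Properties params = new Properties(defaults); // lookups fall back to the defaults
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq <= 0)
                throw new IllegalArgumentException("expected name=value, got: " + arg);
            params.setProperty(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        Properties p = fromArgs(args);
        System.out.println("dataDir = " + p.getProperty("dataDir"));
        System.out.println("doStore = " + p.getProperty("doStore"));
    }
}
```

Running it as `java ExperimentParameters doStore=false` keeps the default dataDir and overrides only doStore, so one test for f1 = v1, ..., fk = vk is one command line.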
Configuration files
Omnipresent in large-scale software
Crucial if you hope for serious installations: see the GNU software install procedure
Decide on a specific relative directory, fix the syntax
Report meaningful error if the configuration file is not found
Pro: human-readable even without running code
Con: the values are read when the process is created
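Reading such a file and reporting a meaningful error when it is missing could look like this. This is a minimal sketch, assuming the name=value syntax of java.util.Properties; the class name and the location conf/experiment.properties are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

class ConfigFile {
    /** Loads "name=value" lines from the file; fails with a readable message if it is missing. */
    static Properties load(Path file) {
        if (!Files.isReadable(file))
            throw new IllegalArgumentException(
                    "configuration file not found or not readable: " + file.toAbsolutePath());
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException("cannot read " + file.toAbsolutePath(), e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical location, fixed relative to the working directory.
        Path file = Paths.get("conf", "experiment.properties");
        try {
            load(file).list(System.out);
        } catch (IllegalArgumentException e) {
            System.err.println("error: " + e.getMessage());
        }
    }
}
```

Printing the absolute path in the error message tells the reader where the file was expected, which is usually the first question when a run fails on another machine.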
Using java.util.Properties
One possible usage
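import java.util.Properties;

class Parameters {
    Properties prop;
    // Default value for every known parameter: {name, value}.
    String[][] defaults = {{"dataDir", "./data"},
                           {"doStore", "true"}};

    // Load the defaults; call once before set() or get().
    void init() {
        prop = new Properties();
        for (int i = 0; i < defaults.length; i++)
            prop.setProperty(defaults[i][0], defaults[i][1]);
    }

    void set(String s, String v) { prop.setProperty(s, v); }

    String get(String s) {
        // Fails with a NullPointerException if init() was not called (prop is null).
        return prop.getProperty(s);
    }
}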
You have:
files containing numbers characterizing the parameter values
and the results
basic shell skills
You need: graphs
Most frequently used solutions:
Based on Gnuplot
Based on Excel or OpenOffice clone
Other solutions: R; Matlab (remember portability)
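Whatever plotting tool is chosen, the result files are easiest to handle as plain columns of numbers. This is a minimal sketch of writing such a file from an experiment driver; the class name, the file name results.dat and the values are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

class ResultFile {
    /** Formats one tab-separated "x y" line per measurement, preceded by a '#' comment header. */
    static String format(double[] scaleFactor, double[] seconds) {
        StringBuilder sb = new StringBuilder("# scale_factor\texecution_time\n");
        for (int i = 0; i < scaleFactor.length; i++)
            // Locale.ROOT keeps '.' as the decimal separator whatever the machine's locale is.
            sb.append(String.format(Locale.ROOT, "%.3f\t%.3f\n", scaleFactor[i], seconds[i]));
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical measurements and file name, for illustration only.
        String text = format(new double[] {1, 10}, new double[] {0.42, 4.5});
        Files.write(Paths.get("results.dat"), text.getBytes());
        System.out.print(text);
    }
}
```

Gnuplot skips lines starting with '#', so the file can be plotted directly, for example with `plot "results.dat" using 1:2 with linespoints`.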
B              C
Scale factor   Execution time
...            ...
...            ...
In the .xls file, create a graph from the cells A1:B3; choose the
layout, colors, etc.
Graph generation
b          b
13.666     13666
15         15
12.3333    123333
13         13