
Copyright 2004 David J. Lilja
Errors in Experimental
Measurements
Sources of errors
Accuracy, precision, resolution
A mathematical model of errors
Confidence intervals
For means
For proportions
How many measurements are needed for
desired error?

Why do we need statistics?
1. Noise, noise, noise, noise, noise!

OK, not really this type of noise.
Why do we need statistics?
2. Aggregate data into meaningful information.

445 446 397 226
388 3445 188 1002
47762 432 54 12
98 345 2245 8839
77492 472 565 999
1 34 882 545 4022
827 572 597 364
... = x
What is a statistic?
A quantity that is computed from a sample
[of data].
Merriam-Webster
A single number used to summarize a larger
collection of values.
What are statistics?
A branch of mathematics dealing with the
collection, analysis, interpretation, and
presentation of masses of numerical data.
Merriam-Webster
We are most interested in analysis and
interpretation here.
Lies, damn lies, and statistics!

Goals
Provide intuitive conceptual background for
some standard statistical tools.
Draw meaningful conclusions in presence of
noisy measurements.
Allow you to correctly and intelligently apply
techniques in new situations.
Don't simply plug and crank from a formula.
Goals
Present techniques for aggregating large
quantities of data.
Obtain a big-picture view of your results.
Obtain new insights from complex
measurement and simulation results.
E.g. How does a new feature impact the
overall system?
Sources of Experimental Errors
Accuracy, precision, resolution
Experimental errors
Errors → noise in measured values
Systematic errors
Result of an experimental mistake
Typically produce constant or slowly varying bias
Controlled through skill of experimenter
Examples
Temperature change causes clock drift
Forget to clear cache before timing run

Experimental errors
Random errors
Unpredictable, non-deterministic
Unbiased → equal probability of increasing or decreasing measured value
Result of
Limitations of measuring tool
Observer reading output of tool
Random processes within system
Typically cannot be controlled
Use statistical tools to characterize and quantify

Example: Quantization
Random error
Quantization error
Timer resolution → quantization error
Repeated measurements
X
Completely unpredictable
A Model of Errors
Error    Measured value    Probability
-E       x - E             1/2
+E       x + E             1/2
A Model of Errors
Error 1    Error 2    Measured value    Probability
-E         -E         x - 2E            1/4
-E         +E         x                 1/4
+E         -E         x                 1/4
+E         +E         x + 2E            1/4
A Model of Errors
[Figure: probability (vertical axis, 0 to 0.6) of each measured value x-E, x, x+E (horizontal axis)]
Probability of Obtaining a
Specific Measured Value
A Model of Errors
$\Pr(X = x_i) = \Pr(\text{measure } x_i)$
$\Pr(X = x_i) \propto$ number of paths from real value to $x_i$
$\Pr(X = x_i) \sim$ binomial distribution
As number of error sources becomes large, $n \to \infty$:
Binomial → Gaussian (Normal)
Thus, the bell curve
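
The path-counting argument can be sketched in Python. This is only an illustration, assuming each of n independent error sources shifts the measurement by +E or -E with probability 1/2; the function name and its parameters are hypothetical, not part of the slides.

```python
from math import comb


def measured_value_distribution(true_value, error, num_sources):
    """Return {measured value: probability} for num_sources +/-error sources.

    Assumption: every source is independent and adds +error or -error
    with probability 1/2, so each path through the error tree is equally likely.
    """
    total_paths = 2 ** num_sources
    distribution = {}
    for num_positive in range(num_sources + 1):
        # num_positive sources add +error, the rest add -error.
        value = true_value + (2 * num_positive - num_sources) * error
        # comb() counts the paths that lead to this measured value.
        distribution[value] = comb(num_sources, num_positive) / total_paths
    return distribution


two_sources = measured_value_distribution(true_value=10.0, error=1.0, num_sources=2)
print(two_sources)
```

With two error sources this gives probabilities 1/4, 1/2, 1/4 for x - 2E, x, x + 2E. Increasing `num_sources` makes the probabilities trace out the bell shape.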
Frequency of Measuring Specific Values
[Figure: frequency of measured values, annotated with the mean of measured values, the true value, resolution, precision, and accuracy]
Accuracy, Precision, Resolution
Systematic errors → accuracy
How close mean of measured values is to true value
Random errors → precision
Repeatability of measurements
Characteristics of tools → resolution
Smallest increment between measured values
Quantifying Accuracy,
Precision, Resolution
Accuracy
Hard to determine true accuracy
Relative to a predefined standard
E.g. definition of a second
Resolution
Dependent on tools
Precision
Quantify amount of imprecision using statistical
tools
Confidence Interval for the
Mean
[Figure: distribution with area 1 - α between c1 and c2 and area α/2 in each tail]
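Normalize $\bar{x}$
$$z = \frac{\bar{x} - x}{s / \sqrt{n}}$$
$$n = \text{number of measurements}$$
$$\bar{x} = \text{mean} = \frac{\sum_{i=1}^{n} x_i}{n}$$
$$s = \text{standard deviation} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$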
Confidence Interval for the
Mean
Normalized z follows a Student's t distribution
(n - 1) degrees of freedom
Area left of $c_2$ = $1 - \alpha/2$
Tabulated values for t
Confidence Interval for the
Mean
As $n \to \infty$, normalized distribution becomes Gaussian (normal)
Confidence Interval for the
Mean
$$c_1 = \bar{x} - t_{1-\alpha/2;\,n-1}\,\frac{s}{\sqrt{n}}$$
$$c_2 = \bar{x} + t_{1-\alpha/2;\,n-1}\,\frac{s}{\sqrt{n}}$$
Then,
$$\Pr(c_1 \le x \le c_2) = 1 - \alpha$$
An Example
Experiment Measured value
1 8.0 s
2 7.0 s
3 5.0 s
4 9.0 s
5 9.5 s
6 11.3 s
7 5.2 s
8 8.5 s
An Example (cont.)
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 7.94$$
$$s = \text{sample standard deviation} = 2.14$$
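
These two values can be checked with Python's standard `statistics` module. This is a small sketch using the eight measured values from the example; the variable names are illustrative only.

```python
from statistics import mean, stdev

# The eight measured values from the example, in seconds.
measurements = [8.0, 7.0, 5.0, 9.0, 9.5, 11.3, 5.2, 8.5]

x_bar = mean(measurements)
# stdev() is the sample standard deviation (n - 1 in the denominator).
s = stdev(measurements)

print(f"mean = {x_bar:.2f} s, sample standard deviation = {s:.2f} s")
```

Note that `statistics.pstdev` would divide by n instead of n - 1 and so would not match the value of s used here.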
An Example (cont.)
90% CI → 90% chance actual value in interval
90% CI → α = 0.10
1 - α/2 = 0.95
n = 8 → 7 degrees of freedom

90% Confidence Interval
$$a = 1 - \alpha/2 = 1 - 0.10/2 = 0.95$$
$$t_{a;\,n-1} = t_{0.95;\,7} = 1.895$$
$$c_1 = 7.94 - \frac{1.895\,(2.14)}{\sqrt{8}} = 6.5$$
$$c_2 = 7.94 + \frac{1.895\,(2.14)}{\sqrt{8}} = 9.4$$

n      a = 0.90    a = 0.95    a = 0.975
...
5      1.476       2.015       2.571
6      1.440       1.943       2.447
7      1.415       1.895       2.365
...
∞      1.282       1.645       1.960
95% Confidence Interval
$$a = 1 - \alpha/2 = 1 - 0.05/2 = 0.975$$
$$t_{a;\,n-1} = t_{0.975;\,7} = 2.365$$
$$c_1 = 7.94 - \frac{2.365\,(2.14)}{\sqrt{8}} = 6.1$$
$$c_2 = 7.94 + \frac{2.365\,(2.14)}{\sqrt{8}} = 9.7$$

n      a = 0.90    a = 0.95    a = 0.975
...
5      1.476       2.015       2.571
6      1.440       1.943       2.447
7      1.415       1.895       2.365
...
∞      1.282       1.645       1.960
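
The 90% and 95% intervals can be reproduced with a short Python sketch. The function name is hypothetical, and the t values are passed in by hand from the table (7 degrees of freedom) rather than computed, since the standard library has no Student's t quantile function.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(samples, t_value):
    """Return (c1, c2) = x_bar -/+ t_value * s / sqrt(n).

    t_value is the tabulated t_{1 - alpha/2; n - 1}; the caller must
    look it up for the chosen confidence level and n - 1 degrees of freedom.
    """
    n = len(samples)
    x_bar = mean(samples)
    half_width = t_value * stdev(samples) / sqrt(n)
    return x_bar - half_width, x_bar + half_width


measurements = [8.0, 7.0, 5.0, 9.0, 9.5, 11.3, 5.2, 8.5]  # seconds

ci90 = mean_confidence_interval(measurements, t_value=1.895)  # t_{0.95; 7}
ci95 = mean_confidence_interval(measurements, t_value=2.365)  # t_{0.975; 7}

print(f"90% CI: ({ci90[0]:.1f}, {ci90[1]:.1f})")
print(f"95% CI: ({ci95[0]:.1f}, {ci95[1]:.1f})")
```

Only the t value changes between the two calls. A larger t value widens the interval around the same mean.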
What does it mean?
90% CI = [6.5, 9.4]
90% chance real value is between 6.5, 9.4
95% CI = [6.1, 9.7]
95% chance real value is between 6.1, 9.7
Why is interval wider when we are more
confident?
Higher Confidence → Wider Interval?
[Figure: the 90% interval (6.5, 9.4) lies inside the wider 95% interval (6.1, 9.7)]
Key Assumption
Measurement errors are
Normally distributed.
Is this true for most
measurements on real
computer systems?
Key Assumption
Saved by the Central Limit Theorem
Sum of a large number of values from any
distribution will be Normally (Gaussian)
distributed.
What is a large number?
Typically assumed to be > 6 or 7.
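
A small simulation can make this concrete. The sketch below assumes the individual values are independent draws from a uniform distribution on [0, 1), which is just one convenient non-Normal choice; the function name and the "fraction within one standard deviation" check are illustrative, not from the slides. For a Normal distribution that fraction is about 0.68; for a single uniform value it is about 0.58.

```python
import random
from statistics import mean, stdev


def fraction_within_one_sigma(num_terms, num_sums, seed=0):
    """Sum num_terms uniform values, num_sums times, and return the
    fraction of sums lying within one standard deviation of their mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sums = [sum(rng.random() for _ in range(num_terms)) for _ in range(num_sums)]
    mu = mean(sums)
    sigma = stdev(sums)
    inside = sum(1 for value in sums if abs(value - mu) <= sigma)
    return inside / num_sums


for num_terms in (1, 2, 7):
    fraction = fraction_within_one_sigma(num_terms, num_sums=20000)
    print(f"{num_terms} term(s): {fraction:.3f} within one standard deviation")
```

As the number of terms grows, the printed fraction moves from the uniform value toward the Normal value.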



How many measurements?
Width of interval inversely proportional to √n
Want to minimize number of measurements
Find confidence interval for mean, such that:
Pr(actual mean in interval) = (1 - α)
$$(c_1, c_2) = \big((1 - e)\,\bar{x},\ (1 + e)\,\bar{x}\big)$$
How many measurements?
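$$(c_1, c_2) = (1 \mp e)\,\bar{x} = \bar{x} \mp z_{1-\alpha/2}\,\frac{s}{\sqrt{n}}$$
$$z_{1-\alpha/2}\,\frac{s}{\sqrt{n}} = \bar{x}\,e$$
$$n = \left(\frac{z_{1-\alpha/2}\,s}{e\,\bar{x}}\right)^{2}$$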


How many measurements?
But n depends on knowing mean and
standard deviation!
Estimate s with small number of
measurements
Use this s to find n needed for desired
interval width
How many measurements?
Mean = 7.94 s
Standard deviation = 2.14 s
Want 90% confidence that the actual mean lies within a 7%-wide interval around the measured mean.
α = 0.10
(1 - α/2) = 0.95
Error = ±3.5%
e = 0.035
How many measurements?
$$n = \left(\frac{z_{1-\alpha/2}\,s}{e\,\bar{x}}\right)^{2} = \left(\frac{1.895\,(2.14)}{0.035\,(7.94)}\right)^{2} = 212.9$$
213 measurements
→ 90% chance true mean is within ±3.5% interval
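
The same calculation as a minimal Python sketch. The function name and parameter names are hypothetical; the critical value 1.895 is the one used in the worked example and is passed in by hand.

```python
from math import ceil


def measurements_needed(mean_estimate, std_estimate, rel_error, critical_value):
    """Return n = (critical_value * s / (e * x_bar))^2, rounded up.

    mean_estimate and std_estimate come from a small pilot set of
    measurements; rel_error is the half-width e as a fraction of the mean.
    """
    n = (critical_value * std_estimate / (rel_error * mean_estimate)) ** 2
    return ceil(n)  # a fractional measurement is not possible, so round up


n = measurements_needed(mean_estimate=7.94, std_estimate=2.14,
                        rel_error=0.035, critical_value=1.895)
print(n)
```

Because n grows with the square of 1/e, halving the allowed error roughly quadruples the number of measurements.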
Proportions
p = Pr(success) in n trials of binomial
experiment
Estimate proportion: p = m/n
m = number of successes
n = total number of trials
Proportions
$$c_1 = p - z_{1-\alpha/2}\,\sqrt{\frac{p\,(1 - p)}{n}}$$
$$c_2 = p + z_{1-\alpha/2}\,\sqrt{\frac{p\,(1 - p)}{n}}$$
Proportions
How much time does processor spend in
OS?
Interrupt every 10 ms
Increment counters
n = number of interrupts
m = number of interrupts when PC within OS
Run for 1 minute
n = 6000
m = 658
Proportions
$$(c_1, c_2) = p \mp z_{1-\alpha/2}\,\sqrt{\frac{p\,(1 - p)}{n}} = 0.1097 \mp 1.96\,\sqrt{\frac{0.1097\,(1 - 0.1097)}{6000}} = (0.1018,\ 0.1176)$$
95% confidence interval for proportion
So 95% certain processor spends 10.2-11.8% of its time in OS
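
A minimal Python sketch of this calculation, using the counts from the OS-time experiment. The function name is hypothetical, and the z value 1.96 is supplied by hand for 95% confidence.

```python
from math import sqrt


def proportion_confidence_interval(successes, trials, z_value):
    """Return (c1, c2) = p -/+ z_value * sqrt(p * (1 - p) / n), with p = m / n."""
    p = successes / trials
    half_width = z_value * sqrt(p * (1.0 - p) / trials)
    return p - half_width, p + half_width


# 6000 timer interrupts in one minute, 658 of them with the PC inside the OS.
c1, c2 = proportion_confidence_interval(successes=658, trials=6000, z_value=1.96)
print(f"95% CI for fraction of time in OS: ({c1:.4f}, {c2:.4f})")
```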
Number of measurements for
proportions
$$(1 - e)\,p = p - z_{1-\alpha/2}\,\sqrt{\frac{p\,(1 - p)}{n}}$$
$$e\,p = z_{1-\alpha/2}\,\sqrt{\frac{p\,(1 - p)}{n}}$$
$$n = \frac{z_{1-\alpha/2}^{2}\;p\,(1 - p)}{(e\,p)^{2}}$$
Number of measurements for
proportions
How long to run OS experiment?
Want 95% confidence
± 0.5%
e = 0.005
p = 0.1097
Number of measurements for
proportions
$$n = \frac{z_{1-\alpha/2}^{2}\;p\,(1 - p)}{(e\,p)^{2}} = \frac{(1.960)^{2}\,(0.1097)\,(1 - 0.1097)}{\left[0.005\,(0.1097)\right]^{2}} = 1{,}247{,}102$$
10 ms interrupts
→ 3.46 hours
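
The sample count and run time can be checked with a short Python sketch. The function name is hypothetical; the 10 ms interrupt period, p = 0.1097, e = 0.005 and z = 1.960 are the values from the example.

```python
from math import ceil


def trials_needed(p_estimate, rel_error, z_value):
    """Return n = z^2 * p * (1 - p) / (e * p)^2, rounded up to a whole trial."""
    n = z_value ** 2 * p_estimate * (1.0 - p_estimate) / (rel_error * p_estimate) ** 2
    return ceil(n)


n = trials_needed(p_estimate=0.1097, rel_error=0.005, z_value=1.960)

interrupt_period_s = 0.010  # one sample every 10 ms
hours = n * interrupt_period_s / 3600.0

print(f"{n} samples, about {hours:.2f} hours")
```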
Important Points
Use statistics to
Deal with noisy measurements
Aggregate large amounts of data
Errors in measurements are due to:
Accuracy, precision, resolution of tools
Other sources of noise
Systematic, random errors
Important Points: Model errors with bell curve
[Figure: bell curve annotated with the true value, precision, mean of measured values, resolution, and accuracy]
Important Points
Use confidence intervals to quantify precision
Confidence intervals for
Mean of n samples
Proportions
Confidence level
Pr(actual mean within computed interval)
Compute number of measurements needed
for desired interval width
