
LECTURE 3: ANALYSIS OF EXPERIMENTAL DATA
Mochamad Safarudin
Faculty of Mechanical Engineering, UTeM
2010
MEASUREMENT AND INSTRUMENTATION
BMCC 3743
Introduction
Measures of dispersion
Parameter estimation
Criterion for rejecting questionable data points
Correlation of experimental data
Statistical analysis is needed in all measurements with random
inputs, e.g. random broadband sound/noise
(tyre/road noise, rain drops, waterfall)
Some important terms are:
Random variable (continuous or discrete),
histogram, bins, population, sample, distribution
function, parameter, event, statistic, probability.
Population : the entire collection of objects,
measurements, observations and so on
whose properties are under consideration
Sample: a representative subset of a
population on which an experiment is
performed and numerical data are obtained
Measures of dispersion
Deviation (error) is defined as
$$d_i = x_i - \bar{x}$$
Mean deviation is defined as
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} |d_i|$$
Population standard deviation is defined as
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
=> Measures of data spreading or variability
Sample standard deviation is defined as
$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
S is used when data of a sample are used to
estimate the population standard deviation.
Variance is defined as
$\sigma^2$ for the population, or $S^2$ for a sample
Find the mean, median, standard deviation
and variance of this measurement:
1089, 1092, 1094, 1095, 1098, 1100, 1104, 1105,
1107, 1108, 1110, 1112, 1115
Mean = 1102.2
Median = 1104
Std deviation = 7.89 (population formula, N = 13); 8.21 with the sample formula (n - 1)
Variance = 62.18 (population formula); 67.36 with the sample formula
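
The example above can be checked with a few lines of Python. This is a minimal sketch using only the standard `statistics` module; the variable names are illustrative, not part of the lecture.

```python
import statistics

# Measurement data from the worked example.
data = [1089, 1092, 1094, 1095, 1098, 1100, 1104, 1105,
        1107, 1108, 1110, 1112, 1115]

mean = statistics.mean(data)
median = statistics.median(data)

# Mean deviation: average of the absolute deviations from the mean.
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

# Population formulas divide by N; sample formulas divide by n - 1.
pop_std = statistics.pstdev(data)
pop_var = statistics.pvariance(data)
sample_std = statistics.stdev(data)
sample_var = statistics.variance(data)

print(f"mean = {mean:.1f}, median = {median}")
print(f"mean deviation = {mean_deviation:.2f}")
print(f"population: sigma = {pop_std:.2f}, sigma^2 = {pop_var:.2f}")
print(f"sample:     S = {sample_std:.2f}, S^2 = {sample_var:.2f}")
```

For this data the population formulas give about 7.89 and 62.18, while the sample formulas (dividing by n - 1) give about 8.21 and 67.36.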

Parameter estimation
Generally,
Estimation of the population mean, $\mu$,
is the sample mean, $\bar{x}$.
Estimation of the population
standard deviation, $\sigma$, is the sample
standard deviation, S.
Confidence interval is the interval between
$\bar{x} - \delta$ and $\bar{x} + \delta$, where $\delta$ is an uncertainty.
Confidence level is the probability for the
population mean to fall within the specified
interval:
$$P(\bar{x} - \delta \le \mu \le \bar{x} + \delta)$$
The confidence level is normally expressed in terms of $\alpha$, also called
the level of significance, where
$$\text{confidence level} = 1 - \alpha$$
If n is sufficiently large (> 30), we can apply
the central limit theorem to estimate
the population mean.
1. If original population is normal, then
distribution for the sample means is
normal (Gaussian)
2. If original population is not normal and n
is large, then distribution for sample
means is normal
3. If original population is not normal and n
is small, then sample means follow a
normal distribution only approximately.
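When n is large,
$$P\left[-z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right] = 1 - \alpha$$
where
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Rearranged to get
$$P\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$$
Or
$$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
with confidence level $1 - \alpha$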
Table z

Confidence interval (z)   Confidence level (%)   Level of significance (%)
3.30                      99.9                   0.1
3.0                       99.7                   0.3
2.57                      99.0                   1.0
2.0                       95.4                   4.6
1.96                      95.0                   5.0
1.65                      90.0                   10.0
1.0                       68.3                   31.7

(z table: area under the normal curve from 0 to z)
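
The z values in the table can be reproduced numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the function names and the numbers in the usage line (mean 10.0, sigma 2.0, n = 100) are illustrative assumptions, not data from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence_level):
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def mean_interval_large_n(x_bar, sigma, n, confidence_level):
    """Interval x_bar +/- z_{alpha/2} * sigma / sqrt(n), valid for large n."""
    half_width = z_critical(confidence_level) * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Compare with the tabulated values 3.30, 2.57, 1.96 and 1.65.
for level in (0.999, 0.99, 0.95, 0.90):
    print(f"confidence {level:.3f}: z = {z_critical(level):.3f}")

# Hypothetical numbers, for illustration only.
low, high = mean_interval_large_n(10.0, 2.0, 100, 0.95)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

The same function with a t value in place of z would apply to the small-n case, but the t value has to come from a t table or a statistics library.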
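When n is small,
$$P\left[-t_{\alpha/2} \le \frac{\bar{x} - \mu}{S/\sqrt{n}} \le t_{\alpha/2}\right] = 1 - \alpha$$
where
$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$$
Rearranged to get
$$P\left[\bar{x} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right] = 1 - \alpha$$
Or
$$\mu = \bar{x} \pm t_{\alpha/2}\frac{S}{\sqrt{n}}$$
with confidence level $1 - \alpha$
(t table)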
Similarly as before, but now using the chi-squared
distribution, $\chi^2$ (always positive):
$$P\left[\chi^2_{\nu,1-\alpha/2} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\nu,\alpha/2}\right] = 1 - \alpha$$
where
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$
Hence, the confidence interval on the
population variance is
$$\frac{(n-1)S^2}{\chi^2_{\nu,\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{\nu,1-\alpha/2}}$$
(Chi-squared table)
Criterion for rejecting questionable data points
To eliminate data which has a low probability of
occurrence => use the Thompson $\tau$ test.
Example: Data consists of nine values,
$D_n$ = 12.02, 12.05, 11.96, 11.99, 12.10, 12.03,
12.00, 11.95 and 12.16.
$\bar{D}$ = 12.03, S = 0.07
So, calculate deviation:
$$\delta_1 = |D_{largest} - \bar{D}| = |12.16 - 12.03| = 0.13$$
$$\delta_2 = |D_{smallest} - \bar{D}| = |11.95 - 12.03| = 0.08$$
From Thompson's table, when n = 9, then $\tau = 1.777$.
Comparing $S\tau = 0.07 \times 1.77 = 0.12$ with $\delta_1 = 0.13$,
where $\delta_1 > S\tau$, then $D_9$ = 12.16 should be
discarded.
Recalculate S and $\bar{D}$ to obtain 0.05 and 12.01
respectively.
Hence for n = 8, $\tau = 1.749$ and $S\tau = 0.09$,
so remaining data stay.
(Thompson's $\tau$ table)
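
One pass of the Thompson test on the nine values can be sketched in Python as below. The function name `thompson_check` and the `TAU` dictionary are illustrative; the two tau values are the ones quoted in the example, and other sample sizes would need the full Thompson table.

```python
import statistics

# Thompson tau values quoted in the example (n = 9 and n = 8 only).
TAU = {9: 1.777, 8: 1.749}


def thompson_check(values, tau):
    """Test the value farthest from the mean against tau * S.

    Returns (suspect, delta, threshold, reject).
    """
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    suspect = max(values, key=lambda v: abs(v - mean))
    delta = abs(suspect - mean)
    threshold = tau * s
    return suspect, delta, threshold, delta > threshold


data = [12.02, 12.05, 11.96, 11.99, 12.10, 12.03, 12.00, 11.95, 12.16]
suspect, delta, threshold, reject = thompson_check(data, TAU[len(data)])
print(f"suspect = {suspect}, delta = {delta:.3f}, "
      f"tau*S = {threshold:.3f}, reject = {reject}")
```

The first pass rejects 12.16, as in the example. The second pass (n = 8) is sensitive to rounding: with the rounded S = 0.05 and mean 12.01, the deviation of 12.10 and the threshold both come out at about 0.09, while unrounded values give roughly 0.0875 against 0.0856.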
Correlation of experimental data
A) Correlation coefficient
B) Least-squares linear fit
C) Linear regression using data
transformation
Case I: Strong, linear relationship between x
and y
Case II: Weak/no relationship
Case III: Pure chance
=> Use correlation coefficient, $r_{xy}$, to
determine Case III
Given as
$$r_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\left[\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2\right]^{1/2}}$$
where $-1 \le r_{xy} \le +1$:
+1 means positive slope (perfectly linear
relationship)
-1 means negative slope (perfectly linear
relationship)
0 means no linear correlation
In practice, we use a special Table (of
critical values $r_t$) to determine Case III.
If the experimental value of $|r_{xy}|$ is equal to
or more than $r_t$ as given in the Table, then
a linear relationship exists.
If the experimental value of $|r_{xy}|$ is less
than $r_t$ as given in the Table, then only pure
chance => no linear relationship exists.
To get best straight line on the plot:
Simple approach: ruler & eyes
More systematic approach: least squares
Variation in the data is assumed to be normally
distributed and due to random causes
To get Y = ax + b, it is assumed that the Y values
vary randomly and the x values have no error.
For each value of $x_i$, the error in the Y value is
$$e_i = Y_i - y_i$$
Then, the sum of squared errors is
$$E = \sum_{i=1}^{n}(Y_i - y_i)^2 = \sum_{i=1}^{n}(a x_i + b - y_i)^2$$
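Minimising this equation and solving it for a
& b, we get
$$a = \frac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
$$b = \frac{\left(\sum x_i^2\right)\left(\sum y_i\right) - \left(\sum x_i\right)\left(\sum x_i y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$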
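
The minimisation step can be written out as follows. This sketch assumes the sum of squared errors $E = \sum (a x_i + b - y_i)^2$ and sets both partial derivatives to zero, which gives two linear equations (the normal equations) in a and b:

```latex
\begin{aligned}
\frac{\partial E}{\partial a} &= 2\sum_{i=1}^{n} (a x_i + b - y_i)\,x_i = 0
  &&\Rightarrow\quad a\sum x_i^2 + b\sum x_i = \sum x_i y_i \\
\frac{\partial E}{\partial b} &= 2\sum_{i=1}^{n} (a x_i + b - y_i) = 0
  &&\Rightarrow\quad a\sum x_i + n\,b = \sum y_i
\end{aligned}
```

Solving this pair of equations simultaneously for a and b gives the slope and intercept of the best-fit line.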
Substitute a & b values into Y = ax + b,
which is then called the least-squares
best fit.
To measure how well the best-fit line
represents the data, we calculate the
standard error of estimate, given by
$$S_{y,x} = \sqrt{\frac{\sum y_i^2 - b\sum y_i - a\sum x_i y_i}{n - 2}}$$
where $S_{y,x}$ is the standard deviation of the
differences between data points and the
best-fit line. Its unit is the same as y.
The coefficient of determination is another good
measure of how well the best-fit line represents
the data, using
$$r^2 = 1 - \frac{\sum (a x_i + b - y_i)^2}{\sum (y_i - \bar{y})^2}$$
For a good fit, $r^2$ must be close to unity.
For some special cases, such as
$$y = a e^{bx}$$
applying the natural logarithm to both sides
gives
$$\ln(y) = bx + \ln(a)$$
where ln(a) is a constant, so ln(y) is linearly
related to x.
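
The transformation can be tried out in Python. This is a minimal sketch: the data are synthetic and noise-free, generated from y = 2 e^{0.5x} purely for illustration, and `linear_fit` is a hypothetical helper implementing the least-squares formulas for Y = ax + b.

```python
import math


def linear_fit(xs, ys):
    """Least-squares slope and intercept for Y = slope * x + intercept."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx ** 2
    slope = (n * sxy - sx * sy) / denom
    intercept = (sxx * sy - sx * sxy) / denom
    return slope, intercept


# Synthetic data from y = 2 * exp(0.5 * x); illustrative only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Fit ln(y) = b * x + ln(a), then undo the transformation.
b, ln_a = linear_fit(xs, [math.log(y) for y in ys])
a = math.exp(ln_a)
print(f"a = {a:.4f}, b = {b:.4f}")
```

With real, noisy data the fit minimises the squared error in ln(y), not in y itself, so the recovered a and b are not identical to a direct nonlinear fit.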
Thermocouples are usually approximately linear
devices in a limited range of temperature. A
manufacturer of a brand of thermocouple has
obtained the following data for a pair of
thermocouple wires:
T (°C)   20    30    40    50    60    75    100
V (mV)   1.02  1.53  2.05  2.55  3.07  3.56  4.05
Determine the linear correlation between T and V
Solution:
Tabulate the data using this table:

No    x (°C)  y (mV)  x_i - x̄   (x_i - x̄)²  y_i - ȳ   (y_i - ȳ)²  (x_i - x̄)(y_i - ȳ)
1     20      1.02    -33.57    1127.04     -1.53     2.33        51.27
2     30      1.53    -23.57    555.61      -1.02     1.03        23.98
3     40      2.05    -13.57    184.18      -0.50     0.25        6.75
4     50      2.55    -3.57     12.76       0.00      0.00        -0.01
5     60      3.07    6.43      41.33       0.52      0.27        3.36
6     75      3.56    21.43     459.18      1.01      1.03        21.70
7     100     4.05    46.43     2155.61     1.50      2.26        69.78
Mean  53.57   2.55
Sum                             4535.71               7.17        176.82

$$r_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\left[\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2\right]^{1/2}}$$
$r_{xy}$ = 0.980392
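
The tabulation can be cross-checked in Python. This is a minimal sketch applying the correlation coefficient formula to the thermocouple data; the function name is illustrative.

```python
import math

# Thermocouple data from the example: temperature (deg C) and voltage (mV).
T = [20, 30, 40, 50, 60, 75, 100]
V = [1.02, 1.53, 2.05, 2.55, 3.07, 3.56, 4.05]


def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r_xy of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r_xy = correlation_coefficient(T, V)
print(f"r_xy = {r_xy:.6f}")
```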

Another example
The following measurements were obtained in the calibration of
a pressure transducer:

Voltage   ΔP H2O
0.31      1.96
0.65      4.20
0.75      4.90
0.85      5.48
0.91      5.91
1.12      7.30
1.19      7.73
1.38      9.00
1.52      9.90

a. Determine the best fit straight line
b. Find the coefficient of determination for the best fit
          x_i     x_i²     y_i     x_i y_i    y_i²
          0.31    0.0961   1.96    0.6076     3.8416
          0.65    0.4225   4.2     2.73       17.64
          0.75    0.5625   4.9     3.675      24.01
          0.85    0.7225   5.48    4.658      30.0304
          0.91    0.8281   5.91    5.3781     34.9281
          1.12    1.2544   7.3     8.176      53.29
          1.19    1.4161   7.73    9.1987     59.7529
          1.38    1.9044   9       12.42      81
          1.52    2.3104   9.9     15.048     98.01
sum (Σ)   8.68    9.517    56.38   61.8914    402.503
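$$a = \frac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
$$b = \frac{\left(\sum x_i^2\right)\left(\sum y_i\right) - \left(\sum x_i\right)\left(\sum x_i y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
a = 6.560646
b = -0.062934
Y = 6.56x - 0.06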
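
The slope and intercept can be reproduced in Python from the calibration data. This is a minimal sketch of the least-squares formulas for Y = ax + b; the function and variable names are illustrative.

```python
# Pressure transducer calibration data from the example.
voltage = [0.31, 0.65, 0.75, 0.85, 0.91, 1.12, 1.19, 1.38, 1.52]
pressure = [1.96, 4.20, 4.90, 5.48, 5.91, 7.30, 7.73, 9.00, 9.90]


def least_squares_line(xs, ys):
    """Return (a, b) for the best-fit line Y = a * x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_x2 - sum_x ** 2
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_x2 * sum_y - sum_x * sum_xy) / denom
    return a, b


a, b = least_squares_line(voltage, pressure)
print(f"a = {a:.6f}, b = {b:.6f}")
```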
From the result before we can find the coefficient of determination $r^2$
by tabulating the following values:

x_i       y_i     (Y_i - y_i)²   (y_i - ȳ)²
0.31      1.96    0.000118       18.53
0.65      4.2     0.000002       4.26
0.75      4.9     0.001802       1.86
0.85      5.48    0.001130       0.62
0.91      5.91    0.000008       0.13
1.12      7.3     0.000225       1.07
1.19      7.73    0.000203       2.15
1.38      9       0.000085       7.48
1.52      9.9     0.000086       13.22
sum (Σ)           0.003659       49.31

$$r^2 = 1 - \frac{\sum (a x_i + b - y_i)^2}{\sum (y_i - \bar{y})^2} = 0.999926$$
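
The coefficient of determination, and the standard error of estimate for the same fit, can be checked in Python. This is a minimal sketch that takes a = 6.560646 and b = -0.062934 from the worked example as given constants; variable names are illustrative.

```python
import math

# Pressure transducer calibration data from the example.
voltage = [0.31, 0.65, 0.75, 0.85, 0.91, 1.12, 1.19, 1.38, 1.52]
pressure = [1.96, 4.20, 4.90, 5.48, 5.91, 7.30, 7.73, 9.00, 9.90]

# Best-fit coefficients quoted in the worked example (Y = a * x + b).
a, b = 6.560646, -0.062934

n = len(voltage)
y_bar = sum(pressure) / n

# Sum of squared differences between the fitted line and the data.
ss_residual = sum((a * x + b - y) ** 2 for x, y in zip(voltage, pressure))
# Sum of squared differences between the data and their mean.
ss_total = sum((y - y_bar) ** 2 for y in pressure)

r_squared = 1.0 - ss_residual / ss_total
# Standard error of estimate: residual scatter with n - 2 degrees of freedom.
s_yx = math.sqrt(ss_residual / (n - 2))

print(f"r^2 = {r_squared:.6f}, S_yx = {s_yx:.4f}")
```

The standard error is computed here directly from the residuals, which is algebraically equivalent to the sum-based formula for $S_{y,x}$ when a and b are the least-squares values.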

Experimental Uncertainty Analysis



End of Lecture 3