
Page 1 Lecture 2 – MEEN 260 Dr. R. Tafreshi

Lecture 2: Introduction to Statistical Data Analysis

Lecture Outline:
• Introduction to Statistical Data Analysis
o Purpose of experimentation
o Sources of measurement error
o Histograms
o Population vs. Sample
o Probability Density Functions (PDF)
o Properties of a PDF

o Example of Previous Projects

Announcement:
• Lab Safety Acknowledgement (LSA)

Reading Assignments:
• Chapter 3 of textbook by Beckwith

Next Lecture: Review of Statistical Distributions



Significance of Mechanical Measurements:

We may measure for two purposes:


1. For the purpose of Design. Mechanical design elements:
a. Experience element (exposure to similar experience)
b. Rational element (physics laws)
c. Experimental element (measurement of various quantities in experiment)

2. For the purpose of process control:

a. Measure the discrepancy between the actual and the desired performance of a system (e.g., temperature, flow, or pressure in a power station)
- Goal: to maintain or track the desired value

Purpose of experimentation

Exploratory
• Collect data so one can later find correlations among measured quantities (empirical
models)
o Example: Bolt fatigue failure as a function of loading cycles

Fatigue stress range: Maximum stress range for indefinite design life:
$S_f = \dfrac{a}{(N_{cycles})^{b}}$
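As a minimal sketch of how such an empirical model might be extracted from data: taking logarithms turns the power law into a straight line, log S_f = log a - b log N, which can be fitted by least squares. The data below are synthetic, and the constants `A_TRUE` and `B_TRUE` are hypothetical values chosen only for illustration; they do not come from the lecture.

```python
import math

def fit_power_law(cycles, stress):
    """Fit S_f = a / N**b by least squares on log S_f = log a - b * log N."""
    xs = [math.log(n) for n in cycles]
    ys = [math.log(s) for s in stress]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope  # (a, b)

# Hypothetical constants, used only to generate illustrative data
A_TRUE, B_TRUE = 900.0, 0.1
cycles = [1e3, 1e4, 1e5, 1e6, 1e7]
stress = [A_TRUE / n ** B_TRUE for n in cycles]

a_fit, b_fit = fit_power_law(cycles, stress)
print(f"a = {a_fit:.1f}, b = {b_fit:.3f}")
```

With real fatigue data the points scatter around the line, and the fitted a and b are only estimates.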

Validation
• Given a theoretical model or a preliminary design, perform experiments to validate or invalidate the model/design. This includes design validation/qualification.
o Example: friction modeling

This requires meaningful data!

• Data always has error in it!

How to assess the source and magnitude of errors?


Error = Measured Value – True Value

$e = x_m - x_{true}$

– We never know the true value
– We can only estimate the error
– We can estimate probable bounds on the error (uncertainty)


Sources of Measurement Error


1. Systematic (constant or bias) errors, e.g., sensor calibration errors, certain set-up errors
o Statistical analysis cannot reveal them, as the error does not involve a distribution (these errors are constant)

[Figure: true value $x_t$ and sensed value $x_s$, with $x_s = x_o + x_t$]

Who is a good liar?! A good sensor is one that tells the same lie over and over: it can repeat its lies!

2. Precision or random errors, e.g., sensor resolution errors, some human errors, electrical noise, friction
o This error is different for each successive measurement, but the average error is zero (when many measurements are taken)
o Statistical analysis can be applied to estimate the possible size of the error
o By repeating measurements and applying statistical analysis, we can calculate probable bounds on the magnitude of the error
§ e.g., I am 99% certain that my true height is between 168 and 169 cm
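
The difference between the two error types can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true value, the bias, and the noise level (`X_TRUE`, `BIAS`, `NOISE_STD`) are hypothetical numbers, and the random error is assumed to be Gaussian.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

X_TRUE = 20.0    # hypothetical true value (unknown in a real experiment)
BIAS = 0.5       # hypothetical systematic (bias) error
NOISE_STD = 0.2  # hypothetical spread of the random (precision) error

def measure():
    """One simulated reading: true value + constant bias + random error."""
    return X_TRUE + BIAS + random.gauss(0.0, NOISE_STD)

readings = [measure() for _ in range(10_000)]
mean_reading = statistics.mean(readings)
spread = statistics.stdev(readings)

# Averaging shrinks the random error but leaves the bias untouched.
print(f"mean error  = {mean_reading - X_TRUE:.3f}")
print(f"data spread = {spread:.3f}")
```

The mean error stays close to the bias no matter how many readings are averaged, which is why statistical analysis alone cannot reveal a systematic error.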

Statistical Data Processing

Probability distribution (Page 45 of textbook)


Histograms:
• A graphical representation of the distribution of data
o An estimate of the probability distribution of a continuous variable
• Divide the range of the data into “bins”
• Count the number of samples that fall into each bin (range)
• Plot the result

Example:
• Test scores for a course
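
The binning steps above can be sketched in a few lines. The scores are made up for illustration, and the helper `histogram` is a hypothetical name, not a function from the textbook or a library.

```python
def histogram(data, low, high, n_bins):
    """Count the samples falling into each of n_bins equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for x in data:
        if low <= x <= high:
            # A sample exactly at the top edge is folded into the last bin
            index = min(int((x - low) / width), n_bins - 1)
            counts[index] += 1
    return counts

# Hypothetical test scores out of 100 (made up for illustration)
scores = [55, 62, 68, 71, 74, 75, 78, 80, 81, 83, 85, 86, 88, 90, 92, 95, 100]
counts = histogram(scores, low=50, high=100, n_bins=5)

# Text plot: one '#' per sample in each bin
for i, count in enumerate(counts):
    left = 50 + 10 * i
    print(f"{left:3d}-{left + 10:3d} | {'#' * count}")
```

The choice of bin width matters: too few bins hide the shape of the distribution, and too many leave most bins nearly empty.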

Review of Statistical Distributions: Three examples of different statistical distributions


Uniform distribution: Each possible outcome is equally likely to occur.
Example: When a die is tossed, there are 6 possible outcomes: S = {1, 2, 3, 4, 5, 6}. Each possible outcome (a value of the random variable X) is equally likely to occur, so this is a uniform distribution; for instance, P(X = 6) = 1/6.

Normal distribution: Example: Throw a pair of dice. The histogram of the sum of a pair of dice is peaked in the middle and gives a rough approximation to a normal distribution (as an exercise, find the possible outcomes of the sum of a pair of dice); see page 45 of the textbook.

Exponential distribution: Commonly used to model waiting times between occurrences of rare events, e.g., lifetimes of electrical or mechanical devices.
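
The two dice examples can be checked by direct enumeration. The sketch below assumes fair six-sided dice and uses exact fractions, so no sampling is involved.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Single die: every face has the same probability (uniform distribution)
p_single = {face: Fraction(1, 6) for face in faces}

# Pair of dice: enumerate all 36 equally likely outcomes and tally the sums
sum_counts = Counter(a + b for a, b in product(faces, repeat=2))
p_sum = {total: Fraction(count, 36) for total, count in sum_counts.items()}

for total in sorted(p_sum):
    print(f"sum = {total:2d}  P = {str(p_sum[total]):>5}  {'#' * sum_counts[total]}")
```

The printed bars rise to a peak at a sum of 7 and fall off symmetrically, which is the rough bell shape the example refers to.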

Normal (Gaussian) Probability Distribution Function

Let us denote the quantity to be determined, e.g. the temperature in a room, by $x$. The true value of $x$, which we denote by $x_{true}$ or $x_t$, is and will remain unknown, although we can find a close estimate of $x_t$ by following the discussion below.

First consider the hypothetical case in which an infinite number of measurements of $x$ can be made. These measurements will have a “mean” or average value, which we denote by $\mu$ or, more precisely, by $\mu_x$. If these measurements are not biased and are affected only by random (precision) errors, then $\mu_x$ is generally the closest possible approximation of $x_t$.

Moreover, assume that the distribution of $x$ around this mean value is Gaussian, i.e. it follows the so-called normal or Gaussian probability distribution function discussed below.

Analysis of Random Errors


• To analyze random errors, we need some background in statistics…
o Random variable
§ a quantity whose value occurs according to an assumed probability
o Probability distribution
§ describes how likely a particular outcome is

In the following, we review these concepts in more detail.



General Definitions in Statistics

Population vs. Sample

Definitions:
• µ: Population mean
o Average of the complete set of all possible values
o In measurement systems: if there are only random errors in our measurements (no systematic error), then the population mean is the closest possible approximation of the true value

• $\bar{x}$: Sample mean
o Average of a set of random samples (a set of measurements)
o As we take more measurements (more samples), the sample mean gets closer to the population mean
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

• $\sigma$: Standard deviation of the population
o Population variance, $\sigma^2$

• $s_x$: Sample standard deviation
o Sample variance, $s_x^2$
o As we take more measurements (more samples), the sample standard deviation gets closer to the population standard deviation
$s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

Definitions:
• x: the quantity to be determined
• $x_t$: the true value (which is always unknown)
• $x_1, x_2, \ldots, x_n$: the measured values
• n: number of samples
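
As a minimal sketch, the sample mean and the sample variance (with the n - 1 divisor) can be computed directly from their definitions. The readings are made-up numbers, and the function names are hypothetical.

```python
import math

def sample_mean(values):
    """x_bar = (1/n) * sum(x_i)"""
    return sum(values) / len(values)

def sample_variance(values):
    """s_x^2 = (1/(n-1)) * sum((x_i - x_bar)^2)"""
    x_bar = sample_mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Hypothetical repeated readings of one quantity (made-up numbers)
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

x_bar = sample_mean(readings)
s_x = math.sqrt(sample_variance(readings))
print(f"n = {len(readings)}, sample mean = {x_bar:.3f}, sample std = {s_x:.3f}")
```

Python's standard library offers `statistics.mean` and `statistics.stdev`, which use the same n - 1 convention for the sample standard deviation.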

Consider:
• The size of the population is infinite → the case of measurement
• Each datum $x$ is a measurement of one quantity
• Each datum $x$ differs from the rest ONLY because of precision error

⇒ The probability distribution of $x$ is described by a probability density function (PDF)

• Population is infinite → PDF is a continuous curve

Probability Density Functions (PDF) – Gaussian Distribution


• In measurements we will assume that random errors are “normally” distributed
o a.k.a. Gaussian or Normal distribution or bell curve

Properties of a PDF:
• Positive: $p(x) \ge 0$
• Unit area: $\int_{-\infty}^{\infty} p(x)\,dx = 1$

• Probability of a sample occurring within a given range is the area under the curve:
$P(a \le x \le b) = \int_{a}^{b} p(x)\,dx$
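
These properties can be checked numerically for a Gaussian PDF. The sketch below assumes the standard form of the normal density and hypothetical parameters `MU` and `SIGMA`; the infinite integration limits are approximated by ±10 standard deviations, and the integral uses a simple trapezoidal rule.

```python
import math

MU, SIGMA = 0.0, 1.0  # hypothetical population parameters

def gaussian_pdf(x, mu=MU, sigma=SIGMA):
    """Standard form of the normal (Gaussian) probability density function."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, steps=20_000):
    """Trapezoidal-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h

# Unit area: +/- 10 sigma stands in for the infinite limits
area = integrate(gaussian_pdf, MU - 10 * SIGMA, MU + 10 * SIGMA)

# Probability of a sample falling within one standard deviation of the mean
p_one_sigma = integrate(gaussian_pdf, MU - SIGMA, MU + SIGMA)

print(f"area = {area:.4f}, P(mu - sigma <= x <= mu + sigma) = {p_one_sigma:.4f}")
```

Changing `SIGMA` widens or narrows the curve, but the total area stays at 1 and the one-sigma probability stays near 68%.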

Note: because the population is infinite, both $\mu$ and $\sigma$ are unknown

Role of $\sigma$ (the standard deviation of the entire population):

Effect of STD:

Exam grades:
                     P1/12   P2/12   P3/8   P4/8   Total/40   Total/100
Average                9.5     9.5    6.3    6.7       32.2        80.6
Standard Deviation     1.8     1.5    1.7    1.6        4.3        10.8
Normalized STD         1.5     1.3    2.1    2.0
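
The notes do not define how the "Normalized STD" row is computed. One reading that reproduces the tabulated values to within rounding is that each standard deviation is rescaled to a common 10-point scale, so that problems with different maximum points can be compared. The sketch below works under that assumption only.

```python
# Values copied from the exam-grade table
max_points = {"P1": 12, "P2": 12, "P3": 8, "P4": 8}
std_dev = {"P1": 1.8, "P2": 1.5, "P3": 1.7, "P4": 1.6}

# Assumption: "Normalized STD" = standard deviation rescaled to a 10-point problem
normalized = {p: 10.0 * std_dev[p] / max_points[p] for p in max_points}

for problem, value in normalized.items():
    print(f"{problem}: normalized STD = {value:.2f}")
```

On this reading, the 8-point problems show a larger relative spread than the 12-point problems, even though their raw standard deviations are similar.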
