
CV2001/MT2301

ENGINEERING PROBABILITY &
STATISTICS/STATISTICS

Assoc Prof CHUANG Poon Hwei
Office: N1-01c-92
Tel: 6790-5301

CV2001/MT2301 Chapter 1 1
Part I – Probability
14 lectures

1. Introduction & Review of Mathematical Statistics.

2. Basic Concepts in Probability.

3. Mathematics of Probability.

4. Probability Distributions, generally.

5. Some Common Probability Distributions.

Examination
◊ Monday 23 November 2009, 5:00 pm

◊ 2.5-hour paper

◊ Choose 4 out of 4 questions

◊ Past-year papers can be found on the NTU Library website:
http://www.ntu.edu.sg/Library/Collections/exampapers/default.htm

Pre-requisite

GCE 'O' Level Mathematics or equiv.
GCE 'A' Level Mathematics or equiv.

Important to do your own revision NOW

CV2001/MT2301 Lecture & Tutorial Problems

Hard copies of the Part 1 lecture notes and tutorial problems can be
purchased at the photocopy booth on Basement 1 (near Section A) of
Block N1 (School of CEE).

A soft copy of the tutorial problems is available in Edventure, under
Course Documents on the CV2001/MT2301 course site.

Please attempt the problems before attending the tutorial classes.

Quiz

◊ Date:

◊ Time:

◊ Venue: an LT (to be confirmed)

◊ Duration: about 45 minutes (to be confirmed)

TEXT
Jay L. Devore. “Probability and Statistics for Engineering and the
Sciences”. 7th Edition. Thomson, 2008.

REFERENCES
Alfredo H-S. Ang and Wilson H. Tang. “Probability Concepts in
Engineering: Emphasis on Applications to Civil and Environmental
Engineering”. 2nd Edition. Wiley, 2007. [TA340.A581P]

J. Susan Milton and Jesse C. Arnold. “Introduction to Probability and
Statistics: Principles and Applications for Engineering and the
Computing Sciences”. 4th Edition. McGraw-Hill, 2003.

1. Introduction & Review of
Mathematical Statistics

Route A: 27, 29, 29, 31, 31, 31, 31, 33, 33, 35 (min)
Route B: 15, 15, 25, 25, 30, 30, 40, 40, 45, 45 (min)

Route A

Time (min)   Freq.    Cum. freq.   Rel. freq.      Cum. rel. freq.
$x_i$        $f_i$                 $\tilde{f}_i$   $F_i$
27           1        1            0.1             0.1
29           2        3            0.2             0.3
31           4        7            0.4             0.7
33           2        9            0.2             0.9
35           1        10           0.1             1.0

$\sum f_i = 10$, $\sum \tilde{f}_i = 1.0$

Route A

Time (min)   Freq.
$x_i$        $f_i$    $f_i x_i$   $(x_i - \bar{x})$   $(x_i - \bar{x})^2$   $f_i (x_i - \bar{x})^2$
27           1        27          -4                  16                    16
29           2        58          -2                  4                     8
31           4        124         0                   0                     0
33           2        66          2                   4                     8
35           1        35          4                   16                    16
Total        10       310                                                   48

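Sample mean

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i \; (= n)} = \frac{310}{10} = 31 \text{ min}$$

Sample variance

$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \frac{48}{10 - 1} = 5.33 \text{ min}^2$$

Sample standard deviation

$$s = \sqrt{s^2} = \sqrt{\frac{48}{9}} = 2.31 \text{ min}$$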
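
The Route A figures can be checked, and compared with Route B, with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper name `summarize` is introduced here for illustration only and is not part of the lecture notes.

```python
import statistics

# Travel times in minutes, copied from the Route A / Route B lists in these notes.
route_a = [27, 29, 29, 31, 31, 31, 31, 33, 33, 35]
route_b = [15, 15, 25, 25, 30, 30, 40, 40, 45, 45]


def summarize(times):
    """Return (sample mean, sample variance, sample standard deviation)."""
    mean = statistics.mean(times)
    variance = statistics.variance(times)  # sample variance, divides by n - 1
    std_dev = statistics.stdev(times)      # square root of the sample variance
    return mean, variance, std_dev


for name, times in (("Route A", route_a), ("Route B", route_b)):
    mean, variance, std_dev = summarize(times)
    print(f"{name}: mean = {mean:.2f} min, "
          f"s^2 = {variance:.2f} min^2, s = {std_dev:.2f} min")
```

Both routes have the same sample mean of 31 min, but Route B is far more dispersed (s is about 11.25 min against 2.31 min for Route A), which is why a measure of variability is needed alongside the mean.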

Uncertainty in Engineering
In engineering, the actual outcomes of many processes are
unpredictable.
Experimental observations are different from one experiment to another
(even if under apparently identical conditions).

e.g.
Values of yield strength $f_y$ of steel bars produced by a factory on a certain
day (under the same conditions):

270.5, 280.3, 250.1, 300.1, ..., 260.4 N/mm².

The manufacturer's specified yield strength is $f_y$ = 275 N/mm².

Some Terms:
(1) Population:
all conceivable observations (data) of the subject being studied.

e.g. to study the quiz results of CEE 2nd-year students, the population
is the results of all CEE 2nd-year students.
Finite: the total number of students is fixed.

e.g. all the numbers that occur when tossing a die indefinitely.
Infinite: the number of tosses is not fixed.

e.g. when investigating the ultimate strength of steel bars produced
by a factory, the population is the ultimate strength of all the steel
bars produced by the factory.
Finite (within a given period); otherwise infinite.

(2) Sample: a limited number of observations out of the population.

It is often impossible or prohibitively expensive to examine every
population member.

Hence, inferences are made from a sample.

e.g.
the ultimate strength test is destructive, so a number of bars (a sample)
are selected and tested, and the results are used to infer the ultimate
strength of the entire population.

To be representative of the entire population, the sample should
be "random".

(3) Random Sample:
a sample in which every possible observation of the
population has equal chance of being selected.

(4) Sample Size:
the total number of observations in a sample.
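
As an illustration of definitions (3) and (4), the sketch below draws a random sample without replacement from a finite population. The population of 500 numbered steel bars, the sample size of 20 and the seed are all hypothetical values chosen for the example; `random.sample` from the Python standard library gives every member the same chance of being selected.

```python
import random

# Hypothetical finite population: serial numbers of 500 steel bars.
population = list(range(1, 501))
sample_size = 20  # hypothetical sample size

# A fixed seed is used only so that the sketch is repeatable.
rng = random.Random(2009)

# Sampling without replacement: no bar can be picked twice.
sample = rng.sample(population, sample_size)
print(sorted(sample))
```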

Mathematical Statistics:

Collection, analysis and interpretation of data.

2 basic objectives:
(1) to describe (or summarize) data;
(2) to make inferences from a sample about an entire population.

Statistics

Descriptive Inferential

DESCRIPTIVE STATISTICS

Techniques to describe a set of data.

Descriptive Statistics

Graphical Methods    Numerical Methods

1.2 Graphical Representation (see Chapter 1 of Ang & Tang)

Histograms or frequency diagrams
display data in a compact, understandable form, leading to a theoretical
model.

(1) Absolute frequency (frequency):
the number of times a value (or observation) $x_i$ occurs, denoted $f_i$.

(2) Relative frequency:
the absolute frequency divided by the total number of observations $n$,
denoted $\tilde{f}_j = f_j / n$.

Year   Rainfall intensity (in.)   Year   Rainfall intensity (in.)
1918 43.30 1933 54.91
1919 53.02 1934 51.28
1920 63.52 1935 39.91
1921 45.93 1936 53.29
1922 48.26 1937 67.59
1923 50.51 1939 58.71
1924 49.57 1938 42.96
1925 43.93 1940 55.77
1926 46.77 1941 41.31
1927 59.12 1942 58.83
1928 54.49 1943 48.21
1929 47.38 1944 44.67
1930 40.78 1945 67.72
1931 45.05 1946 43.11
1932 50.37
Observed rainfall intensity ranges from 39.91 in. to 67.72 in.
Using a uniform interval of 4 in. between 38 in. and 70 in. gives 8 intervals.

Interval (in.)   Freq.   Cum. freq.   Rel. freq.   Cum. rel. freq.
38 - 42          3       3            0.1034       0.1034
42 - 46          7       10           0.2415       0.3449
46 - 50          5       15           0.1724       0.5173
50 - 54          5       20           0.1724       0.6897
54 - 58          3       23           0.1034       0.7931
58 - 62          3       26           0.1034       0.8965
62 - 66          1       27           0.0345       0.9310
66 - 70          2       29           0.0690       1.0000

$\sum f_j = 29$, $\sum \tilde{f}_j = 1.0000$
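
The interval counts can be reproduced by binning the 29 rainfall observations. This is a minimal sketch in plain Python; the variable names (`lower`, `width`, `n_bins`) are chosen here for illustration, and the binning rule assumes that no observation falls exactly on an interval boundary, which holds for this data set.

```python
# Annual rainfall intensity (in.), 1918-1946, copied from the rainfall table.
rainfall = [
    43.30, 53.02, 63.52, 45.93, 48.26, 50.51, 49.57, 43.93, 46.77, 59.12,
    54.49, 47.38, 40.78, 45.05, 50.37, 54.91, 51.28, 39.91, 53.29, 67.59,
    58.71, 42.96, 55.77, 41.31, 58.83, 48.21, 44.67, 67.72, 43.11,
]

lower, width, n_bins = 38.0, 4.0, 8  # 8 uniform intervals from 38 in. to 70 in.

# Absolute frequency of each interval.
counts = [0] * n_bins
for x in rainfall:
    counts[int((x - lower) // width)] += 1

n = len(rainfall)
cumulative = 0
for k, f in enumerate(counts):
    cumulative += f
    start = lower + k * width
    print(f"{start:.0f} - {start + width:.0f}: freq = {f}, cum = {cumulative}, "
          f"rel = {f / n:.4f}, cum rel = {cumulative / n:.4f}")
```

The printed relative frequencies may differ from the tabulated ones in the last decimal place because of rounding.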

Figure 1.1 Histogram of rainfall intensity (Esopus Creek Watershed, N.Y., 1918-1946):
(a) in number of observations; (b) in fraction of total observations;
(c) frequency diagram of rainfall intensity.
Horizontal axis: annual rainfall intensity, in.

Example: A man kept count of the number of letters he
received every day over a period of 100 days.
The observations are:

0 2 1 1 1 2 0 0 1 0 1 1 0 0 0 3 1 2 0 1
1 0 0 1 0 1 1 0 2 0 0 0 1 0 1 0 2 1 2 0
0 2 0 1 0 1 0 1 0 3 1 2 0 0 0 0 1 0 0 0
1 0 1 0 1 0 2 0 1 2 1 2 0 1 0 2 2 1 0 1
0 0 0 0 5 0 1 1 2 0 0 2 1 0 2 0 0 2 1 0
5 distinct values are observed: 0, 1, 2, 3, 5. The value 4 never occurs and
is tabulated with zero frequency.

Value   Freq.   Cum. freq.   Rel. freq.      Cum. rel. freq.
$x_j$   $f_j$   $\sum f_j$   $\tilde{f}_j$   $F_j$
0       48      48           0.48            0.48
1       32      80           0.32            0.80
2       17      97           0.17            0.97
3       2       99           0.02            0.99
4       0       99           0.00            0.99
5       1       100          0.01            1.00

$\sum f_j = 100$, $\sum \tilde{f}_j = 1.00$

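With r = 6 tabulated values,

$$\bar{x} = \frac{\sum_{j=1}^{6} f_j x_j}{\sum_{j=1}^{6} f_j} = \frac{0 \times 48 + 1 \times 32 + 2 \times 17 + 3 \times 2 + 4 \times 0 + 5 \times 1}{100} = 0.77$$

$$\bar{x} = \sum_{j=1}^{6} \tilde{f}_j x_j = 0 \times 0.48 + 1 \times 0.32 + 2 \times 0.17 + 3 \times 0.02 + 4 \times 0.00 + 5 \times 0.01 = 0.77$$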

The sum of $\tilde{f}_i$ over all $i$ equals 1:

$$\sum_i \tilde{f}_i = 1$$

Relative frequencies are used to define the frequency function of the
distribution:

$$\tilde{f}(x_i) = \tilde{f}_i$$

Each value of x has a corresponding relative frequency described by the
function $\tilde{f}(x)$.

For each $x = x_j$, the function $\tilde{f}(x)$ equals the corresponding relative
freq., and it equals 0 for every x not in the sample.

Formula:

$$\tilde{f}(x) = \begin{cases} \tilde{f}_j, & x = x_j,\ j = 1, \dots, n \\ 0, & \text{otherwise} \end{cases}$$

- Frequency function of the sample
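
A frequency function of this kind maps each observed value to its relative frequency and everything else to zero. A minimal sketch, using the relative frequencies of the letters example and a hypothetical function name `freq_function`:

```python
# Relative frequencies from the letters example (value -> relative frequency).
rel_freq = {0: 0.48, 1: 0.32, 2: 0.17, 3: 0.02, 5: 0.01}


def freq_function(x):
    """Relative frequency at x if x is in the sample, otherwise 0."""
    return rel_freq.get(x, 0.0)


print(freq_function(1))    # an observed value
print(freq_function(4))    # a value not in the sample
```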

e.g.
$\tilde{f}(0) = 0.48$, $\tilde{f}(1) = 0.32$, $\tilde{f}(2) = 0.17$, $\tilde{f}(3) = 0.02$, $\tilde{f}(5) = 0.01$.

[Figure: bar chart of $\tilde{f}(x)$ against x = 0, 1, 2, 3, 4, 5]

(3) Cumulative frequency: for a certain value $x = x_i$,
the sum of the frequencies corresponding to all x less than or equal to $x_i$.

(4) Relative cumulative frequency: the sum of the corresponding relative
frequencies.

A function $F(x)$ representing the relative cumulative frequency is the
distribution function:

$$F(x_i) = \sum_{j \le i} \tilde{f}_j$$

1.3 NUMERICAL REPRESENTATION

(1) Measure of Central Tendency:
a value typical or representative of a set of data.

(2) Measure of Variability:
an indication of the degree to which a set of data is dispersed.

Numerical representation
(i) Parameters: Numerical characteristics of a population.

e.g. population mean, population variance of ultimate strength.

Denoted by Greek letters, e.g. $\mu$, $\sigma^2$.

(ii) Statistics: numerical characteristics from a sample.

Denoted by English letters, e.g. $\bar{x}$, $s^2$.

A population parameter is a single fixed number characteristic of the
population (usually unknown), whereas a sample statistic is a number
that can change from sample to sample.

Measure of Central Tendency

Gives a value typical or representative of a set of data.

The most common:
the arithmetic mean, usually referred to as the mean.

Consider a sample of size n:
$x_1, x_2, x_3, \dots, x_n$.

(1) Mean $\bar{x}$

Sample mean:

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n} \sum_{j=1}^{n} x_j$$

The true mean of a population, $\mu$, can be computed only for a finite
population when all of its members are observed.

Normally, the sample mean $\bar{x}$ is an estimate for $\mu$.

If a sample contains only r distinct values $x_1, x_2, x_3, \dots, x_r$ with
absolute frequencies $f_1, f_2, f_3, \dots, f_r$, respectively, then

$$\bar{x} = \frac{\sum_{j=1}^{r} f_j x_j}{\sum_{j=1}^{r} f_j} = \frac{1}{n} \sum_{j=1}^{r} f_j x_j = \sum_{j=1}^{r} \tilde{f}_j x_j$$
(2) Mode $x_m$

The sample value $x_m$ that occurs most frequently.
In a frequency diagram, it is the sample value corresponding to the
peak.

A sample may have only one mode (unimodal) or more than one mode,
e.g. two modes (bimodal).

(3) Median $\tilde{x}$

When the sample values are arranged in order of magnitude, $\tilde{x}$ is the
middle value (or the average of the two middle values for even n).

At least 50% of the values are ≤ $\tilde{x}$ and at least 50% of the values are ≥ $\tilde{x}$.

On a cumulative relative frequency diagram, $\tilde{x}$ is the value at which
$F(\tilde{x}) = 0.5$.

[Figure: positions of the mean $\bar{x}$, median $\tilde{x}$ and mode $x_m$ on frequency diagrams]

Example 1.1
Observations of vehicle speeds in km/h:
60, 50, 70, 70, 50, 80, 60, 80, 70, 50, 70, 60

n = 12, $\sum v = 770$, $\sum v^2 = 50{,}700$

Rearranged in ascending order:
50, 50, 50, 60, 60, 60, 70, 70, 70, 70, 80, 80

(a) Mean $\bar{x} = \frac{\sum v}{n} = \frac{770}{12} = 64.17$ km/h

(b) Mode $x_m = 70$ km/h
(occurring most often, f = 4)

(c) Median $\tilde{x} = \frac{60 + 70}{2} = 65$ km/h
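
The three measures in Example 1.1 can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

# Vehicle speeds in km/h, copied from Example 1.1.
speeds = [60, 50, 70, 70, 50, 80, 60, 80, 70, 50, 70, 60]

mean_speed = statistics.mean(speeds)      # sum of speeds divided by n
mode_speed = statistics.mode(speeds)      # most frequently occurring value
median_speed = statistics.median(speeds)  # average of the two middle values (n is even)

print(f"mean = {mean_speed:.2f} km/h")
print(f"mode = {mode_speed} km/h")
print(f"median = {median_speed} km/h")
```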

Measure of Variability
A measure of variability (or dispersion): an indication of the degree to
which a set of data is dispersed.

Common measures:
range, variance, standard deviation, and coefficient of variation (C.O.V.).

(1) Range:
Difference between the largest and the smallest values of a sample
(or a population).
(2) Variance:
For a finite population of n observations with population mean $\mu$,
the population variance is

$$\sigma^2 = \frac{1}{n} \sum_{j=1}^{n} (x_j - \mu)^2$$

where $(x_j - \mu)$ is the deviation of $x_j$ from $\mu$. The deviations are
squared so that negative and positive deviations do not cancel.

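For a sample, the sample variance is

$$s^2 = \frac{1}{n - 1} \sum_{j=1}^{n} (x_j - \bar{x})^2$$

The deviations are measured from $\bar{x}$ instead of $\mu$ (unknown).

For convenience (sums over j = 1 to n),

$$s^2 = \frac{1}{n - 1} \left[ \sum x_j^2 - n \bar{x}^2 \right]$$

or

$$s^2 = \frac{1}{n - 1} \left[ \sum x_j^2 - \frac{1}{n} \Big( \sum x_j \Big)^2 \right]$$

* the unit of variance is the square of that of the mean.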
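
The convenient forms follow from expanding the square and using $\sum x_j = n \bar{x}$. A short derivation sketch, with all sums taken over j = 1 to n:

```latex
\begin{aligned}
\sum (x_j - \bar{x})^2
  &= \sum x_j^2 - 2\bar{x} \sum x_j + n\bar{x}^2 \\
  &= \sum x_j^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum x_j^2 - n\bar{x}^2 \\
  &= \sum x_j^2 - \frac{1}{n} \Big( \sum x_j \Big)^2
\end{aligned}
```

Dividing either of the last two lines by n - 1 gives the corresponding expression for $s^2$.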

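(3) Standard deviation: the non-negative square root of the variance.
Sample standard deviation:

$$s = \sqrt{\frac{1}{n - 1} \sum_{j=1}^{n} (x_j - \bar{x})^2}$$

(4) Coefficient of Variation (C.O.V.): a measure of dispersion relative to
the mean:

$$\text{C.O.V.} = \frac{\sigma}{\mu} \quad \text{(dimensionless)}$$

e.g. $\mu_1 = 1000$ mm, $\sigma_1 = 50$ mm;
$\mu_2 = 1$ m, $\sigma_2 = 0.05$ m

$$\text{C.O.V.} = \frac{\sigma_1}{\mu_1} = \frac{50}{1000} = \frac{\sigma_2}{\mu_2} = \frac{0.05}{1} = 0.05$$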

Example 1.2
Tests on a sample of 20 concrete specimens gave the
following results (in N/mm²):

38.6 36.5 27.6 30.4 37.9 39.3 41.4 38.6 49.0 32.4
37.9 40.7 44.1 40.0 46.2 37.2 34.5 40.0 42.8 38.6

n = 20, $\sum x = 773.8$, $\sum x^2 = 30419.2$

(a) Range = 49.0 - 27.6 = 21.4 N/mm²

(b) $\bar{x} = \frac{773.8}{20} = 38.69$ N/mm²

$\sum (x - \bar{x})^2 = 480.878$, n - 1 = 19

$$s^2 = \frac{1}{19}(480.878) = 25.309$$

Alternatively,

$$s^2 = \frac{1}{19} \left[ 30419.2 - 20(38.69)^2 \right] = 25.309$$

or

$$s^2 = \frac{1}{19} \left[ 30419.2 - \frac{1}{20}(773.8)^2 \right] = 25.309$$

$s = 5.03$ N/mm²,

$$\text{C.O.V.} = \frac{s}{\bar{x}} = \frac{5.03}{38.69} = 0.13$$
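
The arithmetic in part (b) can be reproduced from the three totals quoted in Example 1.2. The sketch below deliberately starts from those totals (n, the sum of x and the sum of x squared) rather than recomputing them from the raw readings, and applies the shortcut formula for the sample variance; the variable names are illustrative.

```python
import math

# Totals as quoted in Example 1.2 (assumed correct, not recomputed here).
n = 20
sum_x = 773.8       # sum of the readings, N/mm^2
sum_x2 = 30419.2    # sum of the squared readings

mean = sum_x / n
# Shortcut formula: s^2 = [sum(x^2) - (sum x)^2 / n] / (n - 1)
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)
cov = std_dev / mean  # coefficient of variation, dimensionless

print(f"mean = {mean:.2f} N/mm^2")
print(f"s^2 = {variance:.3f}")
print(f"s = {std_dev:.2f} N/mm^2")
print(f"C.O.V. = {cov:.2f}")
```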

Descriptive Statistics (Summary)
Population, Sample, Random Sample

1 Graphical Representation:
Frequency, Relative freq., Cumulative freq.,
Relative cumulative freq.
Histogram.

2 Numerical Representation:
Parameter, Statistic.

(i) Measure of Central Tendency
Mean, Mode, Median

(ii) Measure of Variability
Range, Variance, Standard Deviation,
Coefficient of Variation (C.O.V.).
