DATA ANALYSIS
FACULTY OF ENGINEERING-SEMESTER5
TEACHER: ELSY WEHBE
2017-2018
Chapter 4: Normal distribution
Learning Objectives
Continuous variables can potentially take on any value, depending only on the ability to
measure precisely and accurately
Normal Distribution - Introduction
Most important
Gaussian Distribution
Bell-shaped curve
Continuous
Symmetric around the mean µ; all the averages
(mean, mode and median) coincide
The Normal Distribution
‘Bell Shaped’
Symmetrical
Mean, Median and Mode are equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell-shaped density curve f(X) against X, centered at μ, with spread σ and Mean = Median = Mode marked]
Normal Distribution - Areas
◦ Physical measurements (heights and weights)
◦ Meteorological experiments
◦ Rainfall studies
◦ Measurements of manufacturing parts
◦ Errors in scientific measurements
When the underlying distribution is discrete, the normal distribution often gives an excellent approximation.
When individual variables are not normally distributed, sums and averages of the variables
(under suitable conditions) have approximately normal distributions (Central Limit Theorem).
Normal Distribution - Definition
A continuous r.v. X is said to have a normal distribution with parameters µ and σ (or µ and σ²), where
−∞ < µ < ∞ and
σ > 0,
if the probability density function of X is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty$$
Symbolically, X ~ N(µ, σ²)
The Normal Distribution
Density Function
$$f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$$
Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
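The density formula can be evaluated directly. The following is a minimal sketch, not part of the original slides; the function name `normal_pdf` is an illustrative choice.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density f(x) of N(mu, sigma^2), written straight from the formula."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Height of the curve at the mean for mu = 0, sigma = 1
peak = normal_pdf(0.0, 0.0, 1.0)
print(round(peak, 4))  # 0.3989
```

Note that the value returned is a height of the curve, not a probability; probabilities come from areas under the curve.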
Normal Distribution - Properties
1. The curve extends indefinitely to the left and to the right,
approaching the x-axis as |x| increases, i.e. as
x → ±∞, f(x) → 0.
2. The mode occurs at x = μ.
3. The curve is symmetric about a vertical axis through the mean μ.
4. The total area under the curve and above the horizontal axis is
equal to 1, i.e.
$$\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx = 1$$
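These properties can be checked numerically. The sketch below is an illustration under assumptions: the parameter values (18.0 and 5.0) and the helper names are chosen for the example, and the trapezoidal rule stands in for the exact integral.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def trapezoid_area(f, a, b, n=20000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

mu, sigma = 18.0, 5.0  # illustrative values

def f(x):
    return normal_pdf(x, mu, sigma)

# Property 4: total area is 1 (the tails beyond mu +/- 10 sigma are negligible)
area = trapezoid_area(f, mu - 10 * sigma, mu + 10 * sigma)
# Property 3: symmetry about the mean
symmetric = math.isclose(f(mu - 3.7), f(mu + 3.7))
# Property 2: the curve is highest at the mean
mode_at_mean = f(mu) > f(mu - 0.1) and f(mu) > f(mu + 0.1)
# Property 1: the curve approaches the x-axis far from the mean
tail_height = f(mu + 10 * sigma)

print(round(area, 6), symmetric, mode_at_mean, tail_height)
```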
Many Normal Distributions
[Figure: several normal curves plotted against X, each centered at its own mean μ]
Normal Distribution - Graphs
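The standard normal random variable Z ~ N(0, 1) has density
$$f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}, \quad -\infty < z < \infty$$
The cumulative distribution function of Z is denoted by
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(y; 0, 1)\, dy$$
[Figure: standard normal curve over the Z axis, centered at 0]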
Values above the mean have positive Z-values, values below the mean have
negative Z-values
The Standardized Normal
$$Z = \frac{X - \mu}{\sigma}$$
The Z distribution always has mean = 0 and standard deviation = 1
Z-Score
Each data value can be converted to a z-score using the formula for
standardization:
$$z = \frac{x - \mu}{\sigma}$$
Think of z as the distance from the mean, measured in standard deviations.
Each data value can be located on the x axis of the density curve.
Z-Score
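The mean of Z is 0 and the variance of Z is 1:
$$E(Z) = E\!\left(\frac{X-\mu}{\sigma}\right) = \frac{1}{\sigma}\,E(X-\mu) = \frac{1}{\sigma}\,[E(X) - \mu] = 0$$
$$\mathrm{Var}(Z) = \mathrm{Var}\!\left(\frac{X-\mu}{\sigma}\right) = \frac{1}{\sigma^{2}}\,\mathrm{Var}(X-\mu) = \frac{1}{\sigma^{2}}\,\mathrm{Var}(X) = \frac{\sigma^{2}}{\sigma^{2}} = 1$$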
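The result E(Z) = 0 and Var(Z) = 1 can also be seen empirically. The following is a small simulation sketch, not taken from the slides; the parameters, sample size, and seed are arbitrary assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
mu, sigma = 18.0, 5.0  # illustrative population parameters
xs = [random.gauss(mu, sigma) for _ in range(100_000)]

# Standardize each value with the known population parameters
zs = [(x - mu) / sigma for x in xs]

z_mean = statistics.fmean(zs)
z_var = statistics.pvariance(zs)
print(round(z_mean, 2), round(z_var, 2))
```

With a sample this large, the printed mean should be close to 0 and the printed variance close to 1, though a simulation only approximates the exact values derived above.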
Diagram of the standardizing process
Convert X ~ N(μ, σ²) to Z ~ N(0, 1).
Whenever X is between the values x = x1 and x = x2, Z falls between the
corresponding values z = z1 and z = z2, so P(x1 < X < x2) = P(z1 < Z < z2).
[Figure: normal curve over the x axis mapped to the standard normal curve over the z axis]
Standard Normal Curve Areas
Φ(z) means the area under the curve to the left of z.
For example, Φ(0.24) is the area under the curve to the left of 0.24, which the
standard normal table gives as 0.5948.
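Φ(z) has no closed form, but it can be computed from the error function using the identity Φ(z) = ½[1 + erf(z/√2)]. A minimal sketch, assuming only Python's standard `math` module; the function name `phi` is illustrative.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(phi(0.24), 4))  # 0.5948
```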
Example
If X is distributed normally with mean of $100 and standard deviation
of $50, the Z value for X = $200 is
$$Z = \frac{X - \mu}{\sigma} = \frac{\$200 - \$100}{\$50} = 2.0$$
This says that X = $200 is two standard deviations (2 increments of $50
units) above the mean of $100.
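The same conversion, and its inverse, can be written as two one-line functions. This is a sketch with illustrative function names, not code from the course.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Convert a z-score back to the original units."""
    return mu + z * sigma

z = z_score(200.0, 100.0, 50.0)
print(z)                        # 2.0
print(from_z(z, 100.0, 50.0))   # 200.0
```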
Comparing X and Z units
Probability as Area Under the Curve
[Figure: normal curve over the X axis with the interval from a to b marked]
The total area under the curve is 1.0, and the curve is symmetric, so half is
above the mean, half is below
P(−∞ < X < μ) = 0.5
P(μ < X < ∞) = 0.5
P(−∞ < X < ∞) = 1.0
[Figure: density f(X) against X, with area 0.5 on each side of μ]
The Standardized Normal Table
The Cumulative Standardized Normal table in the textbook
(Appendix table E.2) gives the probability less than a desired value of Z
(i.e., from negative infinity to Z)
Example:
P(Z < 2.00) = 0.9772
[Figure: standard normal curve with area 0.9772 shaded to the left of Z = 2.00]
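A portion of such a cumulative table can be regenerated rather than looked up. The sketch below assumes Python's standard `statistics.NormalDist` (Python 3.8+); the row and column selection and the helper name `table_row` are illustrative and do not reproduce the textbook's table layout exactly.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def table_row(z_tenths, columns=(0.00, 0.01, 0.02)):
    """Cumulative probabilities P(Z < z_tenths + column), rounded to 4 places."""
    return [round(std_normal.cdf(z_tenths + c), 4) for c in columns]

print("Z     .00     .01     .02")
for z_tenths in (0.0, 0.1, 0.2, 2.0):
    cells = "  ".join(f"{p:.4f}" for p in table_row(z_tenths))
    print(f"{z_tenths:.1f}  {cells}")
```

The row for 2.0 under column .00 gives P(Z < 2.00), and the row for 0.1 under column .02 gives P(Z < 0.12).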
Finding Normal Probabilities
Let X represent the time it takes, in seconds, to download an image file from the
internet.
Suppose X is normal with a mean of 18.0 seconds and a standard deviation of 5.0
seconds. Find P(X < 18.6).
$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18.0}{5.0} = 0.12$$
P(X < 18.6) = P(Z < 0.12)
[Figure: left panel, X axis with μ = 18, σ = 5 and the points 18 and 18.6; right panel, Z axis with μ = 0, σ = 1 and the points 0 and 0.12]
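Solution: Finding P(Z < 0.12)
P(X < 18.6) = P(Z < 0.12) = 0.5478
[Figure: normal curve over the X axis with 18.0 and 18.6 marked and the area to the left of 18.6 shaded]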
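The table lookup for P(Z < 0.12) can be cross-checked in software. The sketch below assumes Python's standard `statistics.NormalDist`; the variable names are illustrative. It computes the probability both directly in X units and through the Z transformation.

```python
from statistics import NormalDist

download_time = NormalDist(mu=18.0, sigma=5.0)

# Route 1: evaluate the cumulative probability directly in X units
p_direct = download_time.cdf(18.6)

# Route 2: standardize first, then use the standard normal distribution
z = (18.6 - 18.0) / 5.0
p_via_z = NormalDist().cdf(z)

print(round(p_direct, 4), round(p_via_z, 4))  # 0.5478 0.5478
```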
COPYRIGHT ©2013 PEARSON EDUCATION, INC. PUBLISHING AS PRENTICE HALL
Finding Normal Upper Tail Probabilities
P(X > 18.6) = P(Z > 0.12) = 1.0 − P(Z ≤ 0.12) = 1.0 − 0.5478 = 0.4522
[Figure: standard normal curve with area 0.5478 to the left of Z = 0.12 and area 0.4522 to the right]
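An upper tail area is the complement of the cumulative probability. A short sketch of that step, again assuming the standard `statistics.NormalDist` and illustrative names:

```python
from statistics import NormalDist

download_time = NormalDist(mu=18.0, sigma=5.0)

# P(X > 18.6) = 1 - P(X <= 18.6)
p_upper = 1.0 - download_time.cdf(18.6)
print(round(p_upper, 4))  # 0.4522
```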
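Finding a Normal Probability Between Two Values
Suppose X is normal with mean 18.0 and standard deviation 5.0. Find
P(18 < X < 18.6)
Calculate Z-values:
$$Z = \frac{X - \mu}{\sigma} = \frac{18 - 18}{5} = 0$$
$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18}{5} = 0.12$$
P(18 < X < 18.6) = P(0 < Z < 0.12)
[Figure: X axis with 18 and 18.6 marked; Z axis with 0 and 0.12 marked]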
Solution: Finding P(0 < Z < 0.12)
Using the Standardized Normal Probability Table (portion, columns .00, .01, .02):
P(18 < X < 18.6)
= P(0 < Z < 0.12)
= P(Z < 0.12) − P(Z ≤ 0)
= 0.5478 − 0.5000 = 0.0478
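The subtraction of two cumulative probabilities generalizes to any interval. The helper below is a sketch; `prob_between` is a hypothetical name, and the code relies on the standard `statistics.NormalDist`.

```python
from statistics import NormalDist

def prob_between(dist, a, b):
    """P(a < X < b), computed as the difference of two cumulative probabilities."""
    return dist.cdf(b) - dist.cdf(a)

download_time = NormalDist(mu=18.0, sigma=5.0)
p_interval = prob_between(download_time, 18.0, 18.6)
print(round(p_interval, 4))  # 0.0478
```

Because the distribution is continuous, using < or ≤ at the endpoints gives the same probability.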
Probabilities in the Lower Tail
[Figure: normal curve over the X axis with 17.4 and 18.0 marked; an area of 0.4522 is labeled]
Finding the X value for a Known Probability
Convert a Z value to X units with
$$X = \mu + Z\sigma$$
Example:
Let X represent the time it takes (in seconds) to download an
image file from the internet.
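Suppose X is normal with mean 18.0 and standard deviation 5.0.
Find X such that 20% of download times are less than X.
[Figure: lower-tail area 0.2000 to the left of the unknown X (below 18.0), equivalently to the left of the unknown Z (below 0)]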
Find the Z-value for 20% in the Lower Tail
A lower-tail area of 0.2000 corresponds to Z = −0.84 in the cumulative table.
$$X = \mu + Z\sigma = 18.0 + (-0.84)(5.0) = 13.8$$
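Working backward from a probability to an X value uses the inverse of the cumulative distribution. A minimal sketch assuming the standard `statistics.NormalDist` and its `inv_cdf` method; variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 18.0, 5.0

# Step 1: Z value with 20% of the area to its left
z = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma
x = mu + z * sigma

print(round(z, 2), round(x, 1))  # -0.84 13.8
```

The software value of Z carries more decimals than the table value of −0.84, so the unrounded X differs slightly from the hand calculation.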
Using Excel With The Normal Distribution
Finding Normal Probabilities
Evaluating Normality
The Quantile-Quantile Normal Probability Plot Interpretation
A quantile-quantile normal probability plot for data from a normal distribution
will be approximately linear.
[Figure: quantile-quantile plot of X (30 to 90) against Z (−2 to 2)]
Quantile-Quantile Normal Probability Plot Interpretation (continued)
[Figure: three quantile-quantile plots of X (30 to 90) against Z (−2 to 2), for left-skewed, right-skewed, and rectangular data]
Nonlinear plots indicate a deviation from normality.
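The points of a quantile-quantile plot can be computed by pairing each ordered data value with a standard normal quantile. The sketch below is one possible construction, not necessarily the one Excel uses: it assumes cumulative area i/(n + 1) for the i-th smallest value, the data set is hypothetical, and `qq_points` is an illustrative name.

```python
from statistics import NormalDist

def qq_points(data):
    """Pair each ordered value with the standard normal quantile for its rank.

    Uses cumulative area i / (n + 1) for the i-th smallest value; other
    plotting-position conventions exist.
    """
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf(i / (n + 1)), x)
            for i, x in enumerate(ordered, start=1)]

sample = [41.0, 55.5, 58.0, 60.0, 62.5, 65.0, 79.0]  # hypothetical data
points = qq_points(sample)
for z, x in points:
    print(f"{z:6.2f}  {x:5.1f}")
```

Plotting x against z for these pairs and judging whether the points fall near a straight line is the visual check described above.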
Normal Probability Plots In Excel & Minitab
In Excel, normal probability plots are quantile-quantile normal probability plots, and the
interpretation is as discussed.
The Minitab normal probability plot is different, and the interpretation differs slightly.
As with the Excel normal probability plot, a linear pattern in the Minitab normal probability plot
indicates a normal distribution.
Normal Probability Plots In Minitab
In Minitab the variable on the x-axis is the variable under study.
The variable on the y-axis is the cumulative probability from a normal distribution.
For a variable with a distribution that is skewed to the right the plotted points will rise quickly at
the beginning and then level off.
For a variable with a distribution that is skewed to the left the plotted points will rise more
slowly at first and rise more rapidly at the end.
Evaluating Normality (continued)
An Example: Bond Funds Returns
Descriptive Statistics
• The mean (7.1641) is greater than the median (6.4).
(In a normal distribution the mean and median are equal.)
• The interquartile range of 7.4 is approximately 1.21
standard deviations. (In a normal distribution the
interquartile range is approximately 1.33 standard deviations.)
• The range of 40.8 is equal to 6.70 standard
deviations. (In a normal distribution the range is approximately 6
standard deviations.)
• 73.91% of the observations are within 1 standard
deviation of the mean. (In a normal distribution this
percentage is 68.26%.)
• 85.33% of the observations are within 1.28 standard
deviations of the mean. (In a normal distribution
this percentage is 80%.)
• 96.20% of the returns are within 2 standard
deviations of the mean. (In a normal distribution,
95.44% of the values lie within 2 standard deviations
of the mean.)
• The skewness statistic is 0.9085 and the kurtosis
statistic (excess kurtosis) is 2.456. (In a normal distribution each of
these statistics equals zero.)
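The comparisons in this checklist can be computed for any data set. The sketch below is an illustration only: the data are hypothetical (not the bond funds returns), the function name is invented, and the quartile rule of Python's `statistics.quantiles` may differ from the one Excel applies.

```python
import statistics

def normality_checks(data):
    """Summary figures to compare against the normal-distribution benchmarks."""
    n = len(data)
    mean = statistics.fmean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)  # sample standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile rule may differ from Excel's

    def share_within(k):
        # Fraction of observations within k standard deviations of the mean
        return sum(abs(x - mean) <= k * s for x in data) / n

    return {
        "mean_minus_median": mean - median,          # 0 for a normal distribution
        "iqr_in_sd": (q3 - q1) / s,                  # about 1.33
        "range_in_sd": (max(data) - min(data)) / s,  # about 6
        "within_1_sd": share_within(1.0),            # about 0.6826
        "within_1.28_sd": share_within(1.28),        # about 0.80
        "within_2_sd": share_within(2.0),            # about 0.9544
    }

returns = [2.1, 3.4, 4.0, 5.2, 5.9, 6.4, 6.8, 7.5, 9.1, 12.3, 18.6]  # hypothetical
checks = normality_checks(returns)
for name, value in checks.items():
    print(f"{name}: {value:.3f}")
```

Large departures from the benchmark values in the comments suggest the data are not well described by a normal distribution; with a sample as small as this one the comparison is only indicative.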
Quantile-Quantile Normal Probability Plot From Excel
[Figure: quantile-quantile normal probability plot of the bond funds returns, produced in Excel]