
CHAPTER 4

THE NORMAL DISTRIBUTION

QMT554 DATA ANALYSIS 1


Introduction
 A normal distribution is also known as the bell
curve or the Gaussian distribution.
 Special continuous distribution and widely
used.
 A large number of phenomena in the real
world are normally distributed either exactly or
approximately.
 Example: _________________________



Introduction

 Most data are normally or approximately normally distributed:
$X \sim N(\mu, \sigma^2)$
 Equation of the normal distribution:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad \text{where } -\infty < \mu < \infty,\ \sigma > 0$$



Shape of normal curve



Features of Normal distribution
 The shape and position of the normal
distribution curve depend on two parameters,
the mean and the standard deviation.
 Each normally distributed variable has its own
normal distribution curve, which depends on
the values of the variable’s mean and standard
deviation.
 The normal distribution curve is bell-shaped.
 The total area under the curve is 1.0



Features of Normal distribution
 The curve is symmetric about the mean.
Consequently, ½ of the area under a normal
distribution curve lies on the left side of the
mean and ½ lies on the right side of the mean.
 The mean, median, and mode are equal and
located at the center of the distribution.
 The normal distribution curve is unimodal.
 The two tails of the curve extend indefinitely,
without touching or crossing the horizontal axis.
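A few of these features can be verified numerically. This is a minimal sketch using `statistics.NormalDist`; the parameters μ = 50 and σ = 10 are hypothetical.

```python
from statistics import NormalDist

X = NormalDist(mu=50, sigma=10)  # hypothetical parameters for illustration

left_half = X.cdf(X.mean)        # area to the left of the mean
right_half = 1 - X.cdf(X.mean)   # area to the right of the mean
# Symmetry: the density is equal at the same distance either side of the mean.
symmetric = abs(X.pdf(50 - 7) - X.pdf(50 + 7)) < 1e-15
# The tails never touch the axis, but nearly all the area lies within 4 sigma.
within_4_sigma = X.cdf(90) - X.cdf(10)
print(left_half, right_half, symmetric, round(within_4_sigma, 5))
```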



Features of Normal distribution



Three normal distribution curves with the
same mean but different standard deviations.

[Figure: three normal curves centered at μ = 50 with σ = 5, σ = 10, and σ = 16; horizontal axis x.]
Three normal distribution curves with
different means but the same standard
deviation.

[Figure: three normal curves with σ = 5, centered at μ = 20, μ = 30, and μ = 40; horizontal axis x.]
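The effect of each parameter can be seen by computing the height of the curve at its mean, using the parameter values from the two sets of curves described above. This is a sketch with `statistics.NormalDist`, not a reproduction of the original figures.

```python
from statistics import NormalDist

# Same mean (50), different standard deviations.
peaks_by_sigma = {s: NormalDist(50, s).pdf(50) for s in (5, 10, 16)}
# Different means, same standard deviation (5).
peaks_by_mu = {m: NormalDist(m, 5).pdf(m) for m in (20, 30, 40)}

for s, peak in peaks_by_sigma.items():
    print(f"mu=50, sigma={s}: peak height {peak:.4f}")
for m, peak in peaks_by_mu.items():
    print(f"mu={m}, sigma=5: peak height {peak:.4f}")
```

A larger σ gives a lower, wider curve. Changing μ only shifts the curve along the x axis and leaves its height unchanged.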



Standard Normal Distribution
 Since each normally distributed variable has its own
mean and standard deviation, the shape and location of
these curves will vary.
 Probability calculations with normal distributions:
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
where
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad \text{for } -\infty < \mu < \infty,\ \sigma > 0$$
 Difficult to integrate. Therefore, to simplify this,
statisticians use the Standard Normal Distribution Table.
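To illustrate why a table (or software) is used, the sketch below approximates the integral with the trapezoidal rule and compares it with the cumulative distribution function from `statistics.NormalDist`. The values of μ, σ, a and b, and the helper names, are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def f(x, mu, sigma):
    """Normal density f(x) for N(mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def integrate(a, b, mu, sigma, steps=10_000):
    """Approximate the integral of f from a to b with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a, mu, sigma) + f(b, mu, sigma))
    for i in range(1, steps):
        total += f(a + i * h, mu, sigma)
    return total * h

mu, sigma, a, b = 0, 1, -1, 1  # illustrative values only
area_numeric = integrate(a, b, mu, sigma)
area_cdf = NormalDist(mu, sigma).cdf(b) - NormalDist(mu, sigma).cdf(a)
print(round(area_numeric, 4), round(area_cdf, 4))
```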



Standard Normal Distribution
The standard normal distribution is a normal
distribution with a mean of 0 and a standard
deviation of 1: $Z \sim N(\mu, \sigma^2) = N(0, 1)$

[Figure: standard normal curve; horizontal axis marked from -3 to 3.]



z Values

 The z value is the number of standard
deviations that a particular X value is
away from the mean. The formula for
finding the z value is:
$$Z = \frac{X - \mu}{\sigma}, \quad \text{where } Z \sim N(0, 1)$$
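As a small illustration of the z value formula, the sketch below standardizes a single observation. The numbers (X = 130, mean 125, standard deviation 4) and the function name `z_value` are hypothetical.

```python
def z_value(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical values: X = 130 from a distribution with mean 125 and sd 4.
z = z_value(130, 125, 4)
print(z)  # 1.25
```

A positive z value means the observation lies above the mean; a negative one means it lies below.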



Finding areas under the standard
normal distribution curve
 The standard normal distribution table lists the
areas under the standard normal curve to the
left of z-values from –3.49 to 3.49.
 Although the z-values on the left side of the
mean are negative, the area under the curve is
always positive.



 Find the area under the standard normal
curve to the left of z = 1.95, that is, P(Z < 1.95)
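The table lookup for z = 1.95 can be cross-checked in software. A minimal sketch, assuming Python's `statistics.NormalDist` in place of the printed table:

```python
from statistics import NormalDist

Z = NormalDist()            # standard normal: mean 0, standard deviation 1
area_left = Z.cdf(1.95)     # area to the left of z = 1.95
print(round(area_left, 4))  # 0.9744
```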



Example 1

 Referring to the standard normal table, find
the following probabilities:
a) P(Z>1.54)
b) P(Z>0.87)
c) P(Z<0.65)
d) P(Z<-0.72)
e) P(Z>-0.95)
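The answers to Example 1 can be checked against software after working them from the table. The sketch below uses `statistics.NormalDist`. Right-tail areas are obtained as 1 minus the left-tail area, which is the same step used with the table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal
answers = {
    "a) P(Z > 1.54)": 1 - Z.cdf(1.54),
    "b) P(Z > 0.87)": 1 - Z.cdf(0.87),
    "c) P(Z < 0.65)": Z.cdf(0.65),
    "d) P(Z < -0.72)": Z.cdf(-0.72),
    "e) P(Z > -0.95)": 1 - Z.cdf(-0.95),
}
for label, p in answers.items():
    print(f"{label} = {p:.4f}")
```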



Example 2

 Find the following probabilities:


a) P(1.22<Z<2.40)
b) P(-0.75<Z<1.29)
c) P(0<Z<0.4)
d) P(-1.77<Z<-0.88)
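For an interval, the area is the difference between two left-tail areas: P(a < Z < b) = P(Z < b) - P(Z < a). The sketch below applies this to Example 2. The helper name `between` is an assumption for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def between(lo, hi):
    """P(lo < Z < hi) as a difference of two left-tail areas."""
    return Z.cdf(hi) - Z.cdf(lo)

p_a = between(1.22, 2.40)
p_b = between(-0.75, 1.29)
p_c = between(0, 0.4)
p_d = between(-1.77, -0.88)
for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"{label}) {p:.4f}")
```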



Example 3

 If the random variable X is normally
distributed with mean 125 and variance
16, find the following probabilities:
a) P(X>130)
b) P(X<128)
c) P(120<X<128)
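Example 3 gives the variance, so the standard deviation is its square root (σ = 4) before standardizing. The sketch below assumes part (c) asks for P(120 < X < 128); for a continuous variable, strict and non-strict inequalities give the same area.

```python
import math
from statistics import NormalDist

variance = 16
X = NormalDist(mu=125, sigma=math.sqrt(variance))  # sigma = 4

p_a = 1 - X.cdf(130)           # P(X > 130), z = 1.25
p_b = X.cdf(128)               # P(X < 128), z = 0.75
p_c = X.cdf(128) - X.cdf(120)  # P(120 < X < 128), z from -1.25 to 0.75
print(f"a) {p_a:.4f}  b) {p_b:.4f}  c) {p_c:.4f}")
```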



Example 4

Let x be a continuous random variable
that has a normal distribution with a
mean of 80 and a standard deviation of
12. Find the area under the normal
distribution curve
a) from x = 70 to x = 135
b) to the left of 27
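In Example 4 some z values fall outside the range of the table (beyond ±3.49), where the tail area is effectively 0. A sketch using `statistics.NormalDist` shows how small those areas are:

```python
from statistics import NormalDist

X = NormalDist(mu=80, sigma=12)

z_70 = (70 - 80) / 12    # about -0.83
z_135 = (135 - 80) / 12  # about 4.58, beyond the table
z_27 = (27 - 80) / 12    # about -4.42, beyond the table

area_a = X.cdf(135) - X.cdf(70)  # area from x = 70 to x = 135
area_b = X.cdf(27)               # area to the left of x = 27
print(f"a) {area_a:.4f}  b) {area_b:.6f}")
```

Software uses the unrounded z values, so the answer to (a) may differ from a table-based answer in the third or fourth decimal place.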



APPLICATIONS OF THE
NORMAL DISTRIBUTION



Example 5

Suppose the prices of all three-year-old
Porsche 911 sports cars have a normal
distribution with a mean price of $48,125
and a standard deviation of $1600. Find
the probability that a randomly selected
three-year-old Porsche 911 will sell for a
price between $46,000 and $49,000.
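One way to check Example 5 is to standardize both prices and take the difference of the left-tail areas. This is a sketch with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

price = NormalDist(mu=48_125, sigma=1_600)

z_low = (46_000 - 48_125) / 1_600   # about -1.33
z_high = (49_000 - 48_125) / 1_600  # about 0.55
p_between = price.cdf(49_000) - price.cdf(46_000)
print(f"z from {z_low:.2f} to {z_high:.2f}: P = {p_between:.4f}")
```

With z rounded to two decimals for the table, the answer comes out close to 0.617. The unrounded calculation gives a slightly smaller value.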



Example 6

A racing car is one of the many toys
manufactured by Mark Corporation. The
assembly times for this toy follow a normal
distribution with a mean of 55 minutes and a
standard deviation of 4 minutes. The company
closes at 5 P.M. every day. If one worker starts to
assemble a racing car at 4 P.M., what is the
probability that she will finish this job before the
company closes for the day?
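In Example 6 the worker has 60 minutes available, so the event "finishes before closing" is X < 60. A minimal sketch with `statistics.NormalDist`:

```python
from statistics import NormalDist

assembly_time = NormalDist(mu=55, sigma=4)  # minutes

minutes_available = 60                # from 4 P.M. to 5 P.M.
z = (minutes_available - 55) / 4      # 1.25
p_finish = assembly_time.cdf(minutes_available)
print(f"z = {z}, P(X < 60) = {p_finish:.4f}")
```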



Example 7

 Hupper Corporation produces many types of soft
drinks, including Orange Cola. The filling
machines are adjusted to pour 12 ounces of soda
into each 12-ounce can of Orange Cola. However,
the actual amount of soda poured into each can is
not exactly 12 ounces; it varies from can to can. It
has been observed that the net amount of soda in
such a can has a normal distribution with a mean
of 12 ounces and a standard deviation of 0.015
ounce.



a) What is the probability that a randomly
selected can of Orange Cola contains
11.97 to 11.99 ounces of soda?
b) What percentage of the Orange Cola
cans contain 12.02 to 12.07 ounces of soda?
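Both parts of Example 7 are interval probabilities for a normal variable with mean 12 and standard deviation 0.015. The sketch below uses `statistics.NormalDist`, and part (b) is converted to a percentage.

```python
from statistics import NormalDist

fill = NormalDist(mu=12, sigma=0.015)  # ounces

p_a = fill.cdf(11.99) - fill.cdf(11.97)  # z from -2.00 to about -0.67
p_b = fill.cdf(12.07) - fill.cdf(12.02)  # z from about 1.33 to about 4.67
print(f"a) P = {p_a:.4f}")
print(f"b) {100 * p_b:.2f}% of cans")
```

The upper limit in (b) is more than 4 standard deviations above the mean, so the area to its left is practically 1.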



Example 8

The life span of a calculator manufactured by
Texas Instruments has a normal distribution with
a mean of 54 months and a standard deviation of
8 months. The company guarantees that any
calculator that starts malfunctioning within 36
months of the purchase will be replaced by a new
one. About what percentage of calculators made
by this company are expected to be replaced?
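In Example 8 a calculator is replaced when its life span is less than 36 months, so the required area is the left tail below x = 36. A sketch with `statistics.NormalDist`:

```python
from statistics import NormalDist

life = NormalDist(mu=54, sigma=8)  # months

z = (36 - 54) / 8          # -2.25
p_replaced = life.cdf(36)  # P(X < 36)
print(f"z = {z}, about {100 * p_replaced:.2f}% replaced")
```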



Example 9

A machine produces components with an
average length of 100 cm and a standard
deviation of 5 cm. The lengths are normally
distributed. If a component is chosen at
random, find the probability that the
length is
a) more than 105 cm
b) less than 90 cm
c) between 90 cm and 105 cm
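The three parts of Example 9 use a right tail, a left tail, and an interval. Because the whole area is 1, the three answers should add up to 1, which gives a quick check. A sketch with `statistics.NormalDist`:

```python
from statistics import NormalDist

length = NormalDist(mu=100, sigma=5)  # cm

p_more_105 = 1 - length.cdf(105)              # z = 1
p_less_90 = length.cdf(90)                    # z = -2
p_between = length.cdf(105) - length.cdf(90)  # z from -2 to 1
print(f"a) {p_more_105:.4f}  b) {p_less_90:.4f}  c) {p_between:.4f}")
```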
Finding an x Value for a Normal
Distribution

For a normal curve, with known values of μ
and σ and for a given area under the curve
between the mean and x, the x value is
calculated as
x = μ + zσ
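Finding x reverses the usual lookup: start from an area, find z, then convert to x. In the sketch below, all numbers are hypothetical: μ = 100, σ = 5, and an area of 0.475 between the mean and x, with x to the right of the mean. `NormalDist.inv_cdf` plays the role of reading the table backwards.

```python
from statistics import NormalDist

mu, sigma = 100, 5     # hypothetical parameters
area_mean_to_x = 0.475 # hypothetical area between the mean and x (x above the mean)

# Area to the left of x = 0.5 (left half) + area between the mean and x.
area_left = 0.5 + area_mean_to_x
z = NormalDist().inv_cdf(area_left)  # close to 1.96
x = mu + z * sigma
print(f"z = {z:.2f}, x = {x:.1f}")
```

If x lies to the left of the mean, the area to its left is 0.5 minus the given area, and z comes out negative.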



Example 10

It is known that the life of a calculator
manufactured by Texas Instruments has a
normal distribution with a mean of 54
months and a standard deviation of 8
months. What should the warranty period
be to replace a malfunctioning calculator if
the company does not want to replace
more than 1% of all the calculators sold?
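In Example 10 the area to the left of the warranty period must be 0.01, so the task is to find the x value that cuts off 1% in the left tail. A sketch with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

life = NormalDist(mu=54, sigma=8)  # months

z = NormalDist().inv_cdf(0.01)     # z value with 1% of the area to its left
warranty = 54 + z * 8              # x = mu + z * sigma
print(f"z = {z:.2f}, warranty = {warranty:.1f} months")
```

Rounding the result down to a whole number of months keeps the replaced share at or below 1%.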



Example 11

 If X~N(125,9) and P(X>a), find the value
of a.



Using Explore to Generate Plots (Lab
Session)
 Refer to the data set Insurance Claim which contains
information on the amount of claim made by medical
insurance policy holders of an insurance company in
2004. Also included are the personal information of the
policy holders (gender and age).
 To generate a Normal Probability Plot:
Analyze >>> Descriptive Statistics >>> Explore…
 Is there any indication of departure from the normal
distribution?
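The idea behind a normal probability plot can be sketched outside the statistics package. The data below are simulated stand-ins, since the Insurance Claim data set is not reproduced here. The plotting position (i - 0.5)/n is one common choice and is an assumption; the package may use a different one.

```python
import random
from statistics import NormalDist, mean

random.seed(1)
# Hypothetical stand-in for 164 claim amounts (simulated, not the real data).
claims = sorted(random.gauss(8000, 1200) for _ in range(164))

n = len(claims)
std_normal = NormalDist()
# Expected normal value for the i-th smallest observation.
expected = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Points (observed, expected) close to a straight line give r close to 1.
r = correlation(claims, expected)
print(f"Q-Q correlation: {r:.3f}")
```

Plotting `claims` against `expected` gives the Q-Q plot. Points that bend away from the line at the ends suggest tails heavier or lighter than normal.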

2008 Rasimah Aripin 29


Explore Output: Normal Probability Plot

[Figure: Normal Q-Q Plot of Amount of claim in RM; horizontal axis: Observed Value (5,000 to 11,000); vertical axis: Expected Normal (-3 to 3).]

 If the variable is normally distributed, the observations (solid
circles) are distributed around the straight line.
 Note the departure from the straight line at both ends of the line.
 To confirm departure from normality, we need to test formally using
the Kolmogorov-Smirnov or Shapiro-Wilk statistics.



Explore output: Test of Normality

Tests of Normality

                         Kolmogorov-Smirnov(a)        Shapiro-Wilk
                         Statistic   df    Sig.       Statistic   df    Sig.
Amount of claim in RM    .052        164   .200*      .987        164   .151

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

Based on the Kolmogorov-Smirnov test or the Shapiro-Wilk
test, we can conclude that:

Since the Sig. value (or the p-value) > 0.05 (the most
commonly used level of significance) in both tests, there is
no significant evidence of departure from normality, so the
variable can be treated as normally distributed.
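The decision rule can be written out explicitly. The sketch below only compares the Sig. values reported in the Tests of Normality output with the 0.05 level; it does not recompute the test statistics.

```python
ALPHA = 0.05  # level of significance used in these notes

# Sig. values copied from the Tests of Normality output.
sig_values = {"Kolmogorov-Smirnov": 0.200, "Shapiro-Wilk": 0.151}

decisions = {}
for test, sig in sig_values.items():
    # Sig. <= alpha: evidence against normality; otherwise no such evidence.
    decisions[test] = "reject normality" if sig <= ALPHA else "do not reject normality"
    print(f"{test}: Sig. = {sig:.3f} -> {decisions[test]}")
```

A large Sig. value does not prove normality. It only means the data give no significant evidence against it.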

