
Normal Distribution and Normal Curve

Normal Distribution
In biostatistics, most of our understanding depends on knowledge of the
normal curve and the normal distribution. The normal curve is also known
as the curve of error, the Gaussian curve or the bell-shaped curve.

When we take a large number of observations of any variable, such as
height, blood pressure or pulse rate, and draw a graph, we will usually
get a normal curve.
The following observations were made on a fairly large number
of normal persons selected at random.

Observation               Range      Mean    SD
Age at menarche (yrs)     9-18       14.16   1.13
Pulse rate (per minute)   60-100     72      3.5
Birth weight (kg)         1.8-4.5    3.05    0.39
Systolic BP (mm Hg)       100-140    115     12.0

When a large number of observations of any variable characteristic,
such as height, blood pressure or pulse rate, are taken at random and a
frequency distribution table is prepared with a small group interval,
then it will be seen that:
a). Some observations are above the mean and others are below it.
b). If they are arranged in order, the maximum number of frequencies
is seen in the middle, around the mean, with fewer at the extremes,
decreasing smoothly on both sides.
c). Normally, half of the observations lie above and half below the mean,
and all observations are symmetrically distributed on each side of
the mean.
Table 1. Height of 1000 subjects

Height (cm)      Number of subjects
142.5-145.0        3
145.0-147.5        8
147.5-150.0       15
150.0-152.5       45
152.5-155.0       90
155.0-157.5      155
157.5-160.0      194
160.0-162.5      195
162.5-165.0      136
165.0-167.5       93
167.5-170.0       42
170.0-172.5       16
172.5-175.0        6
175.0-177.5        2
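
The frequencies in Table 1 can be summarised by a mean and an SD. The sketch below is only an illustration: it assumes each class runs 2.5 cm and that every subject in a class sits at the class midpoint, which is the usual grouped-data approximation rather than anything stated in the slides.

```python
import math

# Lower class limits (cm) and frequencies from Table 1.
lower_limits = [142.5 + 2.5 * i for i in range(14)]
frequencies = [3, 8, 15, 45, 90, 155, 194, 195, 136, 93, 42, 16, 6, 2]

# Assumption: every subject in a class is placed at the class midpoint.
midpoints = [lo + 1.25 for lo in lower_limits]
n = sum(frequencies)

# Grouped mean and (population) standard deviation.
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
grouped_var = sum(f * (m - grouped_mean) ** 2
                  for f, m in zip(frequencies, midpoints)) / n
grouped_sd = math.sqrt(grouped_var)

print(f"n = {n}, mean = {grouped_mean:.2f} cm, SD = {grouped_sd:.2f} cm")
```

Under these assumptions the mean comes out close to 160 cm and the SD close to 5 cm.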
[Figure: histogram of the heights in Table 1. X-axis: height groups (1st to 14th). Y-axis: number of subjects (frequency), 0 to 250.]


Normal Curve
A histogram of the same frequency distribution of heights, with a
large number of observations and a small class interval, gives a
frequency curve that is symmetrical in nature. This is called the
normal curve.

The frequency distribution is symmetrical around a single peak, so
the mean, median and mode coincide.

It builds up gradually from the smallest frequencies at the extremes
of the classification to the highest frequencies at the peak in the
middle.
Characteristics of normal curve
• It is bell shaped (smooth)
• It is symmetrically distributed
• Maximum height is at the mean
• Mean, median and mode coincide
• It has two points of inflection. The central part is convex; further
down, on both sides, it becomes concave.

A perpendicular from each point of inflection cuts the base at a
distance of one SD from the mean, on either side.
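
The position of the inflection points can be checked numerically. The sketch below estimates the second derivative of the normal curve by central differences; the mean of 160 and SD of 5 are illustrative values chosen here, and the helper names are hypothetical. A negative value means the curve bends downward (the central part) and a positive value means it bends upward (the tails).

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Height of the normal curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def curvature(x, mean=0.0, sd=1.0, h=1e-3):
    """Central-difference estimate of the second derivative of the curve."""
    return (normal_pdf(x + h, mean, sd)
            - 2.0 * normal_pdf(x, mean, sd)
            + normal_pdf(x - h, mean, sd)) / (h * h)

mean, sd = 160.0, 5.0  # illustrative values, not taken from the text
for k in (0.0, 0.5, 1.0, 1.5, 2.0):
    c = curvature(mean + k * sd, mean, sd)
    print(f"mean + {k} SD: second derivative = {c:+.6f}")
```

The sign changes at one SD from the mean, where the estimate is essentially zero.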
Standard Normal Curve

The standard normal curve is a smooth, bell-shaped, perfectly
symmetrical curve, based on an infinitely large number of observations.

The total area under the curve is 1; its mean is zero and its standard
deviation is 1. The mean, median and mode coincide.
a). Mean ± 1 SD limits include 68.27%, or roughly 2/3, of all
observations.
b). Mean ± 2 SD limits include 95.45% of observations.
Mean ± 1.96 SD limits include 95% of all observations.
c). Mean ± 3 SD limits include 99.73% of observations.
Mean ± 2.58 SD limits include 99% of observations.

In other words, in any normal distribution:

a). Values that differ from the mean by more than twice the SD are
rare: only 4.55% of observations. The chance of a normal individual
showing such a value is only 4.55%.
b). Values higher or lower than mean ± 3 SD make up only 0.27% of
observations. The chance of a normal individual showing such a value is
0.27 in 100, so the value is usually regarded as abnormal or pathological.
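
The percentages quoted for the mean ± k SD limits can be reproduced from the error function in Python's standard library. This is a minimal sketch; `coverage` is a hypothetical helper name, and it relies on the standard identity that the area within k SD of the mean equals erf(k / √2).

```python
import math

def coverage(k):
    """Fraction of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2.0))

for k in (1.0, 1.96, 2.0, 2.58, 3.0):
    inside = 100.0 * coverage(k)
    outside = 100.0 - inside
    print(f"mean ± {k} SD: {inside:.2f}% inside, {outside:.2f}% outside")
```

The "outside" column gives the rare values discussed above: about 4.55% beyond 2 SD and about 0.27% beyond 3 SD.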
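Height (cm)       f
142.5 – 145.0       3
145.0 – 147.5       8
147.5 – 150.0      15
150.0 – 152.5      45
152.5 – 155.0      90
155.0 – 157.5     155
157.5 – 160.0     194
160.0 – 162.5     195
162.5 – 165.0     136
165.0 – 167.5      93
167.5 – 170.0      42
170.0 – 172.5      16
172.5 – 175.0       6
175.0 – 177.5       2

Total            1000

Counts marked beside the table: 680 subjects lie in the four central
classes (155.0 to 165.0 cm), 950 in the eight central classes (150.0 to
170.0 cm) and 995 in the twelve central classes (145.0 to 175.0 cm),
i.e. 68%, 95% and 99.5% of the 1000 subjects.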
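
The counts 680, 950 and 995 shown beside the height table can be recovered by adding up the classes that fall within 1, 2 and 3 SD of the mean. The sketch below assumes a mean of 160 cm and an SD of 5 cm (rounded values that fit these data; the slides do not state them for this table), and `count_between` is a hypothetical helper.

```python
# Lower class limits (cm) and frequencies from the height table.
lower_limits = [142.5 + 2.5 * i for i in range(14)]
frequencies = [3, 8, 15, 45, 90, 155, 194, 195, 136, 93, 42, 16, 6, 2]
total = sum(frequencies)

def count_between(low, high):
    """Number of subjects in classes lying entirely inside [low, high]."""
    return sum(f for lo, f in zip(lower_limits, frequencies)
               if lo >= low and lo + 2.5 <= high)

# Assumption: mean = 160 cm and SD = 5 cm (rounded values for these data).
mean, sd = 160.0, 5.0
for k in (1, 2, 3):
    inside = count_between(mean - k * sd, mean + k * sd)
    print(f"mean ± {k} SD: {inside} of {total} ({100 * inside / total:.1f}%)")
```

The observed proportions are close to the theoretical 68.27%, 95.45% and 99.73%.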
NORMAL DISTRIBUTION
Parameters: mean and standard deviation (SD)
The mean specifies the location and the SD specifies the spread of
the distribution.
Hence, for different values of the mean, the SD, or both, we get
different normal distributions.
However, every normal distribution can be standardized in terms of a
quantity called the normal deviate, which is defined as

$$Z = \frac{\text{Observation} - \text{Mean}}{\text{Standard deviation}}$$
The distance of a value ($X$) from the mean ($\bar{X}$) of the curve,
in units of standard deviation, is called the "relative deviate" or
"standard normal deviate" and is usually denoted by Z:

$$Z = \frac{X - \bar{X}}{\text{SD}}$$

Thus, Z is a standardized variable. The new variate Z, like the
variate X, also follows a normal distribution. The mean of the
transformed distribution is zero and its standard deviation (SD) is 1.
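
The effect of standardization can be seen on a small data set. The pulse rates below are made up purely for illustration; the sketch converts each one to a Z value and checks that the Z values have mean 0 and SD 1.

```python
import statistics

# Hypothetical pulse rates (beats/min), made up for illustration only.
pulse_rates = [66, 68, 70, 71, 72, 72, 73, 74, 76, 78]

mean = statistics.fmean(pulse_rates)
sd = statistics.pstdev(pulse_rates)  # population SD

# Z = (observation - mean) / SD for every observation.
z_scores = [(x - mean) / sd for x in pulse_rates]

z_mean = statistics.fmean(z_scores)
z_sd = statistics.pstdev(z_scores)
print(f"x: mean = {mean:.2f}, SD = {sd:.2f}")
print(f"z: |mean| = {abs(z_mean):.2f}, SD = {z_sd:.2f}")
```

Standardizing changes the scale and location only; it does not change the shape of the distribution.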
Estimation of probability
Example: The mean pulse rate is 72 and the standard deviation is 3. What is
the probability of a person having a pulse rate of 80 or more?

Mean = 72, SD = 3

$$Z = \frac{\text{Observation} - \text{Mean}}{\text{Standard deviation}} = \frac{80 - 72}{3} = 2.667$$

From the table, this corresponds to an upper-tail area of about 0.0039.
Only about 0.39% of persons will have a pulse rate of 80 or more.
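
The table lookup in this example can be replaced by a direct calculation. The sketch below uses the complementary error function from the standard library; `upper_tail` is a hypothetical helper name. The result differs slightly from the table figure because the table is read at a rounded Z.

```python
import math

def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

mean, sd = 72.0, 3.0           # values from the example
z = (80.0 - mean) / sd         # normal deviate for a pulse rate of 80
p = upper_tail(z)

print(f"Z = {z:.3f}, P(pulse >= 80) = {p:.4f} ({100 * p:.2f}%)")
```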
Examples

The average weight of a baby at birth is 3.05 kg with an SD of 0.39 kg.
If birth weights are normally distributed, would you regard:

Weight of 4 kg as abnormal?
Weight of 2.5 kg as normal?
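
One way to approach this exercise is to convert each weight to a normal deviate and compare it with a cut-off. The sketch below assumes that "normal" means lying within mean ± 2 SD (about 95% of babies); that convention is an assumption made here, and a different cut-off such as 3 SD would give a different verdict for 4 kg.

```python
mean, sd = 3.05, 0.39  # kg, from the example

def z_score(weight):
    """Normal deviate of a birth weight."""
    return (weight - mean) / sd

for weight in (4.0, 2.5):
    z = z_score(weight)
    # Assumption: "normal" means within mean ± 2 SD.
    verdict = "within" if abs(z) <= 2.0 else "outside"
    print(f"{weight} kg: Z = {z:+.2f}, {verdict} mean ± 2 SD")
```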
USE OF NORMAL DISTRIBUTION
Example:

Mean height = $\bar{X}$ = 65"
Standard deviation = SD = 2"

a) Proportion of persons whose height exceeds 68"

$$Z = \frac{X - \bar{X}}{\text{SD}} = \frac{68 - 65}{2} = 1.5$$

Area under the normal curve (AUC) to the right of Z = 1.5 = 0.06681 = 6.68%

(height exceeds 68")


b) Proportion of persons whose height is less than 60"

$$Z = \frac{X - \bar{X}}{\text{SD}} = \frac{60 - 65}{2} = -2.5$$

Area to the left of Z = -2.5 (height less than 60") = 0.00621 = 0.6%
c) Proportion of persons whose height is between 64" and 67"

Normal deviate for X = 64":

$$Z_1 = \frac{64 - 65}{2} = -0.5$$

Area (height less than 64") = 0.30854

Normal deviate for X = 67":

$$Z_2 = \frac{67 - 65}{2} = 1$$

Area (height more than 67") = 0.15866

Now,
Proportion (heights between 64" and 67")
= 1 - 0.30854 - 0.15866 = 0.5328 = 53.28%
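
The three parts of the height example can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch that replaces the table lookups with the cumulative distribution function; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

heights = NormalDist(mu=65.0, sigma=2.0)  # inches, from the example

p_above_68 = 1.0 - heights.cdf(68.0)                 # part a
p_below_60 = heights.cdf(60.0)                       # part b
p_64_to_67 = heights.cdf(67.0) - heights.cdf(64.0)   # part c

print(f"a) height > 68 in: {p_above_68:.5f}")
print(f"b) height < 60 in: {p_below_60:.5f}")
print(f"c) 64 in < height < 67 in: {p_64_to_67:.4f}")
```

Part c is computed as a difference of two cumulative areas, which is equivalent to subtracting both tails from 1 as done in the worked solution.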
Example:

Mean cholesterol = 242 mg%; SD = 45 mg%

What is the cholesterol level above which 10% of subjects will fall?

We have to find the Z corresponding to an area of 10% (0.1) on the
right. The approximate Z value from the table is 1.3 (more precisely, 1.28).

$$\frac{X - \bar{X}}{\text{SD}} = Z$$

$$\frac{X - 242}{45} = 1.3$$

$$X - 242 = 1.3 \times 45 = 58.5$$

$$X = 58.5 + 242 = 300.5 \text{ mg\%}$$

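
Going from an area back to a value is the inverse of the table lookup. The sketch below uses `NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later); it works with the unrounded Z, so the cut-off comes out a little below the 300.5 mg% obtained with Z rounded to 1.3.

```python
from statistics import NormalDist

cholesterol = NormalDist(mu=242.0, sigma=45.0)  # mg%, from the example

# 10% of subjects above the cut-off means 90% of subjects below it.
z_cut = NormalDist().inv_cdf(0.90)
x_cut = cholesterol.inv_cdf(0.90)

print(f"Z for an upper-tail area of 10%: {z_cut:.3f}")
print(f"Cholesterol cut-off: {x_cut:.1f} mg%")
```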


Example 1:
The mean height of 500 students is 160 cm and the SD is 5 cm.
a). What are the chances of a height above 175 cm being
normal, if height follows a normal distribution?
b). What percentage of boys will have a height above 168 cm?
c). How many of the boys will have a height between 168 and
175 cm?
Example 2:
The pulse rate of healthy males follows a normal distribution
with a mean of 72/min and an SD of 3.5/min. In what percentage of
individuals will the pulse rate differ from the mean by 2 beats or more?
Example 3: In a survey, the mean cholesterol among subjects aged >45 years
was 158 mg% and the SD was 26 mg%. Find the cholesterol level that is
equalled or exceeded by 2.5% of subjects.
How many subjects will have cholesterol equal to this level or more
among the 20 crore Indian population aged >45 years?
Answers:
1. (a) 0.13%, (b) 5.48%, (c) 26.75, i.e. about 27
2. 56.86%
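
The answers to Examples 1 and 2 can be checked with `statistics.NormalDist`. This is a sketch, and it reads Example 2 as "differs from the mean by 2 beats or more", which is an interpretation made here. Small differences from the printed answers are expected because the printed answers use Z values rounded for the table.

```python
from statistics import NormalDist

# Example 1: heights of 500 students, mean 160 cm, SD 5 cm.
heights = NormalDist(mu=160.0, sigma=5.0)
p_above_175 = 1.0 - heights.cdf(175.0)
p_above_168 = 1.0 - heights.cdf(168.0)
n_between = 500 * (p_above_168 - p_above_175)

# Example 2: pulse rate, mean 72/min, SD 3.5/min.
pulse = NormalDist(mu=72.0, sigma=3.5)
p_differs = 1.0 - (pulse.cdf(74.0) - pulse.cdf(70.0))

print(f"1a) above 175 cm: {100 * p_above_175:.2f}%")
print(f"1b) above 168 cm: {100 * p_above_168:.2f}%")
print(f"1c) between 168 and 175 cm: {n_between:.1f} students")
print(f"2) differs by 2 beats or more: {100 * p_differs:.2f}%")
```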
Any questions?
Thanks
