
Biostatistics

The Normal Probability Distribution


By Ibrahim Ishag
The Normal Probability Distribution

 The normal distribution is the most widely known and used of all
distributions.
 Because the normal distribution approximates many
natural phenomena so well, it has developed into a
standard of reference for many probability problems.
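Probability Density Function:
 A probability density function f(x) is an equation used to compute
probabilities of continuous random variables. It must satisfy the
following two properties:
o f(x) ≥ 0 for every x;
o the total area under the graph of f(x) equals 1.
 The normal probability density function is
f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²))
 Where: x is the variable, μ is the mean, σ is the standard deviation.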
 The distribution has two parameters, μ and σ. Note that the normal
distribution is actually a family of distributions, since μ sets the
center of the curve and σ sets its spread.
Normal distribution curve
If a continuous random variable is normally distributed or has a normal
probability distribution, then a relative frequency histogram of the
random variable has the shape of a normal curve (bell-shaped and
symmetric).

Figure: Normal distribution curve


Characteristics of the Normal distribution:

 Symmetric about its mean (μ) and bell-shaped.
 The highest point of the curve occurs at x = μ.
 Continuous for all values of X between −∞ and ∞ (−∞ < X < ∞), so that
every interval of real numbers of positive length has a probability
other than zero.
 The total area under the curve is 1, that is, ∫ f(x) dx = 1, with the
integral taken over all x from −∞ to ∞.

 The notation N(μ, σ²) means normally distributed with mean μ and
variance σ². If we say X ∼ N(μ, σ²) we mean that X is distributed
N(μ, σ²).
 About 2/3 (68%) of all cases fall within one standard deviation of the
mean, that is P(μ − σ ≤ X ≤ μ + σ) = 0.6826.
 About 95% of cases lie within 2 standard deviations of the mean, that
is P(μ − 2σ ≤ X ≤ μ + 2σ) = 0.9544.
 About 99.7% of cases lie within 3 standard deviations of the mean,
that is P(μ − 3σ ≤ X ≤ μ + 3σ) = 0.9974.
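The three probabilities above can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `prob_within` is an illustrative assumption, not part of the original slides. The computed values differ from the slide values only in the fourth decimal place, because the slide values come from a four-decimal table.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """Return P(mu - k*sigma <= X <= mu + k*sigma) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# The result does not depend on mu or sigma, only on k.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```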
Why is the normal distribution useful?
 Many things actually are normally distributed, or very close to it.
For example, height and intelligence are approximately normally
distributed.
 The normal distribution is easy to work with
mathematically.
 There is a very strong connection between the size of a
sample N and the extent to which a sampling distribution
approaches the normal form.
The standardized normal distribution:
 As you might suspect from the formula for the normal density
function, it would be difficult and tedious to do the calculus
every time we had a new pair of values for μ and σ.
 So instead, we usually work with the standardized normal
distribution, where μ = 0 and σ = 1, i.e. N(0, 1).
 That is, rather than directly solving a problem involving a normally
distributed variable X with mean μ and standard deviation σ, an
indirect approach is used: X is first converted to the standard score
Z = (X − μ)/σ, and the probability is then read from the standard
normal table.
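To illustrate the indirect approach, the sketch below converts a value of X to a standard score Z = (X − μ)/σ and shows that the probability obtained from N(0, 1) matches the one obtained directly from N(μ, σ²). The numbers μ = 100, σ = 16 and x = 120 are illustrative assumptions, and `to_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def to_z(x, mu, sigma):
    """Convert a value x from N(mu, sigma^2) to a standard score."""
    return (x - mu) / sigma

mu, sigma, x = 100.0, 16.0, 120.0            # illustrative values only

direct = NormalDist(mu, sigma).cdf(x)        # P(X < x) computed directly
via_z = NormalDist().cdf(to_z(x, mu, sigma)) # P(Z < z) on the standard curve

print(f"z = {to_z(x, mu, sigma):.2f}")
print(f"direct = {direct:.4f}, via z = {via_z:.4f}")
```

Both routes give the same area, which is why a single table for N(0, 1) is enough for any μ and σ.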
Properties of the standard Normal distribution curve
 Symmetric about its mean, μ = 0
 The highest point of the curve occurs at z = 0
 The area under the curve = 1
 The Empirical Rule:
o About 68% of the area under the graph is between -1 and 1;
o about 95% of the area under the graph is between -2 and 2;
o about 99.7% of the area under the graph is between -3 and 3
Finding the Area Under the Standard Normal Curve

 Example: Find the area under the standard normal curve to the left
of Z = −0.38.
Solution: Look in the Normal distribution table (B):
P(Z < −0.38) = 0.3520
 Example: Find the area under the standard normal curve to the right
of Z = 1.25.
Solution: Look in the Normal distribution table (A):
P(Z ≥ 1.25) = 1 − P(Z < 1.25) = 1 − 0.8944 = 0.1056
 Area under the normal curve to the right of z₀ = 1 − area to the left of z₀
 Example: Find the area under the standard normal curve between
Z = −1.02 and Z = 2.94.
 Solution: P(−1.02 < Z < 2.94) = P(Z < 2.94) − P(Z < −1.02)
= 0.9984 − 0.1539 = 0.8445
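The three table lookups above can be reproduced in code. This is a minimal sketch in which the cumulative distribution function `NormalDist().cdf` plays the role of the printed table (area to the left of z); the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

left = Z.cdf(-0.38)                   # area to the left of -0.38
right = 1 - Z.cdf(1.25)               # area to the right of 1.25
between = Z.cdf(2.94) - Z.cdf(-1.02)  # area between -1.02 and 2.94

print(f"P(Z < -0.38)        = {left:.4f}")
print(f"P(Z >= 1.25)        = {right:.4f}")
print(f"P(-1.02 < Z < 2.94) = {between:.4f}")
```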
Normal distribution table (A)
Normal distribution table (B)
Finding a Z-score
From a specified area to the left:
 Example: Find the Z-score such that the area to the left of the
Z-score is 0.68.
 Solution: find z such that P(Z < z) = 0.68. The closest table entry
is 0.6808, so z = 0.47.
From a specified area to the right:
 Example: Find the Z-score such that the area to the right of the
Z-score is 0.3021.
 Solution: P(Z > z) = 0.3021, so P(Z < z) = 1 − 0.3021 = 0.6979, and
z = 0.52.
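Reading the table "backwards" corresponds to the inverse of the cumulative distribution function. The sketch below uses `NormalDist().inv_cdf` for this; the helper names are illustrative assumptions. A printed table only lists z to two decimals, so the exact quantile is rounded to two decimals for comparison with a table lookup.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def z_for_left_area(area):
    """z such that the area to the LEFT of z equals `area`."""
    return Z.inv_cdf(area)

def z_for_right_area(area):
    """z such that the area to the RIGHT of z equals `area`."""
    return Z.inv_cdf(1 - area)

print(f"left area 0.68    -> z = {z_for_left_area(0.68):.2f}")
print(f"right area 0.3021 -> z = {z_for_right_area(0.3021):.2f}")
```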
Tutorials:
 1. Find each of the following probabilities:
(a) P (Z < -0.23)
(b) P (Z > 1.93)
(c) P (0.65 < Z < 2.10)
 2. The weights of pennies minted after 1982 are approximately
normally distributed with mean 2.46 grams and standard deviation
0.02 grams. Find the area under the normal curve (probability)
between 2.44 and 2.49 grams.
 3. The random variable X is normally distributed with mean = 800 and
standard deviation = 40. Find the area between X1 = 778 and X2 = 834.
 Solution: here X ~ N(800, 1600), and we convert X to Z (standard
Normal distribution) using Z = (X − μ)/σ:
P(X1 ≤ X ≤ X2) = P(Z1 ≤ Z ≤ Z2)
Z1 = (X1 − μ)/σ = (778 − 800)/40 = −22/40 = −0.55
Z2 = (X2 − μ)/σ = (834 − 800)/40 = 34/40 = 0.85
Using the standard normal distribution tables:
P(−0.55 ≤ Z ≤ 0.85) = 0.8023 − 0.2912 = 0.5111
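As a check on this tutorial, the sketch below computes the two standard scores and the area between them. It uses only the values stated in the problem (mean 800, standard deviation 40, limits 778 and 834); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 800, 40
x1, x2 = 778, 834

# Standard scores for the two limits.
z1 = (x1 - mu) / sigma
z2 = (x2 - mu) / sigma

# Area between the limits, computed on the standard normal curve.
Z = NormalDist()
p = Z.cdf(z2) - Z.cdf(z1)

print(f"z1 = {z1:.2f}, z2 = {z2:.2f}, P = {p:.4f}")
```

The result agrees with the table-based answer to about three decimal places; small differences come from the rounding of the table entries.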
 4. Suppose that the random variable X is normally distributed with
μ = 100 and σ = 16. Find the value of X such that the area under the
curve to its left is 0.04.
 Solution:
Firstly: search the standard normal distribution tables for the z
value with area 0.04 to its left; Z = −1.75.
Secondly: by the formula Z = (x − μ)/σ,
−1.75 = (x − 100)/16, so x = 100 − 1.75 × 16 = 100 − 28 = 72
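The two steps of this solution (find z for the given left-hand area, then convert back to x) can be sketched as follows, using the values from the problem (mean 100, standard deviation 16, left area 0.04). The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, left_area = 100, 16, 0.04

# Step 1: z value with the given area to its left (the "table lookup").
z = NormalDist().inv_cdf(left_area)

# Step 2: undo the standardization, x = mu + z * sigma.
x = mu + z * sigma

# The same answer in one step, working directly with N(mu, sigma^2).
x_direct = NormalDist(mu, sigma).inv_cdf(left_area)

print(f"z = {z:.2f}, x = {x:.1f}")
```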
 The notation z_α is the z value such that the area under the
standard normal curve to its right is α. The figure illustrates
the notation.
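A minimal sketch of this right-tail notation: since the inverse cumulative distribution function works with the area to the left, the value with area α to its right is found at left area 1 − α. The function name `z_alpha` is an illustrative assumption.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """z value with area `alpha` to its RIGHT under the standard normal curve."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.05, 0.025, 0.005):
    print(f"alpha = {alpha}: z = {z_alpha(alpha):.3f}")
```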
Estimation of a population mean
 The most fundamental point and interval estimation process
involves the estimation of a population mean.
 Suppose it is of interest to estimate the population mean, μ, for a
quantitative variable.
 Data collected from a simple random sample can be used to
compute the sample mean, x̄, where the value of x̄ provides a
point estimate of μ.
 When the sample mean is used as a point estimate of the
population mean, some error can be expected.
 The absolute value of the difference between the sample mean, x̄,
and the population mean, μ, written |x̄ − μ|, is called the
sampling error.
 Statisticians have shown that the mean of the sampling distribution
of x̄ is equal to the population mean, μ, and that its standard
deviation, called the standard error, is given by σ/√n, where σ is
the population standard deviation and n is the sample size.
 For large sample sizes, the central limit theorem indicates that the
sampling distribution of x̄ can be approximated by a normal
probability distribution.
 As a matter of practice, statisticians usually consider samples of
size 30 or more to be large.
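The statements above about the sampling distribution of x̄ can be illustrated by simulation. The sketch below is an assumption-laden example, not part of the original slides: it draws repeated samples of size 30 from a Uniform(0, 1) population (chosen only because it is clearly not normal, with μ = 0.5 and σ = √(1/12)) and compares the spread of the sample means with σ/√n.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)                 # fixed seed so the run is repeatable

n, reps = 30, 2000             # sample size and number of repeated samples
mu, sigma = 0.5, sqrt(1 / 12)  # mean and sd of the Uniform(0, 1) population

# One sample mean per repetition.
sample_means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

print(f"mean of sample means: {mean(sample_means):.3f} (population mean {mu})")
print(f"sd of sample means:   {stdev(sample_means):.4f}")
print(f"sigma / sqrt(n):      {sigma / sqrt(n):.4f}")
```

A histogram of `sample_means` would look approximately bell-shaped even though the population itself is flat.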
 In the large-sample case, a 95% confidence interval estimate for the
population mean is given by x̄ ± 1.96σ/√n.
 When the population standard deviation, σ, is unknown, the sample
standard deviation is used to estimate σ in the confidence interval
formula.
 The quantity 1.96σ/√n is often called the margin of error for the
estimate.
 By changing the constant from 1.96 to 1.645, a 90% confidence
interval can be obtained. It should be noted from the formula for an
interval estimate that a 90% confidence interval is narrower than a
95% confidence interval.
 In practice, a 95% confidence interval is the most widely used.

Confidence Interval    Z
80%                    1.282
85%                    1.440
90%                    1.645
95%                    1.960
99%                    2.576
99.5%                  2.807
99.9%                  3.291
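The Z constants for each confidence level can be derived rather than memorized: a two-sided interval with confidence level c leaves (1 − c)/2 in each tail, so Z is the point with area (1 + c)/2 to its left. A minimal sketch, with `z_for_confidence` as an illustrative helper name:

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Z multiplier for a two-sided confidence interval, e.g. level=0.95."""
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    return NormalDist().inv_cdf((1 + level) / 2)

for level in (0.80, 0.85, 0.90, 0.95, 0.99, 0.995, 0.999):
    print(f"{level:.1%}  {z_for_confidence(level):.3f}")
```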
 Example: We measure the heights of 40 randomly chosen men and get
a mean height of 175 cm. We also know the standard deviation of men's
heights is 20 cm. Find the 95% confidence interval for the true mean.
 Solution: use the Z value of 1.96 in the following formula for the
confidence interval: x̄ ± Z × s/√n, where:
x̄ is the sample mean, Z is the chosen Z-value from the table above,
s is the standard deviation, n is the number of observations.
And we have: 175 ± 1.960 × 20/√40, which is: 175 cm ± 6.20 cm.
In other words: from 168.8 cm to 181.2 cm.
 Example of Apple Orchard: There are hundreds of apples on the
trees, so you randomly choose just 46 apples and get a mean of 86
and a standard deviation of 6.2. Find the 95% confidence interval
for the mean of all the apples.
 Solution: let's calculate x̄ ± Z × s/√n.
 We know: x̄ is the sample mean = 86, Z is the Z-value = 1.960 (from
the table above for 95%), s is the standard deviation = 6.2, n is
the number of observations = 46.
 86 ± 1.960 × 6.2/√46 = 86 ± 1.79
 So the true mean (of all the hundreds of apples) is likely to be
between 84.21 and 87.79.
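The large-sample interval x̄ ± Z × s/√n used in the two worked examples above can be wrapped in a small function. This is a sketch under the same assumptions as the slides (large sample, normal approximation); the function name `mean_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, s, n, level=0.95):
    """Large-sample interval for a mean: xbar +/- Z * s / sqrt(n).

    Returns (lower, upper, margin_of_error).
    """
    z = NormalDist().inv_cdf((1 + level) / 2)  # about 1.96 for level=0.95
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin, margin

# Heights example: n = 40, mean 175 cm, standard deviation 20 cm.
heights = mean_confidence_interval(175, 20, 40)
# Apple orchard example: n = 46, mean 86, standard deviation 6.2.
apples = mean_confidence_interval(86, 6.2, 46)

print("heights: {:.1f} to {:.1f} (margin {:.2f})".format(*heights))
print("apples:  {:.2f} to {:.2f} (margin {:.2f})".format(*apples))
```

Passing `level=0.90` gives a narrower interval for the same data, since the multiplier drops from about 1.96 to about 1.645.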
 Example: A random sample of 30 students at a Florida
college has the following grade point averages: 59.1, 65.0,
75.1, 79.2, 95.0, 99.8, 89.1, 65.2, 41.9, 55.2, 94.8, 84.1,
83.2, 74.0, 75.1, 76.2, 79.1, 80.1, 92.1, 74.2, 59.2, 64.0,
75.1, 78.2, 95.0, 97.8, 89.1, 64.2, 41.8, 57.2. What is the
90% confidence interval for the population mean?
 Solution:
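The worked solution is not reproduced in this text, so the following is only a sketch of one way to carry it out, not the author's own working. It assumes the large-sample formula applies (n = 30 meets the "30 or more" guideline), uses the sample standard deviation in place of the unknown σ, and takes the 90% multiplier (about 1.645) from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

gpas = [59.1, 65.0, 75.1, 79.2, 95.0, 99.8, 89.1, 65.2, 41.9, 55.2,
        94.8, 84.1, 83.2, 74.0, 75.1, 76.2, 79.1, 80.1, 92.1, 74.2,
        59.2, 64.0, 75.1, 78.2, 95.0, 97.8, 89.1, 64.2, 41.8, 57.2]

n = len(gpas)
xbar = mean(gpas)                 # point estimate of the population mean
s = stdev(gpas)                   # sample standard deviation (n - 1 divisor)
z = NormalDist().inv_cdf(0.95)    # 90% two-sided interval leaves 5% per tail
margin = z * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"n = {n}, mean = {xbar:.2f}, s = {s:.2f}")
print(f"90% confidence interval: {lower:.2f} to {upper:.2f}")
```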
 The procedure just described for developing interval estimates of a population mean is
based on the use of a large sample. In the small-sample case—i.e., where the sample size n
is less than 30—the t distribution is used when specifying the margin of error and
constructing a confidence interval estimate.
