Dr Asad Ali
Prob Dist
Probability Distributions
In the last chapter we learned that every random variable follows some probability distribution, for which a corresponding pmf or pdf can be established. This means that the observations on that variable are generated according to some mechanism, governed by a well-defined mathematical model called the probability mass function or probability density function. Furthermore, we learned how to calculate the probabilities of different events and probabilistic statements, and that mathematical expectation can be used to deduce properties of a random variable such as the mean, variance, covariance, moments and correlation. Now we are going to study some real-world random variables for which proper probability distributions have been established.
Some discrete probability distributions
Binomial Distribution
Let X denote the number of successes in a binomial experiment (X is then called a binomial random variable). The pmf of X is defined as

f(x) = P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n

The quantities n and p are called the parameters of the binomial distribution, which is traditionally denoted by b(x; n, p).
How to do it in R
In R the binomial probabilities can be calculated in two ways.
1. Using the cumulative probabilities:

pbinom(x, n, p)                           # cumulative probability P(X <= x)
# P(X = x) = pbinom(x, n, p) - pbinom(x - 1, n, p)  # here you need subtraction
# e.g. P(X = 2) for n = 5, p = 0.25:
pbinom(2, 5, 0.25) - pbinom(1, 5, 0.25)

2. Using the exact probabilities:

dbinom(x, n, p)                           # exact probability P(X = x)
# e.g. P(X = 2) for n = 5, p = 0.25:
dbinom(2, 5, 0.25)
Note: These approaches can be applied to many other distributions.
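For readers following along without R, the same two routes can be sketched in plain Python using only the standard library (the function names dbinom and pbinom below simply mimic R's):

```python
from math import comb

def dbinom(x, n, p):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def pbinom(x, n, p):
    """Cumulative binomial probability P(X <= x)."""
    return sum(dbinom(k, n, p) for k in range(x + 1))

# Both routes give P(X = 2) for n = 5, p = 0.25:
exact = dbinom(2, 5, 0.25)
via_cdf = pbinom(2, 5, 0.25) - pbinom(1, 5, 0.25)
print(exact, via_cdf)  # both 0.263671875
```

The subtraction route and the exact route agree, which is exactly the point of the note above.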
Example 1.
Consider a coin tossing experiment in which the coin is tossed five times. Find the probabilities of
obtaining various numbers of heads.
Solution:
Let's check the properties of this experiment.
1. Each toss has two possible outcomes: either a head (success) occurs or a tail (failure).
2. The probability of a success is p = 1/2 (and hence q = 1 − p = 1/2), and it remains the same for all trials.
3. The successive trials of the experiment are independent (i.e. the successive tosses are independent).
4. The coin is tossed n = 5 times.
Thus it is a binomial experiment. Let the rv X denote the number of heads (successes); then it has a binomial distribution with pmf:
f(x) = P(X = x) = \binom{5}{x} \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{5-x}, \quad x = 0, 1, 2, \ldots, 5
P(one head) = P(X = 1) = \binom{5}{1}\left(\frac{1}{2}\right)^1\left(\frac{1}{2}\right)^{5-1} = 5 \times \left(\frac{1}{2}\right)^5 = \frac{5}{32}

P(two heads) = P(X = 2) = \binom{5}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^{5-2} = 10 \times \left(\frac{1}{2}\right)^5 = \frac{10}{32}

P(three heads) = P(X = 3) = \binom{5}{3}\left(\frac{1}{2}\right)^3\left(\frac{1}{2}\right)^{5-3} = 10 \times \left(\frac{1}{2}\right)^5 = \frac{10}{32}

P(four heads) = P(X = 4) = \binom{5}{4}\left(\frac{1}{2}\right)^4\left(\frac{1}{2}\right)^{5-4} = 5 \times \left(\frac{1}{2}\right)^5 = \frac{5}{32}

P(five heads) = P(X = 5) = \binom{5}{5}\left(\frac{1}{2}\right)^5\left(\frac{1}{2}\right)^{5-5} = 1 \times \left(\frac{1}{2}\right)^5 = \frac{1}{32}
The binomial probability distribution of the number of heads obtained in 5 tosses of a coin is

x_i      0      1      2      3      4      5
f(x_i)  1/32   5/32  10/32  10/32   5/32   1/32
Using R
dbinom (0:5 , 5 , 0.5)
[1] 0.03125 0.15625 0.31250 0.31250 0.15625 0.03125
Example 2.
The probability of getting caught copying someone else's exam is 0.2. Find the probability of not getting caught in three attempts. Assume independence.
Solution:
Here p = 0.2 (q = 1 − p = 0.8) and n = 3. Let X denote the number of successes (getting caught); then it has a binomial distribution with pmf

f(x) = P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n

= \binom{3}{x} (0.2)^x (0.8)^{3-x}, \quad x = 0, 1, 2, 3

The required probability is P(X = 0) = (0.8)^3 = 0.512.
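A quick numerical check of P(X = 0) for this example, sketched in Python (standard library only) rather than the slides' R:

```python
from math import comb

# P(not caught in 3 attempts) = P(X = 0) with n = 3, p = 0.2
p_not_caught = comb(3, 0) * 0.2**0 * 0.8**3
print(p_not_caught)  # ~0.512
```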
Example 3.
Let A and B play a game in which A's probability of winning is 2/3. In a series of 8 games, what is the probability that A will win (i) exactly 4 games, (ii) at least 4 games, (iii) at most 6 games, and (iv) from 3 to 6 games?
Solution:
We observe that each game has two possible outcomes: A wins or does not win. Let X denote the number of games won by A (successes); then X has a binomial distribution b(x; 8, 2/3) with pmf:

f(x) = P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n

= \binom{8}{x} \left(\frac{2}{3}\right)^x \left(\frac{1}{3}\right)^{8-x}, \quad x = 0, 1, 2, \ldots, 8

(i) P(X = 4) = \binom{8}{4}\left(\frac{2}{3}\right)^4\left(\frac{1}{3}\right)^4 = \frac{1120}{6561} = 0.1707
(ii) P(X ≥ 4) = P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8)
= \binom{8}{4}\left(\frac{2}{3}\right)^4\left(\frac{1}{3}\right)^4 + \binom{8}{5}\left(\frac{2}{3}\right)^5\left(\frac{1}{3}\right)^3 + \binom{8}{6}\left(\frac{2}{3}\right)^6\left(\frac{1}{3}\right)^2 + \binom{8}{7}\left(\frac{2}{3}\right)^7\left(\frac{1}{3}\right)^1 + \binom{8}{8}\left(\frac{2}{3}\right)^8
= 0.9121
Note that P(X ≥ 4) = 1 − P(X < 4) (complement), so you can also calculate it that way. Similarly,

(iii) P(X ≤ 6) = P(X = 0) + P(X = 1) + · · · + P(X = 6)   (7 terms)
= 1 − P(X > 6)   (using the complement rule)
= 1 − [P(X = 7) + P(X = 8)]   (only two terms)
= 1 − \left[\binom{8}{7}\left(\frac{2}{3}\right)^7\left(\frac{1}{3}\right)^1 + \binom{8}{8}\left(\frac{2}{3}\right)^8\right]
= 1 − \frac{1280}{6561} = 0.8049
(iv) P(3 ≤ X ≤ 6) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6)
= \binom{8}{3}\left(\frac{2}{3}\right)^3\left(\frac{1}{3}\right)^5 + \cdots + \binom{8}{6}\left(\frac{2}{3}\right)^6\left(\frac{1}{3}\right)^2
= \frac{5152}{6561} = 0.7852
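All four parts of this example can be verified in a few lines of Python (standard library only, mimicking R's dbinom):

```python
from math import comb

def dbinom(x, n, p):
    """Binomial pmf P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 8, 2/3
exactly_4  = dbinom(4, n, p)                               # (i)
at_least_4 = sum(dbinom(x, n, p) for x in range(4, 9))     # (ii)
at_most_6  = 1 - dbinom(7, n, p) - dbinom(8, n, p)         # (iii)
three_to_6 = sum(dbinom(x, n, p) for x in range(3, 7))     # (iv)
print(round(exactly_4, 4), round(at_least_4, 4),
      round(at_most_6, 4), round(three_to_6, 4))
# -> 0.1707 0.9121 0.8049 0.7852
```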
In the case of a frequency table, one cannot reconstruct the actual frequency table exactly from the relative frequencies unless one knows the total of the frequencies (∑f). Thus N, in this case, is not necessarily equal to ∑f; N is just the number of times the entire experiment was performed.
Homework:
Mean and Variance:
Let X be a binomial rv with pmf:

f(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n
Find E(X), E(X 2 ) and V (X).
Example 4.
A five-dice experiment was repeated 96 times. Find the expected frequencies when getting a 4, 5 or 6 is considered a success.
Solution:
The probability of getting each of 4, 5 and 6 in a single trial is 1/6, and using the addition law of probability we observe that P(4 or 5 or 6) = 1/6 + 1/6 + 1/6 = 1/2; this is our p. So we now have n = 5, p = 1/2 and N = 96.
Let X be the rv denoting the number of successes (getting 4, 5 or 6); then X can take the values 0, 1, 2, 3, 4, 5. Thus the binomial frequency distribution is given by:

N · f(x) = N \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n

Putting in the values of the relevant quantities, i.e. N, n, p and q:

N · f(x) = 96 \binom{5}{x} \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{5-x}, \quad x = 0, 1, 2, \ldots, 5.

Now make a two-column frequency table as shown.

x    96 \binom{5}{x} (1/2)^x (1/2)^{5-x}
0    96 \binom{5}{0} (1/2)^5 = 3
1    96 \binom{5}{1} (1/2)^5 = 15
2    96 \binom{5}{2} (1/2)^5 = 30
3    96 \binom{5}{3} (1/2)^5 = 30
4    96 \binom{5}{4} (1/2)^5 = 15
5    96 \binom{5}{5} (1/2)^5 = 3
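The whole frequency column can be generated in one line of Python (standard library only), which reproduces the table above:

```python
from math import comb

N, n, p = 96, 5, 0.5
expected = [N * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
print(expected)  # [3.0, 15.0, 30.0, 30.0, 15.0, 3.0]
```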
This is known as the binomial frequency distribution.
In your intermediate mathematics you may have seen the expansion of a binomial expression (p + q)^n. When multiplied by N, it gives the same thing as calculated in the above table. That is, we can get the same results if we expand

96\left(\frac{1}{2} + \frac{1}{2}\right)^5 \quad (\text{like } N(p+q)^n)

using the binomial theorem. In practice,

(p + q)^n = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x}

so one just needs to pick the individual terms of this sum. Also note that,

(p + q)^n = 1 = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x}
Properties of Binomial Distribution
For a discrete random variable X having a binomial distribution with parameters n and p (q = 1 − p):
1. The mean is given by E(X) = µ = np.
2. The variance is given by Var(X) = σ² = npq, or equivalently Var(X) = σ² = np(1 − p).
Example 5.
A binomial random variable has mean 12.38 and variance 8.64. Find n and p.
Solution:
We know that for a binomial random variable X

µ = np     (1)
σ² = npq   (2)

Putting in the values and then dividing equation (2) by equation (1) (you can use any order) gives

\frac{σ^2}{µ} = \frac{npq}{np} = \frac{8.64}{12.38}
⟹ q = 0.698
⟹ p = 1 − q = 1 − 0.698 = 0.302

Also,

µ = np ⟹ n = \frac{µ}{p} = \frac{12.38}{0.302} = 40.98 ≈ 41
Example 6.
Is it possible to have a binomial distribution with mean 5 and standard deviation 3?
Solution: Let's check. We need n and p (and/or q). We have

\frac{σ^2}{µ} = \frac{npq}{np} = \frac{9}{5} ⟹ q = 1.8

But q is just a probability, the probability of failure, and it must lie between 0 and 1. So it is not possible to have a binomial distribution with mean 5 and standard deviation 3.
Example 7.
X is binomially distributed with mean 3 and variance 2. Find P(X = 7).
Solution: Again, to specify the pmf of X, which is binomially distributed, we need its two parameters n and p. So,

\frac{σ^2}{µ} = \frac{npq}{np} = \frac{2}{3} ⟹ q = \frac{2}{3}, \; p = \frac{1}{3}, \quad \text{and} \quad µ = np ⟹ n = \frac{µ}{p} = \frac{3}{1/3} = 9

We now have n and p, so we can specify the pmf of X as

b\left(x; 9, \frac{1}{3}\right) = \binom{9}{x}\left(\frac{1}{3}\right)^x\left(\frac{2}{3}\right)^{9-x}, \quad x = 0, 1, 2, \ldots, 9

Thus P(X = 7) is given as

P(X = 7) = \binom{9}{7}\left(\frac{1}{3}\right)^7\left(\frac{2}{3}\right)^2 = \frac{16}{2187} = 0.0073
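This value is quickly confirmed in Python (the equivalent of R's dbinom(7, 9, 1/3)):

```python
from math import comb

# P(X = 7) for X ~ Binomial(n = 9, p = 1/3)
p7 = comb(9, 7) * (1/3)**7 * (2/3)**2
print(round(p7, 4))  # 0.0073
```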
Example 9.
Fit a binomial distribution to the following data.
x 0 1 2 3 4
f 30 62 46 10 2
Solution:
To fit a binomial distribution to these data we need the actual mean of the data. Here the largest value of X is 4, so n = 4. Now

x̄ = \frac{\sum fx}{\sum f} = \frac{0 + 62 + 92 + 30 + 8}{150} = \frac{192}{150} = 1.28
Using the relation x̄ = np ⟹ 1.28 = 4p, we find p = 0.32 and q = 1 − p = 0.68. So the pmf of X can be specified as,
b(x; 4, 0.32) = P(X = x) = \binom{4}{x} (0.32)^x (0.68)^{4-x}, \quad x = 0, 1, 2, 3, 4.
Now what we need is the same as in the construction of the binomial frequency table (Example 46). Here we have N = 150, so we just need a table with columns of x and

150 · \binom{4}{x} (0.32)^x (0.68)^{4-x}
List the values of x and find the corresponding P (X = x) and then N · P (X = x).
Here is what we were looking for.

x    b(x; 4, 0.32) = \binom{4}{x}(0.32)^x(0.68)^{4-x}    150 · f(x)
0    \binom{4}{0}(0.32)^0(0.68)^4 = 0.21381376           32.072064 ≈ 32
1    \binom{4}{1}(0.32)^1(0.68)^3 = 0.40247296           60.370944 ≈ 60
2    \binom{4}{2}(0.32)^2(0.68)^2 = 0.28409856           42.614784 ≈ 43
3    \binom{4}{3}(0.32)^3(0.68)^1 = 0.08912896           13.369344 ≈ 13
4    \binom{4}{4}(0.32)^4(0.68)^0 = 0.01048576           1.572864 ≈ 2
The frequencies in the last column are called the expected frequencies, whereas the actual frequencies are called the observed frequencies. We can compare them by listing them together in the following table.

x           0    1    2    3    4
Observed f  30   62   46   10   2
Expected f  32   60   43   13   2

We can see that the observed and expected frequencies do not differ very much, so the fit is rather good. This is only a rough analysis; the actual analysis of a fit is not that simple. To check the goodness of the distribution fit, we use various tests from statistical methods.
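The entire fitting procedure, estimating p from the sample mean and then producing the expected frequencies, can be sketched in a few lines of Python (standard library only):

```python
from math import comb

x = [0, 1, 2, 3, 4]
f = [30, 62, 46, 10, 2]
N = sum(f)                                          # 150
n = max(x)                                          # 4
mean = sum(xi * fi for xi, fi in zip(x, f)) / N     # 1.28
p = mean / n                                        # estimate p from x_bar = np

expected = [N * comb(n, xi) * p**xi * (1 - p)**(n - xi) for xi in x]
print([round(e) for e in expected])  # [32, 60, 43, 13, 2]
```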
Probability Tables
In practice, probabilities can be read from the probability tables that are available for most probability distributions. For example, for the binomial distribution a table listing the exact or cumulative probabilities associated with various values of X and p can be used to calculate the required binomial probabilities. An example table, truncated to n = 3 only, gives the cumulative binomial probabilities, that is

P(X ≤ c) = \sum_{x=0}^{c} \binom{n}{x} p^x q^{n-x}
Another frequently used table is the table of exact binomial probabilities; with such tables we can calculate P(X = x) directly. An example table, truncated to n = 4 only, gives

P(X = x) = \binom{n}{x} p^x q^{n-x}

When n = 3 and p = 0.4, for example, P(X = 2) is 0.288. These tables are very useful as they help in finding probabilities without calculating the cumbersome combinations and powers of decimal probabilities. However, these tables have a disadvantage too: they are available only for a few values of p. For example, probabilities corresponding to p = 0.2345 are not listed, and probabilities are very sensitive to rounding errors. So avoid these tables when p does not match well with one listed in them.
Poisson Distribution
Many experimental situations occur in which we observe counts of events within a specified unit of time, area, volume, length, etc. For example:
The number of telephone calls in an hour.
The number of cases of a disease per square kilometer in a specified area.
The number of dolphin pod sightings along a flight path through a region.
The number of particles emitted by a radioactive source in a given time.
The number of births per hour during a given day.
The Poisson distribution is a discrete probability distribution for the counts of events that occur randomly in a given interval of time (or space). Let X be the number of events, distributed independently in time, occurring in a fixed time interval or region, and let the mean number of events per interval be λ. Then the probability of observing x events in a given interval is

P(X = x) = \frac{e^{-λ} λ^x}{x!}, \quad x = 0, 1, 2, 3, 4, \ldots

where λ is the only parameter of the distribution, interpreted as the average number of events in a given interval/region, for example the average number of calls per hour. Since it uses 'per', it is in reality a rate. The Poisson distribution is generally denoted by p(x; λ) or Poisson(λ). Poisson-distributed random variables appear in many astronomical studies, so it is very important to understand them well.
In R the probabilities can be found in the same way as for the binomial: here we use ppois(x, lambda) and dpois(x, lambda) according to the situation.
Example 10.
If the number of arrivals is 10 per hour on average, determine the probability that in any hour there will be
1. 0 arrivals;
2. 6 arrivals;
3. more than 6 arrivals.
Solution:
We see that no probability of success p or number of trials n is given; we are only given the average number of arrivals per hour (the rate of arrivals). Also, the underlying variable is a discrete rv, so we need the Poisson distribution to find these probabilities.
We have λ = 10.
Let X be the number of arrivals; then P(X = x) is defined as

P(X = x) = \frac{e^{-10} 10^x}{x!}, \quad x = 0, 1, 2, 3, 4, \ldots

Now

(1) P(X = 0) = \frac{e^{-10} 10^0}{0!} = 4.539993 \times 10^{-5}

(2) P(X = 6) = \frac{e^{-10} 10^6}{6!} = 0.063055

(3) P(X > 6) = 1 − P(X ≤ 6) = 1 − \{P(X = 0) + P(X = 1) + \cdots + P(X = 6)\} = 1 − 0.1301414 = 0.8698586
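All three parts can be checked in Python without R (the function below plays the role of R's dpois):

```python
from math import exp, factorial

def dpois(x, lam):
    """Poisson pmf, mimicking R's dpois."""
    return exp(-lam) * lam**x / factorial(x)

lam = 10
p0 = dpois(0, lam)                                  # (1) P(X = 0)
p6 = dpois(6, lam)                                  # (2) P(X = 6)
p_gt6 = 1 - sum(dpois(x, lam) for x in range(7))    # (3) P(X > 6)
print(p0, p6, p_gt6)
```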
Example 11. (Homework)
The average rate of telephone calls in a busy reception is 4 per minute. If it can be assumed that the
number of telephone calls per minute interval is Poisson distributed, calculate the probability that
1 at least 2 telephone calls will be received in any minute.
2 any minute will be free of telephone calls.
3 no more than one telephone call will be received in any one minute interval.
Also go through Section 3.7 and the exercises at its end, i.e. Exercises Section 3.7 (93-109).
Poisson approximation to the binomial: when n is large and p is small, the binomial pmf tends to the Poisson pmf,

\lim_{n \to \infty} b(x; n, p) = \frac{e^{-λ} λ^x}{x!}, \quad x = 0, 1, 2, \ldots

where λ = np.
Example 12.
Past experience in the production of a certain item has shown that the probability of an item being defective is p = 0.03. The items are dispatched in boxes of 500. What is the probability that
1. a box contains 3 or more defectives;
2. two successive boxes contain 6 or more defectives each.
Solution:
Let X represent the number of defective items in a box. In reality this is a binomial problem because of the nature of the outcomes (defective or good). However, p = 0.03 is very small and n = 500 is very large, making the binomial distribution a challenge to use (evaluating P(X ≥ 3) = \sum_{x=3}^{500} \binom{500}{x} (0.03)^x (0.97)^{500-x} can drive you crazy). So to find the above probabilities we use the Poisson distribution with

λ = np = 500 × 0.03 = 15
1. The probability of 3 or more defectives in a box is:

P(X ≥ 3) = 1 − P(X < 3)
= 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 − \left[\frac{e^{-15} 15^0}{0!} + \frac{e^{-15} 15^1}{1!} + \frac{e^{-15} 15^2}{2!}\right]
= 1 − e^{-15}\left[\frac{15^0}{0!} + \frac{15^1}{1!} + \frac{15^2}{2!}\right]
= 0.99996
2. Similarly, to find the probability of 6 or more defective items in two boxes we first need the probability of 6 or more defective items in a single box, which is:

P(X ≥ 6) = 1 − P(X < 6)
= 1 − [P(X = 0) + P(X = 1) + · · · + P(X = 5)]
= 1 − \left[\frac{e^{-15} 15^0}{0!} + \frac{e^{-15} 15^1}{1!} + \cdots + \frac{e^{-15} 15^5}{5!}\right]
= 1 − e^{-15}\left[\frac{15^0}{0!} + \frac{15^1}{1!} + \cdots + \frac{15^5}{5!}\right]
= 0.9972076
Thus the probability of having 6 or more defective items in any individual box is 0.9972076. Assuming that the two boxes were filled independently, we can calculate the probability of 6 or more defective items in each of two boxes as

P(X ≥ 6 in both boxes) = P(X ≥ 6) × P(X ≥ 6) = (0.9972076)^2 = 0.994423
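The whole example can be verified in Python (standard library only; the helper below mimics R's ppois):

```python
from math import exp, factorial

def ppois(x, lam):
    """Poisson cdf P(X <= x), mimicking R's ppois."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

lam = 15  # lambda = np = 500 * 0.03
p_box_3plus = 1 - ppois(2, lam)   # P(X >= 3) in one box
p_box_6plus = 1 - ppois(5, lam)   # P(X >= 6) in one box
p_two_boxes = p_box_6plus**2      # independent boxes
print(round(p_box_3plus, 5), round(p_box_6plus, 7), round(p_two_boxes, 6))
# -> 0.99996 0.9972076 0.994423
```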
Poisson frequency distribution
As with the binomial distribution, we can find the Poisson frequency distribution by multiplying the Poisson probability distribution by N, the number of experiments done. It is defined as:

f(x) = N · Poisson(λ) = N · \frac{e^{-λ} λ^x}{x!}, \quad x = 0, 1, 2, \ldots

That is, just multiply the actual Poisson probabilities corresponding to x = 0, 1, 2, 3, … by the given N. The method is the same as explained in Example 46.

Fitting a Poisson distribution to observed data
We can also fit a Poisson distribution to given data. One first estimates λ by calculating x̄ = \frac{\sum fx}{\sum f} for the given data, and then uses the possible values x = 0, 1, 2, 3, … (ignoring the frequencies) to find the corresponding probabilities. Then multiply those probabilities by ∑f (or N) to get a Poisson frequency distribution. Go through Examples 46 and 51 in the previous slides. See also Example 8.20/page 317 in Sher Muhammad Chaudhry's book.
Properties of Poisson distribution
1. Mean: The mean of a Poisson rv X is E(X) = λ.
2. Variance: The variance of a Poisson rv X is V(X) = E(X²) − [E(X)]² = λ.
Interesting! The mean and variance of a Poisson random variable are the same.
Negative binomial and geometric distributions
Let X_r be the number of failures before the r-th success in repeated independent trials with success probability p. Then

E[X_r] = \frac{r(1-p)}{p}, \quad Var(X_r) = \frac{r(1-p)}{p^2}

and X_r is said to have a negative binomial distribution with parameters r and p. A special case arises when r = 1: the distribution is then known as the geometric distribution, that is, one repeats the experiment until the first and only success occurs. The resulting pmf is written as

P(X = x) = geo(x; p) = p(1-p)^x, \quad x = 0, 1, 2, \ldots
Some continuous probability distributions
The uniform distribution
We know that a continuous rv assumes any value in a given interval. A continuous rv X, defined over an interval [a, b], is said to be uniformly distributed if all its values have equal probability. Its probability density function is defined as

f(x) = \begin{cases} \frac{1}{b-a}, & a < x < b \\ 0, & \text{elsewhere} \end{cases}

where a and b are the two parameters of the distribution. In other words, a random variable is uniformly distributed whenever the probability is proportional to the length of the interval. It is also called the rectangular distribution because its density looks like a perfect rectangle: a horizontal line at height 1/(b−a) between x = a and x = b. The notations for the uniform distribution are U(a, b) or Uniform(a, b). The area within this rectangle is unity, that is, \int_a^b \frac{1}{b-a}\,dx = 1.
Properties of uniform distribution
1. The Mean: The mean of a uniform rv X is obtained as:

µ = E(X) = \int_a^b x f(x)\,dx = \int_a^b x \frac{1}{b-a}\,dx
= \frac{1}{b-a} \int_a^b x\,dx
= \frac{1}{b-a} \left[\frac{x^2}{2}\right]_a^b
= \frac{b^2 - a^2}{2(b-a)} = \frac{(b-a)(b+a)}{2(b-a)}
= \frac{b+a}{2}

2. The Variance: The variance of X is given by V(X) = E(X²) − [E(X)]².

\text{Now, } E(X^2) = \int_a^b x^2 f(x)\,dx = \int_a^b x^2 \frac{1}{b-a}\,dx
= \frac{1}{b-a} \int_a^b x^2\,dx = \frac{1}{b-a}\left[\frac{x^3}{3}\right]_a^b
= \frac{b^3 - a^3}{3(b-a)} = \frac{(b-a)(b^2 + ab + a^2)}{3(b-a)}
So, E(X^2) = \frac{a^2 + ab + b^2}{3}

Thus, V(X) = E(X^2) − [E(X)]^2
= \frac{a^2 + ab + b^2}{3} − \frac{(b+a)^2}{4}
= \frac{(b-a)^2}{12}
Beta Distribution
We often need to model the distribution of a proportion (i.e., 0 < X < 1), where X is a continuous random variable. The beta distribution is often used in this framework. X is said to have a beta distribution with parameters α, β, A, and B if the pdf of X is:

f(x; α, β, A, B) = \begin{cases} \frac{1}{B-A} \cdot \frac{\Gamma(α+β)}{\Gamma(α)\Gamma(β)} \left(\frac{x-A}{B-A}\right)^{α-1} \left(\frac{B-x}{B-A}\right)^{β-1}, & A ≤ x ≤ B \\ 0, & \text{otherwise} \end{cases}

The case A = 0 and B = 1 gives the standard beta distribution, which is the form most commonly used.
Gamma Distribution
Another widely used distributional model for skewed data is the gamma family of distributions. A continuous random variable X is said to have a gamma distribution if the pdf of X is:

f(x; α, β) = \begin{cases} \frac{1}{β^α \Gamma(α)} x^{α-1} e^{-x/β}, & x ≥ 0 \\ 0, & \text{otherwise} \end{cases}

with mean and variance

E(X) = αβ, \quad V(X) = αβ^2
Exponential Distribution
A special case of the gamma distribution is the exponential distribution. Taking α = 1 and β = 1/λ gives the exponential pdf:

f(x; λ) = \begin{cases} λ e^{-λx}, & λ > 0, \; x ≥ 0 \\ 0, & \text{otherwise} \end{cases}
Chi-Squared Distribution
The chi-squared distribution is another member of the gamma family, with α = ν/2 and β = 2. A random variable X has a chi-squared distribution with ν degrees of freedom, denoted X ∼ χ²_ν, if it has pdf:

f(x; ν) = \begin{cases} \frac{1}{2^{ν/2} \Gamma(ν/2)} x^{ν/2 - 1} e^{-x/2}, & x ≥ 0, \; ν = 1, 2, 3, \ldots \\ 0, & \text{otherwise} \end{cases}

The mean and variance of X ∼ χ²_ν are

E(X) = ν, \quad V(X) = 2ν

The chi-squared distribution is widely used in statistical inference. In fact, if X ∼ N(0, 1), then X² ∼ χ²₁. The most commonly used procedure in inference involving the chi-squared distribution is the Pearson goodness-of-fit statistic:

\sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}

where O_i are the observed frequencies and E_i the expected frequencies (obtained after fitting a chosen distribution to the observed data and finding expected frequencies, as was done for the binomial and Poisson distributions).
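As an illustration, the Pearson statistic can be computed for the binomial fit of Example 9, using the rounded observed and expected frequencies from that table (the rounding is ours; a real goodness-of-fit test would use the unrounded expected values and a χ² critical value):

```python
observed = [30, 62, 46, 10, 2]
expected = [32, 60, 43, 13, 2]   # rounded expected frequencies from the binomial fit

# Pearson goodness-of-fit statistic: sum of (O_i - E_i)^2 / E_i
chi2 = sum((o - e)**2 / e for o, e in zip(observed, expected))
print(round(chi2, 4))  # 1.0933
```

A small value like this is consistent with the earlier remark that the fit is rather good.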
The Normal Distribution (the father of all distributions)
The normal probability distribution is the most important distribution for describing a continuous random variable. It is widely used in statistical inference (testing of hypotheses: next chapter). It has a bell-shaped probability density function, known as the Gaussian function or informally the bell curve, and is usually denoted by N(µ, σ²). Let X be a normal rv; then its pdf is given as:

f(x) = \frac{1}{σ\sqrt{2π}} e^{-\frac{1}{2}\left(\frac{x-µ}{σ}\right)^2}, \quad -∞ < x < +∞
The effects of changing µ and σ on the normal curve.

[Figure: three normal densities with (µ = 2, σ = 0.5), (µ = 4, σ = 1) and (µ = 8, σ = 2).]

Changes in µ cause changes in the location of the density, and changes in σ cause changes in the spread of the density. Therefore, the parameters µ and σ are also called the location and scale parameters, respectively.
The cumulative distribution function of a normal rv X is given as:

F(x) = P(X ≤ x) = \frac{1}{σ\sqrt{2π}} \int_{-∞}^{x} e^{-\frac{1}{2}\left(\frac{t-µ}{σ}\right)^2}\,dt

[Figure: CDF of a normal rv.]
Standard normal distribution
This is what you will actually use to find the probabilities associated with a normal random variable. The normal distribution is determined by the values of the two parameters µ and σ. Looking at the ranges of these two parameters (−∞ < µ < +∞ and σ > 0), we can see that there are infinitely many normal distributions. Unfortunately, it is not possible to process a normal rv in its actual shape: the integration of the normal density is not very straightforward, and for each of these distributions we would need a separate probability table, which is an impossible job. So we use a standardized version in which the mean and standard deviation are fixed constants (µ = 0, σ = 1) for all normal rv's, so that the integration is easier and a single probability table can be used for all normal rv's. This standardization is, in reality, a transformation performed on the term (X − µ)/σ appearing in the exponent of the normal density function:

Z = \frac{X - µ}{σ}

where X is a normal rv. The pdf of Z is then

φ(z) = \frac{1}{\sqrt{2π}} e^{-z^2/2}, \quad -∞ < z < +∞

It has zero mean and unit variance. A cumulative probability table is constructed for various values of Z using the following cdf:

Φ(z) = P(Z ≤ z) = \frac{1}{\sqrt{2π}} \int_{-∞}^{z} e^{-t^2/2}\,dt
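Although Φ(z) has no closed form, it can be evaluated without tables via the error function available in most standard libraries, using the identity Φ(z) = (1 + erf(z/√2))/2. A Python sketch, reproducing one entry of the half-area table below:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# The half-area table lists P(0 <= Z <= z) = Phi(z) - 0.5; e.g. for z = 1.6:
print(round(Phi(1.6) - 0.5, 4))  # 0.4452
```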
Standard Normal Cumulative Probabilities (Using half of the density)
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2517 .2549
0.7 .2580 .2611 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
The above table is constructed using the fact that the normal density is symmetric; therefore the sign of a value of Z does not matter, that is, Z = −1.6 and Z = +1.6 are equivalent. The only care one needs is to add/subtract 0.5 (half the area under the normal curve) when the value of Z is positive/negative, depending on the direction of the inequality. For example, P(Z ≤ −1.6) and P(Z ≤ 1.6) are obtained as

P(Z ≤ −1.6) = 0.5 − P(0 ≤ Z ≤ 1.6)
P(Z ≤ 1.6) = 0.5 + P(0 ≤ Z ≤ 1.6)

[Figure: the corresponding shaded areas under the standard normal curve.]
In other words, this table assumes that the total area under the normal curve is 0.5 rather than 1, and one has to find the area from 0 to z rather than from −∞ to z. A few examples will clear up the ambiguities.
Suppose we are trying to find P(Z ≤ −1.65). Since the table lists only positive values of Z, and since the normal curve is symmetric, we can mirror the required area to the right of the curve, in which case it becomes P(Z ≥ 1.65). As the tabled probabilities are areas from 0 to z, we subtract P(0 ≤ Z ≤ 1.65) from 0.5 to get P(Z ≥ 1.65), which, mirrored back to the left side, becomes P(Z ≤ −1.65):

P(Z ≤ −1.65) = 0.5 − P(0 ≤ Z ≤ 1.65)

Similarly, P(0.6 ≤ Z ≤ 1.65) can be written as

P(0.6 ≤ Z ≤ 1.65) = P(0 ≤ Z ≤ 1.65) − P(0 ≤ Z ≤ 0.6)

It can now easily be found from the table.
Example 13.
Let X ∼ N(50, 25). Find P(0 ≤ X ≤ 40), P(55 ≤ X ≤ 100), P(X ≥ 54) and P(X ≤ 57).
Solution: For each probability we need to convert the X values to Z values using

Z = \frac{X - µ}{σ}

We have µ = 50 and σ = \sqrt{25} = 5, thus Z = \frac{X - 50}{5}.
For X = 0, Z = \frac{0 - 50}{5} = −10.0, and similarly for X = 40, Z = −2. So,

P(0 ≤ X ≤ 40) = P(−10 ≤ Z ≤ −2)
= P(−10 ≤ Z ≤ 0) − P(−2 ≤ Z ≤ 0)
= P(0 ≤ Z ≤ 10) − P(0 ≤ Z ≤ 2)
= 0.5 − 0.4772 = 0.0228

Note: Most normal tables list probabilities only up to Z = 3.0 or 3.5; the area from 0 to Z for values greater than 3.0 (or 3.5) is taken to be 0.5.
For X = 55, Z = \frac{55 - 50}{5} = 1, and for X = 100, Z = 10. So,

P(55 ≤ X ≤ 100) = P(1 ≤ Z ≤ 10)
= P(0 ≤ Z ≤ 10) − P(0 ≤ Z ≤ 1)
= 0.5 − 0.3413 = 0.1587

For X = 54, Z = \frac{54 - 50}{5} = 0.8. So,

P(X ≥ 54) = P(Z ≥ 0.8)
= 0.5 − P(0 ≤ Z ≤ 0.8)
= 0.5 − 0.2881 = 0.2119

For X = 57, Z = \frac{57 - 50}{5} = 1.4. So,

P(X ≤ 57) = P(Z ≤ 1.4)
= 0.5 + P(0 ≤ Z ≤ 1.4)
= 0.5 + 0.4192 = 0.9192
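These table-based answers can be cross-checked in Python using the error-function form of the normal cdf (a sketch, standard library only):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 50, 5
p1 = Phi((40 - mu) / sigma) - Phi((0 - mu) / sigma)    # P(0 <= X <= 40)
p2 = Phi((100 - mu) / sigma) - Phi((55 - mu) / sigma)  # P(55 <= X <= 100)
p3 = 1 - Phi((54 - mu) / sigma)                        # P(X >= 54)
p4 = Phi((57 - mu) / sigma)                            # P(X <= 57)
print(p1, p2, p3, p4)  # close to the table values 0.0228, 0.1587, 0.2119, 0.9192
```

The tiny discrepancies come only from the 4-decimal rounding in the table.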
2. The mean: The mean, median and mode of a normal distribution are all equal to µ.
3. The variance: The variance is given by V(X) = E(X²) − [E(X)]² = σ².
4. The mode and the median: The mode and median of a normal distribution are also equal to µ. The mode is the value x at which

\frac{d}{dx} f(x) = 0 \quad \text{with} \quad \frac{d^2}{dx^2} f(x) < 0,

whereas the median is derived as the solution of the following expression for m:

\frac{1}{2} = \frac{1}{σ\sqrt{2π}} \int_{-∞}^{m} e^{-\frac{1}{2}\left(\frac{x-µ}{σ}\right)^2}\,dx

Both of the above expressions give µ.
5. The mean deviation: The mean deviation (MD) of a normal distribution is approximately 4/5 of its standard deviation. That is,

MD = \frac{4}{5}σ.
Example 15.
Suppose the diameter at a certain height of trees (in inches) of a certain type is normally distributed with mean 8.8 and standard deviation 2.8, based on data in a 1997 article in the Forest Products Journal.
1. What is the probability that the diameter of a randomly selected tree of this type will exceed 10 inches?
2. What is the probability that the diameter of a randomly selected tree of this type will be between 5 and 10 inches?
Solution:
Here X ∼ N(8.8, 2.8²), so

Z = \frac{X - µ}{σ} = \frac{X - 8.8}{2.8}

Now,

P(X > 10.0) = P\left(Z > \frac{10.0 - 8.8}{2.8}\right)
= P(Z > 0.43)
= 0.5 − P(0 ≤ Z ≤ 0.43)
= 0.5 − 0.1664 = 0.3336
2. P(5.0 ≤ X ≤ 10.0) = ?

P(5.0 ≤ X ≤ 10.0) = P\left(\frac{5.0 - 8.8}{2.8} ≤ Z ≤ \frac{10.0 - 8.8}{2.8}\right)
= P(−1.36 ≤ Z ≤ 0.43)
= 0.4131 + 0.1664 = 0.5795

Sketching densities with shaded areas is of great help, as it makes it easier to figure out where exactly the probability areas are located and what exactly we need to find.
Example 16.
The mean height of soldiers is 68.22 in. with a variance of 10.8 in². Assuming that the heights are normally distributed, how many soldiers in a regiment of 1000 would you expect to be over 6 ft (72 in.)?
Solution:
Let X denote the heights; then X ∼ N(68.22, 10.8) and Z = \frac{X - 68.22}{\sqrt{10.8}}.

P(X > 72.0) = P\left(Z > \frac{72.0 - 68.22}{\sqrt{10.8}}\right)
= P(Z > 1.15)
= 0.5 − P(0 ≤ Z ≤ 1.15)
= 0.5 − 0.3749 = 0.1251

Now, out of 1000 soldiers, 1000 × 0.1251 = 125 are expected to have height exceeding 6 ft.
Normal approximation to the binomial
Let X ∼ Binomial(n, p), with mean np and standard deviation \sqrt{npq}. If p is close to 1/2 (any departure from p = 1/2 results in a skewed binomial histogram) and n is sufficiently large that both np and nq are at least about 10, then we can define the following variable:

Z = \frac{X ± 0.5 - np}{\sqrt{npq}} \sim N(0, 1)

The ±0.5 (the continuity correction) is used to treat X as a continuous rv because, in reality, it is a discrete rv.
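The quality of the approximation can be seen with a quick Python comparison of an exact binomial probability against its continuity-corrected normal approximation (the choice n = 100, p = 0.5 and the cutoff 55 are just illustrative):

```python
from math import comb, erf, sqrt

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 100, 0.5
q = 1 - p

# exact binomial P(X <= 55)
exact = sum(comb(n, x) * p**x * q**(n - x) for x in range(56))

# normal approximation with the 0.5 continuity correction
approx = Phi((55 + 0.5 - n * p) / sqrt(n * p * q))
print(exact, approx)  # the two values agree to about three decimal places
```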