
Chapter 4

Probability and Probability Distributions

03/09/2021 Tesfa S 1
Objectives
At the end of the session students should be able to:

• Define and calculate probabilities

• Understand the different properties of probability

• Recognize common probability distributions

Introduction
Probability
• It is the likelihood, or chance, that an event will occur
when the same experiment is repeated over a very large (or infinite) number of independent trials.
• What does "independent trials" mean?
• Trials are independent if the outcome of one trial doesn't affect the outcome of any other trial.

Introduction
• The Classical Probability Concept:
- If there are n equally likely possibilities, of which m are considered favourable (a "success"), then the probability of a success is m/n.
E.g.: What is the probability of rolling a 6 with a well-balanced die?
In this case, m = 1 and n = 6, so the probability is 1/6 ≈ 0.167

Introduction
• Equally Likely Outcomes Rule
- If all possible outcomes of a random process have the same probability, then
P(A) = (# of outcomes in A) / (# of outcomes in S)
E.g.: If one die is tossed, what is the probability of getting an even number?
P(even number) = |{2, 4, 6}| / |{1, 2, 3, 4, 5, 6}| = 3/6 = 0.5

Introduction
E.g.: A couple wants to have exactly 3 children. Assume that there are no twin births and that a boy and a girl are equally likely at each birth.

• Find the probability that exactly two of them will be boys.

Solution: The possible orderings are S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}

Then the event E = {BBG, BGB, GBB}

P(E) = 3/8 = 0.375
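The enumeration above can be checked mechanically. The following is a minimal sketch, assuming (as the example does) that all eight birth orderings are equally likely; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordering of boy (B) / girl (G) over 3 births,
# assumed equally likely.
sample_space = ["".join(births) for births in product("BG", repeat=3)]

# Event E: exactly two of the three children are boys.
event = [outcome for outcome in sample_space if outcome.count("B") == 2]

# Equally likely outcomes rule: P(E) = |E| / |S|
p_two_boys = Fraction(len(event), len(sample_space))

print(sample_space)
print(event, p_two_boys, float(p_two_boys))
```

Listing the outcomes explicitly keeps the count honest: the favourable outcomes are counted rather than assumed.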

Definitions of common terms in probability

Experiment
• Any well-defined situation or procedure that results in one or more possible outcomes.
E.g.: tossing a coin, rolling a die, a football match, etc.
Outcome
• A result of an experiment.
E.g.: getting either a head or a tail when tossing a coin; winning, losing or drawing in a football match.

Definitions of common terms in probability

Sample space
• It is the complete list of all possible outcomes of an experiment.
Events
• An event is a specific collection of basic outcomes.

Mutually exclusive events and the additive law

• Two events are mutually exclusive if they cannot happen at the same time, i.e., they have no outcomes in common.
• That is, the occurrence of one event precludes the occurrence of the other, and vice versa.
• E.g., getting a head and getting a tail in a single toss of a coin are mutually exclusive.

Additive law

• The additive law for two events (A or B) is:
Pr(A or B) = Pr(A) + Pr(B) − Pr(A ∩ B)
• For mutually exclusive events Pr(A ∩ B) = 0, so the additive law becomes
Pr(A or B) = Pr(A) + Pr(B)

Additive law

Example:
• Among 200 seniors at a certain college, 98 are women, 34 are majoring in Biology, and 20 of the Biology majors are women. If one student is chosen at random from the senior class, what is the probability that the choice will be either a Biology major or a woman?
• Pr(Biology major or woman) = Pr(Biology major) + Pr(woman) − Pr(Biology major and woman) = 34/200 + 98/200 − 20/200 = 112/200 = 0.56
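As a quick check of the arithmetic, here is a small sketch of the additive law applied to the counts in this example. The function name `prob_union` is an illustrative choice, not something defined in the slides.

```python
from fractions import Fraction


def prob_union(p_a: Fraction, p_b: Fraction, p_a_and_b: Fraction) -> Fraction:
    """Additive law: Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B)."""
    return p_a + p_b - p_a_and_b


total_seniors = 200
p_biology = Fraction(34, total_seniors)
p_woman = Fraction(98, total_seniors)
p_biology_and_woman = Fraction(20, total_seniors)

# Subtracting the overlap avoids counting the 20 women Biology majors twice.
p_biology_or_woman = prob_union(p_biology, p_woman, p_biology_and_woman)
print(p_biology_or_woman, float(p_biology_or_woman))
```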

Combinations of Events

• The union of two events A and B is the event that either A or B or both occur.
• The intersection of two events A and B is the event that both A and B occur.

[Figure: Venn diagrams showing event A, the complement of A, the union of A and B, and the intersection of A and B]

Summary of addition rule

Conditional probability and multiplicative law

• Conditional probability applies when the chance of one event depends on the outcome of another event.
E.g.: The chance that a patient with some disease survives to the next year depends on his having survived to the present time.

• The notation is Pr(B/A), read as "the probability that event B occurs given that event A has already occurred."

Conditional probability and multiplicative law

• Let A and B be two events of a sample space S. Then the conditional probability of event A, given that B has already occurred, is:
Pr(A/B) = P(A ∩ B) / P(B), P(B) ≠ 0.
• Similarly, P(B/A) = P(A ∩ B) / P(A), P(A) ≠ 0.
• Rearranged as P(A ∩ B) = P(B/A) P(A), this can be taken as an alternative form of the multiplicative law.

Conditional probabilities
Example:
• Suppose in country X the chance that an infant lives to age 25 is .95, whereas the chance that he lives to age 65 is .65. For the latter, it is understood that to survive to age 65 means to survive both from birth to age 25 and from age 25 to 65. What is the chance that a person 25 years of age survives to age 65?

Notation   Event                                            Probability
A          Survive birth to age 25                          .95
A and B    Survive both birth to age 25 and age 25 to 65    .65
B/A        Survive age 25 to 65 given survival to age 25    ?

Then, Pr(B/A) = Pr(A ∩ B) / Pr(A) = .65/.95 = .684

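The survival calculation is a direct application of the conditional probability formula. A minimal sketch follows; the function name is hypothetical and the guard against a zero denominator mirrors the Pr(A) ≠ 0 condition.

```python
def conditional_probability(p_a_and_b: float, p_a: float) -> float:
    """Pr(B/A) = Pr(A and B) / Pr(A), defined only when Pr(A) is not zero."""
    if p_a <= 0:
        raise ValueError("Pr(A) must be greater than 0")
    return p_a_and_b / p_a


p_survive_to_25 = 0.95   # Pr(A)
p_survive_to_65 = 0.65   # Pr(A and B): survive birth to 25 and 25 to 65

p_survive_25_to_65 = conditional_probability(p_survive_to_65, p_survive_to_25)
print(round(p_survive_25_to_65, 3))
```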
Independent Events
• Two events are independent if the occurrence of one does not affect the occurrence of the other.
• If events A and B are independent,
Pr(B/A) = P(B); Pr(A/B) = P(A).
• Example: The sex of successive offspring.
- The chance of a male is approximately ½.
- Regardless of the sexes of previous offspring, the chance that the next child is a male is still ½.

Independent Events
• When two events are independent, the multiplicative law becomes:
Pr(A and B) = Pr(A) Pr(B)

• Exercise
Consider drawing two cards, one after the other, from a deck of 52 cards. What is the probability that both cards will be spades?
A) with replacement
B) without replacement

Independent vs. Non-independent Events

• If A and B are independent, then

P(A and B) = P(A) × P(B)

which means that the conditional probability is:
P(B | A) = P(A and B) / P(A) = P(A)P(B)/P(A) = P(B)

• We have a more general multiplication rule for events that are not independent:

P(A and B) = P(B | A) × P(A)
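The difference between the two multiplication rules shows up when drawing two cards from a 52-card deck and asking for two spades. This is a sketch under the usual assumption of 13 spades in a well-shuffled deck; the variable names are illustrative.

```python
from fractions import Fraction

spades, deck = 13, 52

# With replacement the draws are independent:
# P(A and B) = P(A) * P(B)
p_both_with_replacement = Fraction(spades, deck) * Fraction(spades, deck)

# Without replacement the second draw depends on the first:
# P(A and B) = P(A) * P(B | A), with one spade and one card fewer
p_both_without_replacement = Fraction(spades, deck) * Fraction(spades - 1, deck - 1)

print(p_both_with_replacement, float(p_both_with_replacement))
print(p_both_without_replacement, float(p_both_without_replacement))
```

Using `Fraction` keeps the results exact, so the two cases can be compared without rounding noise.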

Summary of multiplication rule

Basic Properties of probability
• Probability always ranges from 0 to 1; i.e., 0 ≤ Pr(A) ≤ 1
• If an event is certain to occur, its probability is 1.
• If an event is certain not to occur, its probability is 0.
• The sum of the probabilities that an event will occur and that it will not occur is equal to 1; i.e.,
P(A) = 1 – P(not A)
• If two events are mutually exclusive, then
Pr(A or B) = Pr(A) + Pr(B)
• If A and B are two independent events, then
Pr(A and B) = Pr(A) Pr(B)

Probability distributions
• A random variable is a variable that can take more than one value, each with a given probability.
• A random variable can be either discrete or continuous.
• A random variable is discrete if there are always gaps between its possible values.
• A random variable is continuous if it can take any value between any two possible values (no gaps).

Probability distributions
• The probability distribution is a table, graph, or formula that shows the probabilities associated with the different values, or ranges of values, of the random variable.
• Each probability in a probability distribution ranges from 0 to 1.
• Since a random variable must take one of its possible values, the probabilities in a probability distribution must sum to 1.

Probability distributions
Example: Toss a coin 3 times.
• Let X be the number of heads obtained. Find the probability distribution of X.
• f(xi) = Pr(X = xi), xi = 0, 1, 2, 3.
Pr(X = 0) = 1/8 ........ TTT
Pr(X = 1) = 3/8 ........ HTT THT TTH
Pr(X = 2) = 3/8 ........ HHT THH HTH
Pr(X = 3) = 1/8 ........ HHH

Probability distribution of X:
X = xi        0     1     2     3
Pr(X = xi)    1/8   3/8   3/8   1/8
• The required conditions are also satisfied:
- f(xi) ≥ 0
- Σ f(xi) = 1

The Binomial distribution
• It is one of the most widely encountered discrete distributions.
• It originates from Bernoulli trials.
• A Bernoulli trial is a single trial of an experiment that results in only one of two mutually exclusive outcomes
- success or failure; dead or alive; male or female

The Binomial distribution
• Suppose an event has binary outcomes A and B, where
- the probability of A is p,
- the probability of B is 1 − p,
- the probability p stays the same each time the event occurs,
- the outcome is independent from one trial to another.
• Then the probability that outcome A occurs exactly x times in n trials is
$P(x) = \binom{n}{x}\, p^{x} (1-p)^{n-x}$, for x = 0, 1, 2, …, n

The Binomial distribution
• Where $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$, n! = n × (n − 1) × … × 1, and 0! = 1.
• n and p are the binomial parameters that specify the binomial distribution.

Characteristics of Binomial distribution

• The experiment consists of n identical trials.
• There are only two possible outcomes in each trial.
• The probability of A remains the same from trial to trial.
• The probability of A is denoted by p, and the probability of B is denoted by q. Note that q = 1 − p.
• The trials are independent.
• The binomial random variable X is the number of successes (A) in n trials.
• The mean is np and the variance is np(1 − p).

Example
• Each child born to a particular set of parents has a probability of 0.25 of having blood type O. If these parents have 5 children, what is the probability that
a. Exactly two of them have blood type O
b. At most 2 have blood type O
c. At least 4 have blood type O
d. Exactly 2 do not have blood type O.

Solution
Let X be the number of children with blood type O.
• X ~ B(5, 0.25)
a.) $P(X=2) = \binom{5}{2} (0.25)^{2} (1-0.25)^{5-2} = 0.2637$
b.) P(X≤2) = P(X=0) + P(X=1) + P(X=2)
= 0.2373 + 0.3955 + 0.2637 = 0.8965
c.) P(X≥4) = 1 − P(X≤3) = 1 − 0.9844 = 0.0156
or P(X≥4) = P(X=4) + P(X=5)
= 0.0146 + 0.0010 = 0.0156

Solution
d.) If exactly 2 do not have blood type O, then exactly 3 do, so P(X=3) = 0.0879.
Or, let NX be the number of children who don't have blood type O. Then NX ~ B(5, 0.75) and
P(NX=2) = 0.0879
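The four answers can be reproduced with a few lines of code. This is a sketch that implements the binomial formula directly with the standard library; `binomial_pmf` is an illustrative helper name, not part of the slides.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.25  # 5 children, each with probability 0.25 of blood type O

p_exactly_2 = binomial_pmf(2, n, p)                             # part a
p_at_most_2 = sum(binomial_pmf(k, n, p) for k in range(0, 3))   # part b
p_at_least_4 = sum(binomial_pmf(k, n, p) for k in (4, 5))       # part c
p_two_without_o = binomial_pmf(2, n, 1 - p)                     # part d, NX ~ B(5, 0.75)

for label, value in [("a", p_exactly_2), ("b", p_at_most_2),
                     ("c", p_at_least_4), ("d", p_two_without_o)]:
    print(label, round(value, 4))
```

Part d is computed from the complementary variable NX; it agrees with `binomial_pmf(3, n, p)` because "exactly 2 without" and "exactly 3 with" describe the same event.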

Binomial distribution: example

• If I toss a coin 20 times, what's the probability of getting 2 or fewer heads?

$\binom{20}{0} (.5)^{0} (.5)^{20} = \frac{20!}{20!\,0!} (.5)^{20} = 9.5 \times 10^{-7}$
$\binom{20}{1} (.5)^{1} (.5)^{19} = \frac{20!}{19!\,1!} (.5)^{20} = 20 \times 9.5 \times 10^{-7} = 1.9 \times 10^{-5}$
$\binom{20}{2} (.5)^{2} (.5)^{18} = \frac{20!}{18!\,2!} (.5)^{20} = 190 \times 9.5 \times 10^{-7} = 1.8 \times 10^{-4}$
Sum: $9.5 \times 10^{-7} + 1.9 \times 10^{-5} + 1.8 \times 10^{-4} \approx 2.0 \times 10^{-4}$
Practice Problem:
You are conducting a case-control study of smoking
and lung cancer. If the probability of being a
smoker among lung cancer cases is .6, what’s the
probability that in a group of 8 cases you have:

a. Less than 2 smokers?


b. More than 5?
c. What are the expected value and variance of the number of
smokers?

Answer
X    P(X)
0    $1(.4)^{8} = .00065$
1    $8(.6)^{1}(.4)^{7} = .008$
2    $28(.6)^{2}(.4)^{6} = .04$
3    $56(.6)^{3}(.4)^{5} = .12$
4    $70(.6)^{4}(.4)^{4} = .23$
5    $56(.6)^{5}(.4)^{3} = .28$
6    $28(.6)^{6}(.4)^{2} = .21$
7    $8(.6)^{7}(.4)^{1} = .090$
8    $1(.6)^{8} = .0168$

Answer, continued

P(<2) = .00065 + .008 = .00865
P(>5) = .21 + .09 + .0168 = .3168

E(X) = 8 (.6) = 4.8
Var(X) = 8 (.6) (.4) = 1.92
StdDev(X) = √1.92 ≈ 1.39
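The same answers can be obtained by building the whole distribution for n = 8, p = 0.6 and summing over it. This is a sketch using only the standard library. Because it keeps full precision, the tail probabilities come out slightly different from the hand calculation, which adds terms already rounded to two or three figures.

```python
from math import comb, sqrt

n, p = 8, 0.6  # 8 lung cancer cases, each a smoker with probability 0.6

# Full probability distribution: pmf[x] = P(X = x) for x = 0..8
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

p_less_than_2 = sum(pmf[:2])   # P(X = 0) + P(X = 1)
p_more_than_5 = sum(pmf[6:])   # P(X = 6) + P(X = 7) + P(X = 8)

# Mean and variance computed from the distribution itself;
# they should agree with the shortcuts np and np(1 - p).
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
std_dev = sqrt(variance)

print(round(p_less_than_2, 5), round(p_more_than_5, 4))
print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```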

Exercise

What's the probability of getting exactly 5 heads in 10 coin tosses?

a. $\binom{10}{0} (.50)^{5} (.50)^{5}$
b. $\binom{10}{5} (.50)^{5} (.50)^{5}$
c. $\binom{10}{5} (.50)^{10} (.50)^{5}$
d. $\binom{10}{10} (.50)^{10} (.50)^{0}$

The Poisson distribution

• A discrete probability distribution that applies to the number of occurrences of some event over a specified interval of time (or space).
• Usually used for rare events
• Examples
- Daily number of new HIV cases notified to an HIV registry
- Number of abnormal cells in a fixed area of histological slides from a series of liver biopsies

The Poisson distribution
• Suppose events happen randomly and independently in time at a constant rate. If events happen at a rate of λ events per unit time, the probability of x events happening in unit time is
$P(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$, for x = 0, 1, 2, …, where e ≈ 2.71828

Example
• The daily number of new registrations of HIV is 2.2 on average.
What is the probability, on a given day, of
a) Getting no new cases
b) Getting 1 case
c) Getting 2 cases
d) Getting 3 cases
e) Getting 4 cases

Solution
a) $P(X=0) = \frac{e^{-2.2} (2.2)^{0}}{0!} = 0.111$
b) P(X=1) = 0.244
c) P(X=2) = 0.268
d) P(X=3) = 0.197
e) P(X=4) = 0.108
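These five values follow from the Poisson formula with λ = 2.2. A minimal sketch using only the standard library is shown below; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with rate lam: e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)


daily_rate = 2.2  # average number of new HIV registrations per day

probabilities = {x: round(poisson_pmf(x, daily_rate), 3) for x in range(5)}
for x, prob in probabilities.items():
    print(f"P(X={x}) = {prob}")
```

The five probabilities sum to less than 1 because days with 5 or more registrations are also possible.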

Characteristics of Poisson distribution
• The random variable X is the number of occurrences of an event over some interval.
• The occurrences must be random.
• The occurrences must be independent of each other.
• The occurrences must be uniformly distributed over the interval being used.

Characteristics of Poisson distribution
• The Poisson distribution is very asymmetric when its mean is small.
• With large means it becomes nearly symmetric.
• It has no theoretical maximum value, but the probabilities tail off towards zero very quickly.
• λ is the parameter of the Poisson distribution.
• The mean is λ and the variance is also λ.

Exercise
• Suppose that in a certain malarious area past experience indicates that the probability that a person with a high fever will be positive for malaria is 0.7. Consider 3 randomly selected patients (with high fever) in that same area.

a) What is the probability that no patient will be positive for malaria?
b) What is the probability that exactly one patient will be positive for malaria?
c) What is the probability that exactly two of the patients will be positive for malaria?
d) What is the probability that all patients will be positive for malaria?

The Normal distribution
• It is by far the most important probability distribution in statistics.
• It is also sometimes known as the Gaussian distribution.
• The distributions of many medical measurements in populations approximately follow a normal distribution (e.g., serum uric acid levels, cholesterol levels, blood pressure, height, weight, etc.).
• The normal distribution is a theoretical, continuous probability distribution whose equation is:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$, for −∞ < x < +∞

Characteristics of the Normal Distribution

• It is a probability distribution of a continuous variable. It extends from −∞ to +∞.
• It is unimodal, bell-shaped and symmetrical about x = μ.
• The mean, the median and the mode are all equal.
• The total area under the curve above the x-axis is one square unit.
• The curve never touches the x-axis.
• It is determined by two quantities: its mean (μ) and SD (σ).
• An observation from a normal distribution can be related to a standard normal distribution (SND) which has a published table.

Effects of μ and σ

Understanding of ‘σ’

Standard normal distribution
• The values of μ and σ depend on the particular problem in hand, and tables of the normal distribution cannot be published for all values of μ and σ.
• Calculations are therefore made by referring to the standard normal distribution, which has μ = 0 and σ = 1.
• An observation x from a normal distribution with mean μ and standard deviation σ can be related to a standard normal distribution by calculating:
SND = Z = (x − μ) / σ

Area under any Normal curve
• To find the area under a normal curve with mean μ and standard deviation σ between x = a and x = b:
- Find the Z scores corresponding to a and b (call them Z1 and Z2), and
- Find the area under the standard normal curve between Z1 and Z2 from the table.
• Z-score or Z-value: the number of standard deviation units by which an observation lies from the mean.
E.g.: Assume a distribution has a mean of 70 and a standard deviation of 10.
• How many standard deviation units above the mean is a score of 80?
Z = (80 − 70) / 10 = 1
• How many standard deviation units above the mean is a score of 83?
Z = (83 − 70) / 10 = 1.3
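The standardisation step is a one-line calculation. The sketch below wraps it in a small function (the name `z_score` is illustrative) and applies it to the two scores above.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


z_80 = z_score(80, mean=70, sd=10)
z_83 = z_score(83, mean=70, sd=10)
print(z_80, z_83)
```

A negative result simply means the observation lies below the mean, for example `z_score(60, 70, 10)`.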

Area under any Normal curve
Example: Suppose a borderline hypertensive is defined as a person whose DBP is between 90 and 95 mm Hg inclusive, and the subjects are 35-44-year-old males whose DBP is normally distributed with mean 80 and variance 144. What is the probability that a randomly selected person from this population will be a borderline hypertensive?
Solution: Let X be DBP, X ~ N(80, 144), so σ = √144 = 12
P(90 < X < 95) = P((90 − 80)/12 < Z < (95 − 80)/12) = P(0.83 < Z < 1.25)
= P(Z > 0.83) − P(Z > 1.25)
= 0.2033 − 0.1056 = 0.0977
Also, in a slightly different way, = P(Z < 1.25) − P(Z < 0.83)
= 0.8944 − 0.7967 = 0.0977
Thus, approximately 9.8% of this population will be borderline hypertensive.
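The table lookup can be replaced by computing the standard normal cumulative probability from the error function. This is a sketch using only the standard library, with illustrative names. Because it does not round the Z scores to two decimals before the lookup, its result is close to, but not identical with, the table-based value.

```python
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """P(Z < z) for the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


mean, variance = 80.0, 144.0
sd = sqrt(variance)  # the normal distribution is specified here by its variance

z_low = (90 - mean) / sd
z_high = (95 - mean) / sd

# Area between the two Z scores = difference of cumulative probabilities
p_borderline = standard_normal_cdf(z_high) - standard_normal_cdf(z_low)
print(round(z_low, 2), round(z_high, 2), round(p_borderline, 4))
```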

Table of standard normal distribution

Example 2:
• Suppose we want to compute the area under the standard normal curve to the right of Z = 1.45.
• The tabulated value is the area under the curve between the mean (Z = 0) and the given Z.
• It is read from the table by combining the value 1.4 in the first column with 0.05 in the first row.
• The area of this shaded portion is 0.4265 (or 42.65% of the total area), lying between the mean and Z = 1.45.
• Thus the area to the right of 1.45 is
0.5 − 0.4265 = 0.0735

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890

Example

• Find the area under the standard normal curve for the following, using the z-table:
a. between z = 0 and z = 0.78
b. between z = -0.56 and z = 0
c. between z = -0.43 and z = 0.78
d. between z = 0.44 and z = 1.50
e. to the right of z = -1.33.
Solution:
a. Pr(0 ≤ z ≤ 0.78) = 0.2823
b. Pr(-0.56 ≤ z ≤ 0) = Pr(0 ≤ z ≤ 0.56) = 0.2123

c. Pr(-0.43 ≤ z ≤ 0.78) = Pr(-0.43 ≤ z ≤ 0) + Pr(0 ≤ z ≤ 0.78)
= Pr(0 ≤ z ≤ 0.43) + Pr(0 ≤ z ≤ 0.78)
= 0.1664 + 0.2823 = 0.4487
d. Pr(0.44 ≤ z ≤ 1.50)
= Pr(0 ≤ z ≤ 1.50) − Pr(0 ≤ z ≤ 0.44) = 0.4332 − 0.1700
= 0.2632
e. Pr(-1.33 ≤ z) = 0.5 + Pr(0 ≤ z ≤ 1.33)
= 0.5 + 0.4082 = 0.9082
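The table-based areas can be cross-checked numerically. The sketch below computes the tabulated quantity (the area between 0 and z) and the area between two z values from the error function; the function names are illustrative, and small differences in the fourth decimal are expected because the table entries are rounded.

```python
from math import erf, sqrt


def cdf(z: float) -> float:
    """P(Z < z) for the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def area_zero_to(z: float) -> float:
    """Area between the mean (z = 0) and z: the value listed in the table."""
    return abs(cdf(z) - 0.5)


def area_between(a: float, b: float) -> float:
    """Area under the standard normal curve between z = a and z = b (a <= b)."""
    return cdf(b) - cdf(a)


def area_right_of(z: float) -> float:
    """Area under the standard normal curve to the right of z."""
    return 1.0 - cdf(z)


print(round(area_zero_to(0.78), 4))
print(round(area_zero_to(-0.56), 4))
print(round(area_between(-0.43, 0.78), 4))
print(round(area_between(0.44, 1.50), 4))
print(round(area_right_of(-1.33), 4))
```

By symmetry, `area_zero_to(-0.56)` equals `area_zero_to(0.56)`, which is why the table only needs non-negative z values.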

Exercise
Assume the DBP of a population is normally distributed with a mean of 70 and a standard deviation of 10.
a) What is the probability that a randomly selected person from this population will have DBP below 90 mm Hg?

b) What is the probability that a randomly selected person from this population will have DBP above 95 mm Hg?

c) Add the probabilities you found for the non-hypertensive, borderline hypertensive and hypertensive individuals. What do you learn from your answers?

Thank you

Quiz
1. Which pdf is used for continuous data?
A. Bernoulli B. Binomial C. Poisson D. Normal
2. The parameter of the Poisson distribution is?
A. n B. p C.  D. none
3. What are independent events?
4. What are mutually exclusive events?
5. Find Pr(Z < -1.96)
