L5 Probability & Probability Distribution

Probability and Probability Distributions
By: Nigussie Y (BSc, MPH/Epid & Biost, Assistant Professor)
February 11, 2023

• Probability is a measure of how likely it is for an event to happen.
• Probability is the language of chance.
• Probability theory was developed out of attempts to solve problems related to games of chance, such as tossing a coin or rolling a die,
  i.e. trying to quantify personal beliefs regarding degrees of uncertainty.
• Because medicine is an inexact science, physicians can seldom predict an outcome with absolute certainty.
• E.g. to formulate a diagnosis, a physician must rely on available diagnostic information about a patient:
  – History and physical examination
  – Laboratory studies, X-ray findings, ECG, etc.
• Although no test result is absolutely accurate, it does affect the probability of the presence or absence of a disease.
• Probability theory allows us to draw conclusions about a population of patients based on known information about a sample of patients drawn from that population.
Definition:
• Classical probability: An event's probability is the ratio of the number of favorable outcomes (m) to the number of possible outcomes (n) in an experiment whose outcomes are equally likely: Pr(event) = m/n.
  – m = number of favorable outcomes (successes); n = number of possible outcomes
E.g. the probability of getting a head when tossing a fair coin is 1/2 = 0.5.

Examples:
1. Roll a fair die
2. Select a SRS (simple random sample) of size 2 from a population
• Relative frequency probability: The probability of an event is the proportion of times it occurs when exactly the same experiment is repeated a very large number of times in independent trials.
Probability: Relative Frequency
An estimate of the probability of an event can be obtained by looking back at experimental or statistical data to obtain its relative frequency.

Relative frequency = (number of times that the event occurs) / (number of trials)

E.g. results of 250 trials:

No.     Frequency   Relative frequency
1       25          25/250 = 0.1
2       34          34/250 = 0.136
3       32          32/250 = 0.128
4       30          30/250 = 0.12
5       34          34/250 = 0.136
6       95          95/250 = 0.38
Total   250         1
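As a minimal sketch, the relative frequencies in the table above can be recomputed in Python. The names `counts` and `relative_frequencies` are illustrative assumptions and are not part of the original slides.

```python
from fractions import Fraction

# Observed counts for outcomes 1..6 over 250 trials (values from the table above).
counts = {1: 25, 2: 34, 3: 32, 4: 30, 5: 34, 6: 95}

def relative_frequencies(counts):
    """Return each outcome's count divided by the total number of trials."""
    total = sum(counts.values())
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}

rel = relative_frequencies(counts)
for outcome, p in rel.items():
    print(outcome, float(p))
```

Exact fractions are used so that the relative frequencies add up to exactly 1, which is a quick check that no trial was lost in the tally.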
 Subjective probability: A subjective probability is an individual’s
degree of belief in the occurrence of an event.
E.g. If someone says that he is 95% certain that a cure for AIDS will be discovered within 5 years, then he means that
Pr(discovery of a cure for AIDS within 5 years) = 95%.
Basic terms
Experiment: Any activity from which results are obtained.
A random experiment is one in which the outcomes cannot be predicted with certainty.
Examples:
1. Flip a coin, flip a coin 3 times, roll a die
2. Draw a SRS of size 50 from a population
Trial: A physical action, the result of which cannot be predetermined.
Basic outcome (o): A possible outcome of the experiment.
Sample space: The set of all possible outcomes of an experiment.
Events: Collections of outcomes from the sample space.

Venn diagram: Graphical representation of the sample space and events.
Mutually exclusive events and the additive law:
• Two events A and B are mutually exclusive if they have no elements in common.
• Thus, if A and B are mutually exclusive events, Pr(A or B) = Pr(A) + Pr(B).
E.g. One die is rolled. Sample space = S = {1, 2, 3, 4, 5, 6}

• Let A = the event an odd number turns up; A = {1, 3, 5}
• Let B = the event 1, 2 or 3 turns up; B = {1, 2, 3}
• Let C = the event 2 turns up; C = {2}

i. Find Pr(A), Pr(B) and Pr(C)
ii. Are A and B; A and C; B and C mutually exclusive?
Answers:
i. Pr(A) = Pr(1) + Pr(3) + Pr(5) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
   Pr(B) = Pr(1) + Pr(2) + Pr(3) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
   Pr(C) = Pr(2) = 1/6
ii. • A and B are not mutually exclusive, because they have the elements 1 and 3 in common.
    • Similarly, B and C are not mutually exclusive; they have the element 2 in common.
    • A and C are mutually exclusive; they have no element in common.
When A and B are not mutually exclusive, Pr(A or B) = Pr(A) + Pr(B) – Pr(A and B).
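The die example can be checked with a short Python sketch that treats events as sets. The helper names `pr` and `mutually_exclusive` are assumptions made for this illustration only.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die
A = {1, 3, 5}            # an odd number turns up
B = {1, 2, 3}            # 1, 2 or 3 turns up
C = {2}                  # 2 turns up

def pr(event, space=S):
    """Classical probability: favorable outcomes / possible outcomes."""
    return Fraction(len(event & space), len(space))

def mutually_exclusive(e1, e2):
    """Two events are mutually exclusive if they share no outcomes."""
    return e1.isdisjoint(e2)

# Additive law for events that are not mutually exclusive.
lhs = pr(A | B)
rhs = pr(A) + pr(B) - pr(A & B)

print(pr(A), pr(B), pr(C))   # 1/2 1/2 1/6
print(mutually_exclusive(A, B), mutually_exclusive(A, C), mutually_exclusive(B, C))  # False True False
print(lhs, rhs)              # 2/3 2/3
```

Set union (`|`) plays the role of "A or B" and set intersection (`&`) the role of "A and B", so the additive law can be verified directly.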
Conditional probabilities and the multiplicative law:
• If the chance of a particular event depends on the outcome of some other event, the two events are not independent (they are dependent events).
• The notation is Pr(B/A), which is read as "the probability that event B occurs given that event A has already occurred."
• Let A and B be two events of a sample space S.
• The conditional probability of an event A given B is denoted by Pr(A/B) and defined as
  Pr(A/B) = Pr(A ∩ B) / Pr(B), Pr(B) ≠ 0.
• Similarly, Pr(B/A) = Pr(A ∩ B) / Pr(A), Pr(A) ≠ 0.
• This can be taken as an alternative form of the multiplicative law, Pr(A ∩ B) = Pr(A) Pr(B/A) = Pr(B) Pr(A/B).

E.g. Suppose in country X the chance that an infant lives to age 25 is 0.95, whereas the chance that he lives to age 65 is 0.65.
• For the latter, it is understood that to survive to age 65 means to survive both from birth to age 25 and from age 25 to 65.
• Hence Pr(lives to 65 / lives to 25) = 0.65 / 0.95 ≈ 0.68.
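A small Python sketch of the survival example, assuming a hypothetical helper `conditional` that applies the definition Pr(A/B) = Pr(A and B) / Pr(B). Surviving to 65 implies surviving to 25, so Pr(lives to 25 and lives to 65) is simply Pr(lives to 65).

```python
def conditional(p_a_and_b, p_b):
    """Pr(A/B) = Pr(A and B) / Pr(B), defined only when Pr(B) != 0."""
    if p_b == 0:
        raise ValueError("Pr(B) must be non-zero")
    return p_a_and_b / p_b

p_live_25 = 0.95   # Pr(lives to age 25)
p_live_65 = 0.65   # Pr(lives to age 65) = Pr(lives to 25 and lives to 65)

p_65_given_25 = conditional(p_live_65, p_live_25)
print(round(p_65_given_25, 3))   # 0.684

# Multiplicative law: Pr(A and B) = Pr(B) * Pr(A/B)
assert abs(p_live_25 * p_65_given_25 - p_live_65) < 1e-12
```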
Independent Events
• Two events are independent if the occurrence or nonoccurrence of one event does not in any way affect the occurrence or nonoccurrence of the other event.
• Thus, if events A and B are independent, Pr(B/A) = Pr(B) and Pr(A/B) = Pr(A).
• With independent events, the multiplicative law becomes:
  Pr(A and B) = Pr(A) Pr(B)
• Hence, Pr(A) = Pr(A and B) / Pr(B), where Pr(B) ≠ 0,
  and Pr(B) = Pr(A and B) / Pr(A), where Pr(A) ≠ 0.

Properties of probability
1. Probabilities are real numbers on the interval from 0 to 1.
2. If two events A and B are mutually exclusive, then Pr(A or B) = Pr(A) + Pr(B).
3. If A and B are two events that are not mutually exclusive, then Pr(A or B) = Pr(A) + Pr(B) – Pr(A and B).
4. The sum of the probabilities that an event will occur and that it will not occur is equal to 1; hence, Pr(A') = 1 – Pr(A).
5. If A and B are two independent events, then Pr(A and B) = Pr(A) Pr(B).
Probability distribution
• The term probability distribution refers to the collection of all possible outcomes along with their probabilities.
• Every random variable has a corresponding probability distribution.
• A probability distribution of a random variable can be displayed by a table, a graph or a mathematical formula.
• With categorical variables we obtain the frequency distribution of each variable.
E.g. Toss a coin 3 times.
Let X be the number of heads obtained.
Find the probability distribution of X.

• f(xi) = Pr(X = xi), xi = 0, 1, 2, 3.
• Pr(X = 0) = 1/8 ..... TTT
• Pr(X = 1) = 3/8 ..... HTT THT TTH
• Pr(X = 2) = 3/8 ..... HHT THH HTH
• Pr(X = 3) = 1/8 ..... HHH

• Probability distribution of X (i.e. the probability distribution of the number of heads):

X = xi       0     1     2     3
Pr(X = xi)   1/8   3/8   3/8   1/8
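The distribution above can be reproduced by listing every equally likely sequence of tosses and counting heads. This is only a sketch; the function name `heads_distribution` is an assumption made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(n_tosses):
    """Enumerate all equally likely H/T sequences and tally the number of heads."""
    outcomes = list(product("HT", repeat=n_tosses))       # 2**n_tosses sequences
    tally = Counter(seq.count("H") for seq in outcomes)   # heads -> number of sequences
    return {x: Fraction(tally[x], len(outcomes)) for x in sorted(tally)}

dist = heads_distribution(3)
for x, p in dist.items():
    print(x, p)
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8
```

Because each sequence has probability 1/8, the probability of x heads is the number of sequences with x heads divided by 8, which is the classical m/n definition.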

The binomial distribution
• The binomial distribution is the probability distribution that results from doing a "binomial experiment".
• A process that has only two possible outcomes is called a binomial process.
• In statistics the two outcomes are frequently denoted as success and failure.
• The probabilities of a success and a failure are denoted by p and q, respectively. Note that p + q = 1.
Binomial assumptions:
1) The same experiment is carried out n times (n trials are made).
2) Each trial has two possible outcomes (usually called "success" and "failure").
   • If p is the probability of success in one trial, then 1 – p is the probability of failure.
3) The result of each trial is independent of the result of any other trial.
• For a single trial:
  – Pr(success) = Pr(X = 1) = p
  – Pr(failure) = Pr(X = 0) = 1 – p
• If the binomial assumptions are satisfied, the probability of x successes in n trials is:

$$\Pr(X = x) = \frac{n!}{x!\,(n-x)!}\; p^{x} (1-p)^{n-x}, \qquad x = 0, 1, \dots, n$$
Example:
• Suppose that in a certain population 52% of all recorded births are males.
• If we randomly select 10 birth records, what is the probability that:
  – exactly 5 will be males?
    Let X = number of males, with n = 10 and p = 0.52:

$$\Pr(X = 5) = \frac{10!}{5!\,(10-5)!}\,(0.52)^{5}\,(1-0.52)^{10-5} = 0.24$$

  – 3 or more will be females?
    Let X = number of females, with n = 10 and p = 1 – 0.52 = 0.48:
    Pr(X ≥ 3) = 1 – Pr(X < 3) = 1 – [Pr(X = 0) + Pr(X = 1) + Pr(X = 2)]
    = 1 – [0.0014 + 0.0133 + 0.0554] ≈ 1 – 0.070 = 0.930
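The two birth-record probabilities can be recomputed with a short Python sketch of the binomial formula. The function name `binomial_pmf` is an assumption for this illustration; `math.comb` supplies the n!/(x!(n-x)!) term.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr(X = x) for x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 10
p_male = 0.52
p_female = 1 - p_male

# Exactly 5 males among 10 birth records.
p_five_males = binomial_pmf(5, n, p_male)

# 3 or more females: complement of 0, 1 or 2 females.
p_three_plus_females = 1 - sum(binomial_pmf(k, n, p_female) for k in range(3))

print(round(p_five_males, 2))           # 0.24
print(round(p_three_plus_females, 3))   # 0.93
```

Using the complement keeps the second calculation to three terms instead of the eight terms for X = 3, ..., 10.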
I. Probability distribution of a categorical variable
• Specifies all possible outcomes of the categorical variable along with the probability that each will occur.
E.g. Consider the value on the face showing up when a die is tossed.
The probability distribution of this variable is:

Value on face   1     2     3     4     5     6
Probability     1/6   1/6   1/6   1/6   1/6   1/6

• Notice that the total probability is 1.

II. Probability distribution of a discrete variable
• Probability distributions can be estimated from relative frequencies. Consider the number of televisions per household (X) from US survey data.
  – Each probability is a relative frequency, e.g. 1,218 ÷ 101,501 = 0.012.
  – E.g. P(X = 4) = P(4) = 0.076 = 7.6%
III. Probability distribution of continuous variables
• They are called probability density functions (pdfs).
• The probability density function of the random variable X is the curve such that the area under the curve between any two points a and b is equal to the probability that X falls between a and b.
• Thus, the total area under the curve over the possible range of values for the random variable is 1.
E.g. Suppose X represents the continuous variable 'height'; rarely is an individual exactly 170 cm tall.
– X can assume an infinite number of intermediate values: 170.1, 170.2, 170.3, etc.
• Because a continuous random variable X can take on an infinite number of values, the probability associated with any one particular value is zero.
• However, the probability that X will assume some value in the interval enclosed by two values, say x1 and x2, is the area under the curve between x1 and x2.
Properties of continuous probability distributions:
1. Area under the curve = 1.
2. P(X = a) = 0, where a is a constant.
3. Area between two points a and b = P(a < X < b).

The Normal Distribution
• It is a probability distribution of continuous variables and has an especially important role in statistics.
Characteristics of the normal distribution:
1) It extends from minus infinity (–∞) to plus infinity (+∞).
2) It is uni-modal, bell-shaped and symmetrical about x = μ.
3) It is determined by two quantities: its mean (μ) and SD (σ).
4) The height of the frequency curve cannot be taken as the probability of a particular value.
• The formula that generates the normal probability distribution is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \qquad \text{for } -\infty < x < \infty$$

where e ≈ 2.7183 and π ≈ 3.1416, and μ and σ are the population mean and standard deviation.
• The shape and location of the normal curve change as the mean and standard deviation change.
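As a rough sketch, the normal density can be written directly from the formula and its total area checked numerically. The function names `normal_pdf` and `area`, the trapezoidal rule, and the values μ = 70 and σ = 10 are assumptions chosen only for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))

def area(a, b, mu=0.0, sigma=1.0, steps=10_000):
    """Approximate the area under the curve between a and b (trapezoidal rule)."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    total += sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    return total * h

mu, sigma = 70, 10   # illustrative parameters
print(round(area(mu - 6 * sigma, mu + 6 * sigma, mu, sigma), 4))   # 1.0
print(round(normal_pdf(mu, mu, sigma), 4))                         # 0.0399
```

The second value is the height of the curve at the mean, not a probability: probabilities come only from areas under the curve.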
The standard normal distribution
• To find P(c < X < d), we need to find the area under the appropriate normal curve:

$$P(c < X < d) = \int_{c}^{d} \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx$$

• To simplify the tabulation of these areas, we standardize each value of x by expressing it as a z-score, the number of standard deviations it lies from the mean:

$$z = \frac{x - \mu}{\sigma}$$

• Since a normal distribution can have any of an infinite number of possible values for its mean and SD, it is impossible to tabulate the areas for each and every normal curve.
• Instead, only a single curve, for which μ = 0 and σ = 1, is tabulated.
• This curve is called the standard normal distribution (SND).

The Standard Normal (z) Distribution
• Mean = 0; standard deviation = 1
• When x = mean, z = 0
• Symmetric about z = 0
• Values of z to the left of center are negative
• Values of z to the right of center are positive
• Total area under the curve is 1.

[Figure: Percentage of area under a normal distribution with mean μ and standard deviation σ; horizontal axis marked at μ–3σ, μ–2σ, μ–σ, μ, μ+σ, μ+2σ, μ+3σ]
For any normal distribution:
• about 68% (most) of the observations are contained within one SD of the mean,
• about 95% (the majority) within two SDs of the mean, and
• about 99.7% (almost all) within three SDs of the mean.
• Assume a distribution has a mean of 70 and a standard deviation of 10.
• How many standard deviation units above the mean is a score of 80?
  Answer: z = (80 – 70) / 10 = 1
• How many standard deviation units above the mean is a score of 83?
  Answer: z = (83 – 70) / 10 = 1.3
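A brief Python sketch of the z-score calculation, together with a check of the areas within one, two and three SDs of the mean. The helper `z_score` is a name assumed for this illustration; `statistics.NormalDist` is part of the Python standard library (3.8+).

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

print(z_score(80, 70, 10))   # 1.0
print(z_score(83, 70, 10))   # 1.3

# Area within k standard deviations of the mean, for any normal distribution.
std_normal = NormalDist(mu=0, sigma=1)
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, p in within.items():
    print(k, round(p, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```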

Using normal table
The four-digit probability in a particular row and column of Table 1 gives the area under the z curve to the left of that particular value of z.
[Figure: Area for z = 1.36]

Example
Use Table 1 to calculate these probabilities:
P(z ≤ 1.36) = .9131
P(z > 1.36) = 1 – .9131 = .0869
P(–1.20 ≤ z ≤ 1.36) = .9131 – .1151 = .7980
• From the symmetry properties of the standard normal distribution,
  P(Z ≤ –x) = P(Z ≥ x) = 1 – P(Z ≤ x)
• Hence, P(–1 < Z < +1) = 0.6827,
  P(–1.96 < Z < +1.96) = 0.95 and
  P(–2.576 < Z < +2.576) = 0.99

Example
The weights of packages of ground beef are normally distributed with mean 1 pound and standard deviation 0.10 pound. What is the probability that a randomly selected package weighs between 0.80 and 0.85 pounds?
P(.80 ≤ x ≤ .85) = P(–2 ≤ z ≤ –1.5) = .0668 – .0228 = .0440
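Where no printed table is at hand, the same areas can be approximated in Python. This sketch assumes a helper `phi` built on the error function `math.erf`, which is one standard way to express the area to the left of z; the names `phi` and `normal_between` are not from the slides.

```python
from math import erf, sqrt

def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_between(a, b, mu, sigma):
    """P(a <= X <= b) for a normal X with mean mu and SD sigma, via z-scores."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

print(round(phi(1.36), 4))                # 0.9131
print(round(1 - phi(1.36), 4))            # 0.0869
print(round(phi(1.36) - phi(-1.20), 4))   # 0.798

# Ground beef example: mean 1 pound, SD 0.10 pound.
p_beef = normal_between(0.80, 0.85, 1.0, 0.10)
print(round(p_beef, 4))   # 0.0441 (the table gives .0440 because each area is rounded first)
```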
Example
What is the weight of a package such that only 1% of all packages exceed this weight?
Let x0 be the required weight.
P(x > x0) = .01
P(z > (x0 – 1)/.1) = .01
From Table 3, (x0 – 1)/.1 = 2.33
x0 = 2.33(.1) + 1 = 1.233 pounds
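Working backwards from an area to a value can also be sketched with the standard library. `NormalDist.inv_cdf` returns the value below which a given proportion of the distribution lies; the variable names here are illustrative assumptions.

```python
from statistics import NormalDist

# Package weights in pounds: mean 1, SD 0.10 (from the example above).
weights = NormalDist(mu=1.0, sigma=0.10)

# Only 1% of packages exceed the cutoff, so 99% weigh less than it.
z_cutoff = NormalDist().inv_cdf(0.99)
weight_cutoff = weights.inv_cdf(0.99)

print(round(z_cutoff, 2))        # 2.33
print(round(weight_cutoff, 3))   # 1.233
```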
Exercises
• Find the probability of the following under the SND
– Above 1.96?
– Below –1.96?
– Between –1.28 and 1.28?
– Between –1.65 and 1.08? (Answer: 0.8104)
– What level cuts the upper 25%?
– What level cuts the lower 10%?
– What level cuts the middle 99%?

Exercises
1. Let X be systolic blood pressure (for US males aged 18-74, μ = 129 mmHg and σ = 19.8 mmHg).
– What levels encompass the middle 95%?
– What proportion of men in the population have SBP greater than 150 mmHg?
– What level cuts off the lower 10% of SBP?
2. The probability that a person suffering from migraine headache will obtain relief with a particular drug is 0.9. Three randomly selected migraine sufferers are given the drug.
• Find the probability that the number obtaining relief will be:
A. exactly zero
B. more than one
C. two or fewer
