Probability and Probability Distributions

PROBABILITY AND PROBABILITY DISTRIBUTIONS
Training on Statistical Inference with the Use of MS Excel
ILC UP Cebu
July 17-18, 2009
Note: This is essentially the same presentation that the Philippine Statistical Association is using in its training, where it acknowledged “Elementary Statistics: A Handbook of Slide Presentation prepared by Z.V.J. Albacea, C.E. Reano, R.V. Collado, L.N. Comia and N.A. Tandang in 2005 for the Institute of Statistics, CAS, UP Los Banos” as a primary source of the materials. However, I made revisions, additions and deletions. I take responsibility for errors and omissions that might be present.
N. T. Ison, UPVCC.
TEACHING BASIC STATISTICS …
Motivation for Studying Chance
Sample Statistic Estimates Population Parameter
e.g. Sample Mean X̄ estimates Population Mean µ
Questions:
• How do we assess the reliability of our estimate?
• What is an adequate sample size? [We would expect a large sample to give better estimates. However, large samples are more costly.]
Session 2.2
An Approach to Solve the Questions
If the sample was chosen through chance processes, we have to understand the notions of probability and sampling distribution.
Session 2.3
To introduce probability….
• Random experiment
• Sample space
• Event as a subset of the sample space
• Likelihood of an event to occur - probability of an event
Session 2.4
Features of a Random Experiment
• All outcomes are known in advance.
• The outcome of any one trial cannot be predicted with certainty.
• Trials can be repeated under identical conditions.
Session 2.5
EXAMPLES
• Rolling a die and observing the number of dots on the upturned face
• Tossing a one-peso coin and observing the upturned face
• Measuring the height of a student enrolled this term
Session 2.6
SAMPLE SPACE
• It is the set of all possible outcomes of a random experiment.
• Any performance of the experiment results in exactly one outcome in the sample space.
• It is usually denoted by S or Ω.
Session 2.7
ILLUSTRATION
Rolling a die and observing
the number of dots on the
upturned face
S = {1, 2, 3, 4, 5, 6}, where each number stands for the die face showing that many dots
Session 2.8
EVENT
• A subset of the sample space whose probability is defined
• Usually denoted by capital letters like E, A or B
• Observing an element of the subset implies the occurrence of the event, i.e., an event has occurred if the outcome of the experiment belongs to the event
• Can be classified as either a simple or a compound event
Session 2.9
ILLUSTRATION
S = {1, 2, 3, 4, 5, 6}
E1 = {1, 3, 5}: the event of observing an odd number of dots in a roll of a die
E2 = {2, 4, 6}: the event of observing an even number of dots in a roll of a die
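The idea that an event is a subset of the sample space, and that it occurs when the outcome belongs to it, can be sketched with Python sets. This is only an illustration; the helper name `occurred` is an assumption introduced here and is not part of the original slides.

```python
# Sample space and events for one roll of a die, modelled as Python sets.
S = {1, 2, 3, 4, 5, 6}
E1 = {x for x in S if x % 2 == 1}   # odd number of dots
E2 = {x for x in S if x % 2 == 0}   # even number of dots


def occurred(event, outcome):
    """An event occurs when the observed outcome belongs to it (hypothetical helper)."""
    return outcome in event


print(E1 <= S, E2 <= S)    # both events are subsets of S
print(occurred(E1, 3))     # rolling a 3 means E1 occurred
print(E1 & E2 == set())    # E1 and E2 share no outcome
```

Because E1 and E2 have no outcome in common, they cannot occur in the same roll.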
Session 2.10
Visualizing Events
• Contingency Tables
          Ace   Not Ace   Total
Black       2        24      26
Red         2        24      26
Total       4        48      52
• Tree Diagrams
Full Deck of Cards
    Red Cards
        Ace
        Not an Ace
    Black Cards
        Ace
        Not an Ace
Session 2.11
Mutually Exclusive Events
Two events are mutually exclusive if the two events cannot occur simultaneously.
Example:
Coin toss: either a head or a tail, but not both. The events head and tail are mutually exclusive.
Session 2.12
PROBABILITY
• The numerical measure of the likelihood that an event will occur
• Between 0 and 1: 0 means the event is impossible, 1 means it is certain, and 0.5 is the midpoint of the scale
Note: The sum of the probabilities of all mutually exclusive and exhaustive events in the sample space is 1
Session 2.13
Assigning Probabilities
• Subjective
A confident student views the chances of passing a course to be near 100%
• Logical
Symmetry/equally likely: coin, dice, cards etc. (A PRIORI assignment)
• Empirical
Chances of rain are 75% since it rained on 15 out of the past 20 days (A POSTERIORI)
Session 2.14
Computing Probability: A Priori
The probability of an event E can be computed as the sum of the probabilities of all the outcomes found in the event E, that is,
P[E] = sum of P{e}
where e is an element of event E.
Example: If we select a card at random from a well-shuffled deck of 52 cards, then
P(getting an ace) = P(ace of spades) + P(ace of clubs) + P(ace of diamonds) + P(ace of hearts)
= 1/52 + 1/52 + 1/52 + 1/52 = 4/52
Session 2.15
Computing Probability: A Priori
If all possible outcomes can be listed and are equally likely to occur, we can compute the probability of an event E as:
P(E) = (Number of Event Outcomes) / (Total Outcomes)
Example: If we select a card at random from a well-shuffled deck of cards, then
P(ace in a deck of cards) = 4/52
since there are 4 aces in a deck of 52 cards.
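The counting rule for equally likely outcomes can be sketched as follows. The card labels and the helper `prob` are illustrative assumptions, not part of the original material; the sketch simply counts favourable outcomes in a 52-card deck.

```python
from fractions import Fraction
from itertools import product

# A standard 52-card deck as (rank, suit) pairs; the labels are arbitrary.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "clubs", "diamonds", "hearts"]
deck = list(product(ranks, suits))


def prob(event, space):
    """P(E) = number of outcomes in E / total outcomes (equally likely case only)."""
    favourable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favourable), len(space))


p_ace = prob(lambda card: card[0] == "A", deck)
print(p_ace)  # Fraction reduces 4/52 to 1/13
```

Using `Fraction` keeps the arithmetic exact, so 4/52 is reported in lowest terms as 1/13.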
Session 2.16
Computing Probability: A Priori
If the outcomes have different likelihoods of occurring, then the probability of an event E has to be computed as the sum of the probabilities of the outcomes found in the event E.
Session 2.17
Computing Joint Probability
The probability of a joint event, A and B:
P(A and B) = P(A ∩ B)
= sum of P{e}
where e are the outcomes in both A and B
E.g. P(Red Card and Ace) = P(ace of diamonds) + P(ace of hearts)
= 1/52 + 1/52 = 1/26
Session 2.18
Computing Joint Probability

If the sample space contains equiprobable outcomes, then the probability of a joint event, A and B:
P(A and B) = P(A ∩ B)
= (number of outcomes in both A and B) / (total number of possible outcomes in the sample space)
E.g. P(Red Card and Ace) = (2 Red Aces) / (52 Total Number of Cards) = 1/26
Session 2.19
ILLUSTRATION
S = {1, 2, 3, 4, 5, 6}
• Assume that each of the outcomes 1, 2 and 3 has probability 1/12, while each of the outcomes 4, 5 and 6 has probability 1/4.
• The probability of the event of observing an odd number of dots in a roll of the die is
P[E1] = P{1} + P{3} + P{5}
= 1/12 + 1/12 + 1/4 = 5/12.
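When outcomes are not equally likely, the event probability is the sum of the outcome probabilities. A minimal sketch using the loaded-die probabilities from the illustration above (the dictionary `P` and helper `prob` are names chosen here for illustration):

```python
from fractions import Fraction

# Outcome probabilities for the loaded die in the illustration.
P = {
    1: Fraction(1, 12), 2: Fraction(1, 12), 3: Fraction(1, 12),
    4: Fraction(1, 4), 5: Fraction(1, 4), 6: Fraction(1, 4),
}


def prob(event):
    """P[E] = sum of P{e} over the outcomes e in E."""
    return sum((P[e] for e in event), Fraction(0))


total = prob(P.keys())   # a valid assignment sums to 1
p_odd = prob({1, 3, 5})
print(total, p_odd)      # 1 5/12
```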
Session 2.20
A POSTERIORI APPROACH
• The random experiment has to be performed (under uniform conditions, a large number of times) and the event of interest is observed.
• The probability of the event is the limiting value of the relative frequency of the occurrence of the event if the experiment is endlessly repeated.
Session 2.21
ILLUSTRATION
• Suppose the experiment was performed 100 times and it was observed that an odd number of dots occurred 60 times and an even number of dots occurred 40 times.
• The (empirical) probability of the event of observing an odd number of dots in a roll of a die is the relative frequency of the event, or P[E1] = 60/100 = 0.6
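The relative-frequency idea can be explored with a small simulation. This sketch assumes a fair die simulated with Python's `random` module, which is an assumption made here; the slide's 60-out-of-100 figure is an observed result, not something the code reproduces. The function name `relative_frequency` is also illustrative.

```python
import random


def relative_frequency(event, n_trials, seed=0):
    """Fraction of n_trials simulated fair-die rolls whose outcome lies in event."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) in event)
    return hits / n_trials


odd = {1, 3, 5}
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(odd, n))
```

With few trials the relative frequency can be noticeably off from 0.5; as the number of trials grows it tends to settle near the a priori value for a fair die.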
Session 2.22
Rules on Probability
• Property 1. The probability of an event E is a number between 0 and 1 inclusive, with P(Ω) = 1 while P(∅) = 0.
• Property 2. The sum of the probabilities of a set of mutually exclusive and exhaustive events (all events in the sample space) is 1. (n events are mutually exclusive if no pair of events among the n can occur simultaneously.)
Session 2.23
Rules on Probability
• Property 3. Addition Rule
P(A or B) = P(A) + P(B) - P(A and B)
[Figure: Venn diagram of two overlapping events A and B]
Session 2.24
Computing Probability
• P(King or Spade)
= P(King) + P(Spade) - P(King and Spade)
= 4/52 + 13/52 - 1/52 = 16/52 = 4/13
• P(King or Queen) = P(King) + P(Queen)
= 4/52 + 4/52 = 8/52 = 2/13
since King and Queen are mutually exclusive, so P(King and Queen) = 0
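The addition rule can be checked by counting cards directly. In this sketch, events are sets of cards and `P` is a hypothetical helper for the equally likely case; the card labels are arbitrary.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "clubs", "diamonds", "hearts"]
deck = set(product(ranks, suits))


def P(event):
    """Probability of an event (a set of cards) when all 52 cards are equally likely."""
    return Fraction(len(event), len(deck))


kings = {c for c in deck if c[0] == "K"}
queens = {c for c in deck if c[0] == "Q"}
spades = {c for c in deck if c[1] == "spades"}

# P(A or B) counted directly, and via P(A) + P(B) - P(A and B)
direct = P(kings | spades)
by_rule = P(kings) + P(spades) - P(kings & spades)
print(direct, by_rule)        # 4/13 4/13
print(P(kings | queens))      # 2/13, since kings & queens is empty
```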
Session 2.25
Marginal Probability
A Deck of 52 Cards
                Color
Type        Red   Black   Total
Ace           2       2       4
Non-Ace      24      24      48
Total        26      26      52
P(Ace) = 4/52
Session 2.26
Definition of Conditional Probability
The conditional probability of event B given that event A has already occurred, denoted by P(B|A), is defined as:
P(B|A) = P(A ∩ B) / P(A)
if P(A) > 0. Otherwise, it is undefined.
Session 2.27
Conditional Probability
A Deck of 52 Cards
                Color
Type        Red   Black   Total
Ace           2       2       4
Non-Ace      24      24      48
Total        26      26      52
P(Ace | Red) = P(Ace and Red) / P(Red) = (2/52) / (26/52) = 2/26
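Marginal, joint and conditional probabilities can all be read off the contingency table. The sketch below stores the four cell counts in a dictionary; the function names are illustrative assumptions rather than anything defined in the slides.

```python
from fractions import Fraction

# Cell counts from the card contingency table: (type, color) -> count
counts = {
    ("Ace", "Red"): 2, ("Ace", "Black"): 2,
    ("Non-Ace", "Red"): 24, ("Non-Ace", "Black"): 24,
}
total = sum(counts.values())


def joint(card_type, color):
    return Fraction(counts[(card_type, color)], total)


def marginal_type(card_type):
    return Fraction(sum(n for (t, _), n in counts.items() if t == card_type), total)


def marginal_color(color):
    return Fraction(sum(n for (_, c), n in counts.items() if c == color), total)


def conditional(card_type, color):
    """P(type | color) = P(type and color) / P(color); undefined when P(color) = 0."""
    p_color = marginal_color(color)
    if p_color == 0:
        raise ValueError("conditioning event has probability 0")
    return joint(card_type, color) / p_color


print(marginal_type("Ace"))        # 1/13, i.e. 4/52
print(conditional("Ace", "Red"))   # 1/13, i.e. 2/26
```

Here P(Ace | Red) equals P(Ace), which anticipates the notion of independence.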
Session 2.28
Definition of Independent Events
The events A and B are independent if and only if:
P(A ∩ B) = P(A) P(B)
This condition is equivalent to saying that P(A|B) = P(A), or that P(B|A) = P(B), whenever these conditional probabilities are defined.
Session 2.29
Examples:*
1. Consider the following events in the toss of a single die:
A: Observe an odd number
B: Observe an even number
Are A and B independent events?
2. The probability that Robert will correctly answer the toughest question in an exam is 1/4. The probability that Ana will correctly answer the same question is 4/5. Find the probability that both will answer the question correctly, assuming that they do not copy from each other.
*from Stat101 Manual, UP Stat, Diliman
Session 2.30
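One way to check the two examples numerically is sketched below. It assumes a fair die for the first example, and it reads "they do not copy from each other" as meaning that the two answers are independent; both are assumptions made for this sketch.

```python
from fractions import Fraction

# Example 1: fair die (assumed), A = odd, B = even
S = {1, 2, 3, 4, 5, 6}


def P(event):
    return Fraction(len(event), len(S))


A = {1, 3, 5}
B = {2, 4, 6}
independent = P(A & B) == P(A) * P(B)
print(P(A & B), P(A) * P(B), independent)   # 0 1/4 False

# Example 2: independence assumed, so P(both correct) is the product
p_robert = Fraction(1, 4)
p_ana = Fraction(4, 5)
p_both = p_robert * p_ana
print(p_both)                                # 1/5
```

In the first example A and B are mutually exclusive with positive probabilities, so they cannot be independent: knowing that A occurred rules out B.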

Random Variable
• A rule or a function that maps each element of the sample space of an experiment to one and only one real number
• Each value that a random variable can take has a probability associated with it.
• The set of all the values that the random variable can take, together with the corresponding probabilities, defines the probability distribution of the random variable.
Session 2.31
Random Variable
• A random variable defined on a sample space that is countable* is a discrete random variable, e.g., no. of persons in a household, no. in favor of a proposition.
• A random variable that can take all possible values within a range is a continuous random variable, e.g., weight, income, temperature, crop yield.
*Countable – the set is finite or can be put in one-to-one correspondence with the set of natural numbers = {0, 1, 2, …}
Session 2.32
Discrete Random Variable - illustration
Rolling two dice and observing the number of dots on the upturned faces.
S = { (1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
      (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
      (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
      (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
      (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
      (6,1), (6,2), (6,3), (6,4), (6,5), (6,6) }
Session 2.33

Discrete Random Variable - illustration
We define a random variable X as the total number of dots on the upturned faces.
Sample Points                                 x
(1,1)                                         2
(1,2), (2,1)                                  3
(1,3), (2,2), (3,1)                           4
(1,4), (2,3), (3,2), (4,1)                    5
(1,5), (2,4), (3,3), (4,2), (5,1)             6
(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)      7
(2,6), (3,5), (4,4), (5,3), (6,2)             8
(3,6), (4,5), (5,4), (6,3)                    9
(4,6), (5,5), (6,4)                           10
(5,6), (6,5)                                  11
(6,6)                                         12
Session 2.34

Discrete Random Variable - illustration
• The random variable X takes on the values 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12.
• Some of the values have more corresponding elements in the sample space than others. For example, X=2 corresponds to only one outcome, (1,1), while X=3 corresponds to 2 outcomes, (1,2) and (2,1).
• The probability that the (discrete) random variable will take a particular value is equal to the sum of the probabilities of the outcomes in the sample space corresponding to that value.
Session 2.35

Discrete Random Variable - illustration
The probability that the random variable X will take the value 4 is equal to the sum of the probabilities of the corresponding outcomes. Thus
P(X=4) = P{(1,3)} + P{(2,2)} + P{(3,1)}
For example, if the dice are fair, each outcome in the sample space has probability 1/36. Thus,
P(X=4) = 1/36 + 1/36 + 1/36
= 3/36 or 1/12.
Session 2.36
Discrete Random Variable - PROBABILITY DISTRIBUTION
The probability distribution of a discrete random variable is a table or a function that presents all the possible values of the random variable and their corresponding probabilities.
Session 2.37

Discrete Random Variable - PROBABILITY DISTRIBUTION
The probability distribution of the random variable X, defined as the total number of dots on the upturned faces in a roll of two fair dice, is presented as a table below:
x         2     3     4     5     6     7     8     9     10    11    12
P[X=x]   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
[Figure: bar chart of P[X=x] (vertical axis, 0.00 to 0.20) against X = Total Number of Dots on the Upturned Faces (2 to 12)]
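The distribution of the two-dice total can be rebuilt by enumerating the 36 outcomes and counting how many map to each value of X. This is a sketch assuming fair dice, so each outcome has probability 1/36; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two dice
outcomes = list(product(range(1, 7), repeat=2))

# X maps each outcome (a, b) to the total a + b; count outcomes per value
counts = Counter(a + b for a, b in outcomes)

# Equally likely outcomes: P[X = x] = (number of outcomes with total x) / 36
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
```

`Fraction` prints probabilities in lowest terms, so 2/36 appears as 1/18 and 6/36 as 1/6.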
Session 2.38
MEAN OF A DISCRETE RANDOM VARIABLE
• If X is a discrete random variable with probability distribution P(X=x)
x :       x1        x2        …   xn
P(X=x) :  P(X=x1)   P(X=x2)   …   P(X=xn)
then the expected value of X, also referred to as the mean of X, is:
E(X) = µ = x1P(X=x1) + x2P(X=x2) + … + xnP(X=xn)
• In interpreting the mean of X, the collection of data points that we are summarizing is now an infinite collection containing all of the realized values of X if we were to repeat the random experiment over and over again. Thus, the mean of X can be interpreted as the average value generated by continually repeating the random experiment.
Session 2.39
VARIANCE OF A DISCRETE RANDOM VARIABLE
• If X is a discrete random variable with probability distribution P(X=x)
x :       x1        x2        …   xn
P(X=x) :  P(X=x1)   P(X=x2)   …   P(X=xn)
then the variance of X is:
Var(X) = E[(X - µ)²]
= (x1 - µ)²P(X=x1) + (x2 - µ)²P(X=x2) + … + (xn - µ)²P(X=xn)
• The variance of X is a measure of dispersion. It is the average squared deviation between the realized value of X and µ. It tends to have a larger value if the values of X are likely to be far from the mean (the center of the distribution) than if the values are concentrated about the mean. If there is no variation in the values generated by X, then Var(X) will be 0.
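The formulas for E(X) and Var(X) can be applied to the two-dice total as a worked check. The closed form `6 - |x - 7|` for the numerator of P[X = x] is a shortcut used in this sketch to reproduce the table for two fair dice; the function names are illustrative.

```python
from fractions import Fraction

# P[X = x] for the total of two fair dice: 1/36, 2/36, ..., 6/36, ..., 1/36
pmf = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}


def mean(pmf):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = sum of (x - mu)^2 * P(X = x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


mu = mean(pmf)
var = variance(pmf)
print(mu, var)                         # 7 35/6

# A variable that always takes the same value has variance 0
print(variance({3: Fraction(1)}))      # 0
```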
Session 2.40

Discrete Random Variable - PROBABILITY DISTRIBUTION
Probability Distributions of a Discrete Random Variable:
e.g. Bernoulli, Binomial, Geometric, Hypergeometric, Negative Binomial
Session 2.41
Bernoulli Probability Distribution
• Named after Bernoulli
• Discrete random variable with only two possible values: 0 and 1
• The value 1 represents success while the value 0 represents failure
• The parameter p is the probability of success.
Session 2.42
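Bernoulli Probability Distribution
• Its probability distribution function is given by:
$$P(X = x) = \begin{cases} p, & x = 1 \\ 1 - p, & x = 0 \\ 0, & \text{otherwise} \end{cases}$$
• Graphically, the distribution is illustrated as follows:
[Figure: bar chart with a bar of height 1-p at x = 0 and a bar of height p at x = 1]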
Session 2.43
Binomial Probability Distribution
• Composed of n independent Bernoulli trials
• The parameter p is the probability of success, which remains constant from one trial to another
• The binomial random variable is defined as the number of successes out of n trials
• Possible values: 0, 1, 2, …, n
Session 2.44
Binomial Probability Distribution
• Its probability distribution function is given by:
$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$$
and the function is 0 elsewhere.
• Graphically, the distribution is illustrated as follows:
[Figure: bar chart of P(X = x) over x = 0, 1, 2, …, n]
Session 2.45
Illustration: Binomial Distribution*
The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known to have contracted this disease, what is the probability that exactly 5 survive?
Solution: Let X be the number of people that survive.
$$P(X = 5) = b(5; 15, 0.4) = \binom{15}{5} (0.4)^{5} (1 - 0.4)^{10} = 0.1859$$
*from Walpole, Introduction to Statistics
Session 2.46
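The binomial probability in this illustration can be reproduced with a few lines of Python. The function name `binomial_pmf` is an assumption for this sketch; `math.comb` supplies the binomial coefficient.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p; 0 outside 0..n."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p5 = binomial_pmf(5, 15, 0.4)
print(round(p5, 4))   # 0.1859

# The probabilities over x = 0, 1, ..., n add up to 1 (up to rounding error)
print(sum(binomial_pmf(x, 15, 0.4) for x in range(16)))
```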

Continuous Random Variable - PROBABILITY DISTRIBUTION
• The density function of a continuous random variable X is a curve or a function f such that P(a ≤ X ≤ b) is the area bounded by the curve f(x), the x-axis and the lines x=a and x=b.
• The density function and the range of values that the random variable can take define the probability distribution of a continuous random variable.
Session 2.47
Continuous Random Variable - PROBABILITY DISTRIBUTION
Probability Distributions of a Continuous Random Variable:
e.g. Normal, Exponential, Gamma, Beta, Uniform
Session 2.48
Normal Probability Distribution
• ‘Bell-Shaped’
• Symmetric
• Range of possible values is infinite in both directions.
[Figure: bell curve of f(X) against X, with the Mean, Median and Mode marked at the same central point]
Session 2.49
The Mathematical Model
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}$$
f(x): density function of random variable X
π ≈ 3.14159; e ≈ 2.71828
µ: population mean
σ: population standard deviation
Session 2.50
THE NORMAL CURVE
Two normal distributions with the same mean but different variances.
[Figure: density curves of N(5,4) and N(5,9), plotted for X from -15 to 20; both are centered at 5 and the N(5,9) curve is lower and wider]
Session 2.51
THE NORMAL CURVE
Two normal distributions with different means but equal variances.
[Figure: density curves of N(5,4) and N(10,4), plotted for X from -5 to 20; the curves have the same shape and are centered at 5 and 10]
Session 2.52
Many Normal Distributions
There are an infinite number of normal curves.
By varying the parameters µ and σ, we obtain different normal distributions.
Session 2.53
Normal Distribution Properties
For a normal curve, the area within:
a) one standard deviation from the mean is about 68%;
b) two standard deviations from the mean is about 95%; and
c) three standard deviations from the mean is about 99.7%.
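These three areas can be checked numerically. For a normal curve, the area within k standard deviations of the mean equals erf(k/√2), a standard identity; the helper name `area_within` is an assumption for this sketch.

```python
from math import erf, sqrt


def area_within(k):
    """Area under any normal curve within k standard deviations of its mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))   # 0.6827, 0.9545, 0.9973
```

The result does not depend on the mean or the standard deviation, which is why the rule holds for every normal curve.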
Session 2.54
Areas under Normal Distributions
Probability is the area under the curve!
P(c ≤ X ≤ d) = ?
[Figure: normal curve f(X) against X with the area between c and d shaded]
Session 2.55
Which Normal Distribution???
The normal distribution is a family of distributions, each member defined by the values of µ and σ.
Infinitely Many Normal Distributions
Session 2.56
Standard Normal Distribution
Since there are many normal curves, it is often important to standardize and refer to a STANDARD NORMAL DISTRIBUTION (or curve), where the mean µ = 0 and the standard deviation σ = 1.
Session 2.57
THE Z-TABLE
This table summarizes the cumulative probability distribution for Z (i.e. P[Z ≤ z])
Examples:
1. P[Z ≤ 0] = 0.5
2. P[Z ≤ 1.25] = 0.8944
3. P[Z ≤ 1.96] = 0.9750
[Figure: standard normal curve with the area P[Z ≤ z] to the left of z shaded]
Session 2.58
Standardizing Example
Z = (X - µ) / σ = (6.2 - 5) / 10 = 0.12
[Figure: left panel, Normal Distribution with µ = 5 and σ = 10, with X = 6.2 marked; right panel, Standard Normal Distribution with µZ = 0 and σZ = 1, with Z = 0.12 marked]
Session 2.59
Solution: The Cumulative Standardized Normal Curve
Cumulative Standard Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
For Z = 0.12 (row 0.1, column .02), the cumulative probability P[Z ≤ 0.12] is .5478.
[Figure: standard normal curve (µZ = 0, σZ = 1) with the area to the left of Z = 0.12 shaded; shaded area exaggerated]
Only One Table is Needed
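The standardizing step and the table lookup can be mirrored in Python with `statistics.NormalDist` (available from Python 3.8). The helper `standardize` is an assumption for this sketch; the values µ = 5, σ = 10 and X = 6.2 come from the example.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma


z = standardize(6.2, 5, 10)
p = NormalDist().cdf(z)              # cumulative standard normal, P[Z <= z]
print(round(z, 2), round(p, 4))      # 0.12 0.5478

# The same probability without standardizing first
same = NormalDist(mu=5, sigma=10).cdf(6.2)
print(round(same, 4))                # 0.5478
```

Both routes give the same answer, which is why a single table for the standard normal curve is enough.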
Session 2.60
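Example: P(2.9 ≤ X ≤ 7.1) = .1664
Z = (X - µ) / σ = (2.9 - 5) / 10 = -.21
Z = (X - µ) / σ = (7.1 - 5) / 10 = .21
[Figure: left panel, Normal Distribution with µ = 5 and σ = 10, with the area between X = 2.9 and X = 7.1 shaded; right panel, Standardized Normal Curve with µZ = 0 and σZ = 1, with areas of .0832 between Z = -0.21 and 0 and between 0 and Z = 0.21]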
Session 2.61
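Example: P(2.9 ≤ X ≤ 7.1) = .1664 (continued)
Cumulative Standard Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
For Z = 0.21 (row 0.2, column .01), the cumulative probability is .5832.
[Figure: standard normal curve (µZ = 0, σZ = 1) with the area to the left of Z = 0.21 shaded; shaded area exaggerated]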
Session 2.62
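Example: P(2.9 ≤ X ≤ 7.1) = .1664 (continued)
Cumulative Standard Normal Distribution Table (Portion)
Z       .00     .01     .02
-0.3   .3821   .3783   .3745
-0.2   .4207   .4168   .4129
-0.1   .4602   .4562   .4522
0.0    .5000   .4960   .4920
For Z = -0.21 (row -0.2, column .01), the cumulative probability is .4168.
Hence P(2.9 ≤ X ≤ 7.1) = P(-0.21 ≤ Z ≤ 0.21) = .5832 - .4168 = .1664.
[Figure: standard normal curve (µZ = 0, σZ = 1) with the area to the left of Z = -0.21 shaded; shaded area exaggerated]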
Session 2.63
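Example: P(X ≥ 8) = .3821
Z = (X - µ) / σ = (8 - 5) / 10 = .30
[Figure: left panel, Normal Distribution with µ = 5 and σ = 10, with the area to the right of X = 8 shaded; right panel, Standard Normal Distribution with µZ = 0 and σZ = 1, with the area of .3821 to the right of Z = 0.30 shaded]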
Session 2.64
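Example: P(X ≥ 8) = .3821 (continued)
Cumulative Standard Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
For Z = 0.30 (row 0.3, column .00), the cumulative probability is .6179, so P(X ≥ 8) = 1 - .6179 = .3821.
[Figure: standard normal curve (µZ = 0, σZ = 1) with the area to the left of Z = 0.30 shaded; shaded area exaggerated]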
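The interval and upper-tail examples for the normal distribution with µ = 5 and σ = 10 can be checked with `statistics.NormalDist`. This is a sketch, not a replacement for the table method. Note that the interval probability computed at full precision rounds to .1663, while subtracting the rounded four-decimal table entries gives .1664.

```python
from statistics import NormalDist

X = NormalDist(mu=5, sigma=10)

# P(2.9 <= X <= 7.1) = P(X <= 7.1) - P(X <= 2.9)
p_between = X.cdf(7.1) - X.cdf(2.9)

# P(X >= 8) = 1 - P(X <= 8)
p_at_least_8 = 1 - X.cdf(8)

print(round(p_between, 4))      # close to the table-based .1664
print(round(p_at_least_8, 4))   # 0.3821
```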
Session 2.65
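Finding Z Values for Known Probabilities
What is Z given that the area between 0 and Z is 0.1217?
Cumulative Standard Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
The cumulative probability up to Z is .5000 + .1217 = .6217, which is found in row 0.3, column .01, so Z = .31.
[Figure: standard normal curve (µZ = 0, σZ = 1) with the area of .1217 between 0 and Z = .31 shaded; shaded area exaggerated]
Session 2.66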
Example
Suppose that women’s heights can be modeled by a normal curve with a mean of 1620 mm and a standard deviation of 50 mm. What height corresponds to the 10th percentile?
Solution: The 10th percentile of the height distribution may be obtained by first finding the 10th percentile of the standard normal curve, which can be read off as -1.282. This means that the 10th percentile of the height distribution is 1.282 standard deviations below the mean.
Session 2.67
Example - continuation
From Z = (X - µ) / σ:
-1.282 = (X - 1620) / 50
Therefore the height corresponding to the 10th percentile is
X = 1620 – 1.282(50) = 1555.9 mm
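The percentile calculation can be reproduced with the inverse cumulative distribution function, `NormalDist.inv_cdf`, from Python's `statistics` module. This is a sketch under the example's assumptions (mean 1620 mm, standard deviation 50 mm).

```python
from statistics import NormalDist

# 10th percentile of the standard normal curve
z10 = NormalDist().inv_cdf(0.10)

# 10th percentile of the height distribution, computed directly
heights = NormalDist(mu=1620, sigma=50)
x10 = heights.inv_cdf(0.10)

print(round(z10, 3), round(x10, 1))   # -1.282 1555.9

# Same value via X = mu + Z * sigma
print(round(1620 + z10 * 50, 1))      # 1555.9
```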
Session 2.68
RULES IN COMPUTING PROBABILITIES
P[Z = a] = 0
P[Z ≤ a] can be obtained directly from the Z-table
P[Z ≥ a] = 1 – P[Z ≤ a]
P[Z ≥ -a] = P[Z ≤ +a]
P[Z ≤ -a] = P[Z ≥ +a]
P[a1 ≤ Z ≤ a2] = P[Z ≤ a2] – P[Z ≤ a1]
Session 2.69
