
ECON 1005 Lectures

Probability Distributions

1
Introduction
• In the last lecture, we assigned probabilities to
events.

• The social scientist seldom works with events
arising from an experiment.

• Instead, focus is placed on the random
variable that arises out of the experiment.
2
A Game
• Suppose that you are a guest at the party of
some wealthy man, and the following game is
played: you are invited to throw two dice, and
you are paid the score of the two dice.

• Activity: Define the sample space for this
experiment

3
(1, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1)
(1, 2) (2, 2) (3, 2) (4, 2) (5, 2) (6, 2)
(1, 3) (2, 3) (3, 3) (4, 3) (5, 3) (6, 3)
(1, 4) (2, 4) (3, 4) (4, 4) (5, 4) (6, 4)
(1, 5) (2, 5) (3, 5) (4, 5) (5, 5) (6, 5)
(1, 6) (2, 6) (3, 6) (4, 6) (5, 6) (6, 6)
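
The grid above can be generated rather than written out by hand. The following is a minimal Python sketch; the names `faces` and `sample_space` are our own labels and do not come from the lecture.

```python
from itertools import product

# Each outcome is an ordered pair (first die, second die).
faces = range(1, 7)
sample_space = list(product(faces, repeat=2))

print(len(sample_space))  # number of outcomes in the sample space
print(sample_space[:6])   # the first few outcomes
```

Treating the pairs as ordered is what makes all outcomes equally likely, which the later probability calculations rely on.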
4
The Payouts of this Game
• You are paid the sum of the two dice:
– (2, 1) = $3
– (4, 2) = $6
– and so on

• What is the smallest possible payout?

• What is the largest possible payout?


5
• For each outcome in the sample space, we can
associate a (real) dollar value equal to the
player’s winnings if this possible outcome
were to materialize.

• Let us denote the value of the winnings in any
game by X.

• What are the possible values of X?

6
(1, 1)  (2, 1)  (3, 1)  (4, 1)  (5, 1)  (6, 1)
X = 2   X = 3   X = 4   X = 5   X = 6   X = 7
(1, 2)  (2, 2)  (3, 2)  (4, 2)  (5, 2)  (6, 2)
X = 3   X = 4   X = 5   X = 6   X = 7   X = 8
(1, 3)  (2, 3)  (3, 3)  (4, 3)  (5, 3)  (6, 3)
X = 4   X = 5   X = 6   X = 7   X = 8   X = 9
(1, 4)  (2, 4)  (3, 4)  (4, 4)  (5, 4)  (6, 4)
X = 5   X = 6   X = 7   X = 8   X = 9   X = 10
(1, 5)  (2, 5)  (3, 5)  (4, 5)  (5, 5)  (6, 5)
X = 6   X = 7   X = 8   X = 9   X = 10  X = 11
(1, 6)  (2, 6)  (3, 6)  (4, 6)  (5, 6)  (6, 6)
X = 7   X = 8   X = 9   X = 10  X = 11  X = 12
7
Random Variables

• In general, it is usually possible to convert points in
the sample space into real values.

• The precise values cannot be known in advance
since they depend on outcomes which are
themselves random.

• It is for this reason that variables like X are called
random variables.

8
Random Variables
• In more formal terms, a random variable is defined to
be a function that “...assigns exactly one value to each
point in a sample space for an experiment”

• Above, we implicitly defined our random variable to be
the amount of money we receive in the game.

• The nature of the assignment gives the random
variable its name. In our experiment above, X is the
prize money from the game of tossing two ‘fair’ dice.

• Random Variables form the core of Inferential
Statistics.

9
Random Variables

Recall that there are two types of data:


• Discrete data
• Continuous data

Similarly there are two types of random variables:


– Discrete random variables
– Continuous random variables

10
Discrete v Continuous Random Variables
• In the above example, our random variable X could
assume any one of eleven possible values.
• This is an example of what is known as a discrete
random variable.
• A discrete random variable is one which can take
precise values on the real line. It assumes
countable values, where the consecutive values are
separated by a certain gap.
• Conversely, a continuous random variable is one
which can assume any value contained in one or
more intervals on the real number line.
11
Class Activity - Indicate which of the following random
variables are discrete and which are continuous.

• The number of new accounts opened at a bank during
a certain month

• The time taken to run a marathon

• The price of a concert ticket

• The runs scored in a cricket game

• The age of a house


12
Back to Our Example:
Some Probability Questions
• In this example, we are not particularly interested in
the probability of the individual outcomes (which are
all equal to 1/36)

• We are instead interested in the probability of the
different values obtained by summing the numbers
on the two dice which determine how much we are
to be paid.

• Suppose, for instance, I were to ask: what is the
probability of a player in this game winning $5.00?
13
(1, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1)
(1, 2) (2, 2) (3, 2) (4, 2) (5, 2) (6, 2)
(1, 3) (2, 3) (3, 3) (4, 3) (5, 3) (6, 3)
(1, 4) (2, 4) (3, 4) (4, 4) (5, 4) (6, 4)
(1, 5) (2, 5) (3, 5) (4, 5) (5, 5) (6, 5)
(1, 6) (2, 6) (3, 6) (4, 6) (5, 6) (6, 6)
14
• How must the dice roll for the player to win $5?

• This happens with outcomes (1, 4), (2, 3), (3, 2)
and (4, 1)

• What is the probability of getting one of these
outcomes?
No. of favourable outcomes / Total no. of all possible outcomes = 4/36

• Therefore the probability of winning $5 is 4/36,
or we can say P(X = 5) = 4/36
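
The counting argument above can be checked mechanically. This is a small Python sketch under the lecture's assumption of two fair dice; the variable names are ours.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

# Favourable outcomes are those whose two faces add up to 5.
favourable = [o for o in outcomes if sum(o) == 5]
p_five = Fraction(len(favourable), len(outcomes))

print(favourable)
print(p_five)  # Fraction reduces 4/36 to lowest terms
```

`Fraction` keeps the arithmetic exact, so the result can be compared directly with 4/36.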
16
Activity
• What is the probability of the player winning
$2? In other words, what is the probability
that our random variable X, defined as the
value of the winnings, takes a value of 2?

• What is the probability of the player winning
$12? In other words, Pr(X = 12) = ?

17
(1, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1)
(1, 2) (2, 2) (3, 2) (4, 2) (5, 2) (6, 2)
(1, 3) (2, 3) (3, 3) (4, 3) (5, 3) (6, 3)
(1, 4) (2, 4) (3, 4) (4, 4) (5, 4) (6, 4)
(1, 5) (2, 5) (3, 5) (4, 5) (5, 5) (6, 5)
(1, 6) (2, 6) (3, 6) (4, 6) (5, 6) (6, 6)

P(X = 2) = 1 / 36
19
(1, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1)
(1, 2) (2, 2) (3, 2) (4, 2) (5, 2) (6, 2)
(1, 3) (2, 3) (3, 3) (4, 3) (5, 3) (6, 3)
(1, 4) (2, 4) (3, 4) (4, 4) (5, 4) (6, 4)
(1, 5) (2, 5) (3, 5) (4, 5) (5, 5) (6, 5)
(1, 6) (2, 6) (3, 6) (4, 6) (5, 6) (6, 6)

P(X = 12) = 1 / 36
21
• So, we define the random variable X as the
winnings from this game
• We have worked out that X can take values of
2 to 12
• We can also work out the probability that X
takes any of these values
• Let us represent these in a table.

x          2     3     4     5     6     7     8     9     10    11    12
Pr(X = x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
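
The whole table can be built by counting how many of the 36 outcomes give each sum. The sketch below is illustrative only; `counts` and `distribution` are names we introduce here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many outcomes produce each value of X (the sum of the dice).
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
distribution = {x: Fraction(n, 36) for x, n in sorted(counts.items())}

for x in distribution:
    print(x, f"{counts[x]}/36")  # printed unreduced, as in the table
```

The probabilities are stored as exact fractions, so checking that they sum to 1 involves no rounding.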

22
Probability Distributions
• A probability distribution links a random variable
X with the probability that X assumes a discrete
value or a range of values

• This can be presented by a table, function or
formula

• Random variables can be discrete or continuous

• Probability distributions are also correspondingly
discrete or continuous
23
Discrete v Continuous Probability Distributions
• Every random variable has associated with it a
probability distribution.
• As you may imagine, discrete random variables
possess discrete probability distributions, and
continuous random variables possess continuous
probability distributions.
• As both types of random variables can be
distinguished by their characteristics, so too can
both types of probability distributions be
distinguished.
24
Properties of
Discrete Probability Distributions
• There are two characteristics that the
probability distribution of a discrete random
variable must possess

• For any value x on the real line:
(1) \( \Pr(X = x) \ge 0 \)
(2) \( \sum_{x} \Pr(X = x) = 1 \), where the sum runs over all values x that X can take

25
Properties of
Continuous Probability Distributions
• Let f(x) be the probability distribution of a
continuous random variable.

• Then
(1) \( f(x) \ge 0 \) for each value of x
(2) \( \int_{-\infty}^{+\infty} f(x)\,dx = 1 \)

• The second criterion means that the area
under the curve of f(x), the probability density
function, is equal to 1.
26
Probability Distribution for a Discrete Random
Variable
• Such a table presents all values taken on by
the random variable and their corresponding
probabilities.

• Furthermore, the probabilities sum to one.

• Such a table is called a probability distribution
for our random variable X.
27
Another Example: A Coin Game
• Consider the experiment of tossing a ‘fair’
coin twice.

• Suppose that this experiment forms part of a
game during a lime among some friends.

• The rules of the game specify that each
outcome of a ‘Head’ entitles the player to a
$2.00 prize while each outcome of a ‘Tail’
results in the player paying out $2.00.
28
Our Coin Game
• Each time that the experiment is repeated the
player stands to get net prize money of $4.00
or $0.00 or pay out $4.00.

• We may not, however, know the precise prize
money in advance as these moneys all depend
on outcomes that are also random.

29
Activity
• Define the random variable X as the winnings
of the game

• Create the probability distribution for X

• In other words, create a table where you
enumerate all possible values of X, and assign
probabilities to those values

30
Activity
• What is the sample space for this experiment?
{HH, HT, TH, TT}
• Obtaining H wins the player $2, obtaining T loses the player $2
• Therefore, we can assign net prize money to each outcome in
the sample space.

• HH: $2.00 + $2.00 = $4.00
• HT: $2.00 - $2.00 = $0.00
• TH: -$2.00 + $2.00 = $0.00
• TT: -$2.00 - $2.00 = -$4.00
31
Activity
• What are all possible values of X? -$4, $0, $4

• What are the probabilities assigned to each of these
values?

• Consider P(X = 4). This happens when the player obtains
(H,H)

• What is the probability that this occurs? ¼

• Work out P(X = -4)

• Work out P(X = 0)
32


Activity
• Consider P(X = -4)
• This happens when the player gets (T,T)
• The probability that this occurs is ¼

• Consider P(X = 0)
• This happens in two ways: when the player gets
(H,T) or (T,H)
• Therefore, the probability that this occurs is
2/4

33
Probability Distribution
Net Prize X    Outcome(s)    Probability
-$4            TT            0.25
$0             HT, TH        0.25 + 0.25 = 0.50
$4             HH            0.25
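
The same table can be derived by enumerating the four equally likely outcomes and adding up the payoffs. This is a sketch only; `PAYOFF` and `distribution` are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

PAYOFF = {"H": 2, "T": -2}  # dollars won (or lost) per toss

# Each of the four outcomes HH, HT, TH, TT has probability 1/4.
distribution = defaultdict(Fraction)
for tosses in product("HT", repeat=2):
    net_prize = sum(PAYOFF[t] for t in tosses)
    distribution[net_prize] += Fraction(1, 4)

for x in sorted(distribution):
    print(x, distribution[x])
```

The outcomes HT and TH both map to a net prize of $0, which is why that value collects probability 1/4 + 1/4.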

34
Activity
• Each of the tables below lists certain values x of a
discrete random variable X. Determine whether or not
each table is a valid probability distribution.

Table 1        Table 2        Table 3
x    P(x)      x    P(x)      x    P(x)
1    .08       1    .15       7    .70
2    .22       2    .34       8    .50
3    .39       3    .28       9    -.20
4    .27       4    .23
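
The two properties of a discrete probability distribution translate directly into a check. The sketch below applies it to the three tables, read left to right; the function name `is_valid_distribution` and the tolerance are our own choices, not part of the lecture.

```python
import math

def is_valid_distribution(probabilities):
    """Return True if every probability is >= 0 and they sum to 1."""
    non_negative = all(p >= 0 for p in probabilities)
    # A small tolerance allows for floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=1e-9)
    return non_negative and sums_to_one

first = [0.08, 0.22, 0.39, 0.27]
second = [0.15, 0.34, 0.28, 0.23]
third = [0.70, 0.50, -0.20]

for name, table in [("first", first), ("second", second), ("third", third)]:
    print(name, is_valid_distribution(table))
```

Note that the third table sums to 1 but still fails, because one of its entries is negative.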

35
Charting the Discrete Probability Distribution.
• So far we have presented probability
distributions in tabular form
• It seems logical that we can also present them in
graphical form, charting the values of the
random variable against the probability that the
random variable assumes those values
• We can proceed to show a chart of the only valid
probability distribution among the three tables,
the one with probabilities .15, .34, .28 and .23.
• We call such a chart the graph of the probability
distribution.

36
Graph of a Discrete Probability Distribution.

[Bar chart: P(x) on the vertical axis (0 to 0.4) against x = 1, 2, 3, 4 on the horizontal axis]

37
What about the Discrete Cumulative
Probability Distribution?
Cumulative Probabilities can be of the form:
• P( X < a )
• P( X ≤ a )
• P( X > a )
• P( X ≥ a )
• P( a < X < b )
• P( a > X > b )
• P( a ≤ X ≤ b )
• P( a ≥ X ≥ b )
38
Example of a Discrete Cumulative Probability
Distribution
Probability Distribution      Cumulative Probability Distribution
x    P(x)                     x    P(X ≤ x)
1    .15                      1    .15
2    .34                      2    .49
3    .28                      3    .77
4    .23                      4    1.00
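
Each cumulative probability is a running total of the individual probabilities. A minimal Python sketch, assuming the four values and probabilities from the table above:

```python
from itertools import accumulate

xs = [1, 2, 3, 4]
probs = [0.15, 0.34, 0.28, 0.23]

# Running totals give P(X <= x); rounding hides floating-point noise.
cumulative = [round(c, 2) for c in accumulate(probs)]

for x, p, c in zip(xs, probs, cumulative):
    print(x, p, c)
```

The last running total must equal 1, since it adds up every probability in the distribution.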

39
Graph of the Discrete Cumulative Probability
Distribution
[Chart: Cumulative Probability on the vertical axis (0 to 1.2) against x = 1, 2, 3, 4 on the horizontal axis]

40
Back to the Example
X          -4    0     4
P(X = x)   1/4   2/4   1/4
• Calculate the following probabilities:
• P(X < $4)
• P(X > $0)
• P(X ≤ $4)
• P(X = $2)
• P(X < -$1)
41
X          -4    0     4
P(X = x)   1/4   2/4   1/4
• P(X < $4)
= P(X = $0 or -$4) (Remember your Addition Rule?)
= 0.50 + 0.25 = 0.75
• P(X > $0)
= P(X = $4) = 0.25
• P(X ≤ $4)
= P(X = -$4 or $0 or $4)
= 0.25 + 0.50 + 0.25 = 1
• P(X = $2)
= 0
• P(X < -$1)
= P(X = -$4) = 0.25
42
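
Each of these answers is an application of the addition rule: add P(X = x) over the values x that satisfy the condition. The helper `prob` below is a hypothetical name used only for this sketch.

```python
from fractions import Fraction

pmf = {-4: Fraction(1, 4), 0: Fraction(2, 4), 4: Fraction(1, 4)}

def prob(event):
    """Add up P(X = x) over every value x for which event(x) is true."""
    return sum((p for x, p in pmf.items() if event(x)), Fraction(0))

print(prob(lambda x: x < 4))
print(prob(lambda x: x > 0))
print(prob(lambda x: x <= 4))
print(prob(lambda x: x == 2))
print(prob(lambda x: x < -1))
```

A value such as $2 that X can never take simply contributes nothing to the sum, so its probability is 0.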
Probability Distribution for a Continuous
Random Variable
• We define the probability distribution for continuous
random variables differently.

• Recall that we cannot count the values assumed by a
continuous random variable.

• The number of values taken on by the variable in
any interval is infinite.

• Accordingly, we modify the approach used for the
discrete random variables.
43
Definitions
• The probability distribution function of a continuous
random variable is called a probability density
function

• The graph of this function is called a probability
density curve

• The probability that the continuous random variable
lies between any two given values a and b (i.e.
P(a < X < b)) is given by the area under the probability
density curve bounded by the lines x = a and x = b.
44
Area under the Density Curve

• Recall that the heights of the bars in the relative
frequency histogram sum to 1. Hence the area under the
histogram is 1.

• The relative frequency polygon has an area that
approximates the area of the histogram i.e. the area
under the polygon is approximately 1.

• The smoothed relative frequency polygon is the
probability density curve. Hence the area
under the curve also equals 1.
45
Probability for Continuous Random Variables

• Hence all probabilities will lie in the range of 0 to
1 inclusive

• Also, the probability that the random variable
takes a value somewhere in its full range equals the
entire area under the curve i.e. an area of 1.

• The axioms of probability hold.

46
Probability for Continuous Random Variables

Further, the probability that the continuous
random variable X assumes a single value is
seen to be the area of a bar with zero width,
i.e. such an area equals zero.

In other words, for continuous random
variables:
P(X = a) = 0 and P(X = b) = 0

47
Generic Probabilities for Continuous Random Variables
Hence we refer to these generic probabilities:
• P( X < a )
• P( X > a )
• P( a < X < b ).
In addition, note that:
• P( a ≤ X ≤ b ) = P( a < X < b )
• P( a ≤ X < b ) = P( a < X < b )
• P( a < X ≤ b ) = P( a < X < b )

We will come back to this in the next lecture.


48
Expected Value Of A Discrete Random Variable
X          -4    0     4
P(X = x)   1/4   2/4   1/4

• Recall the example of tossing a coin twice, and being paid $2
for H and losing $2 for T
• Suppose that we played this game repeatedly, say 4000 times.
• From the probability distribution we can say that:
– Net prize money of -$4 is expected in 25% of the games i.e.
in 1000 games
– Net prize money of $0 is expected in 50% of the games i.e.
in 2000 games
– Net prize money of $4 is expected in 25% of the games i.e.
in 1000 games.

49
Expected Value Of A Discrete Random Variable
What then would be our average net prize money?
\[ \text{Avg. Net Prize Money} = \frac{\text{Total Net Prize Money}}{\text{No. of games played}} = \frac{1000(-\$4) + 2000(\$0) + 1000(\$4)}{4000} \]

Simplifying we get the Avg. Net Prize Money to be
equal to
\[ \tfrac{1}{4}(-\$4) + \tfrac{2}{4}(\$0) + \tfrac{1}{4}(\$4) = \$0 \]
50
Expected Value Of A Discrete Random Variable
• We can conclude that if we played the coin
game a large number of times, on average, we
can expect to win nothing (and, by extension,
lose nothing).

• The value of $0 so computed is called the
expected value E(X) of the discrete random
variable X.

• What does this mean?
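
One way to see what a long-run average means is to simulate the coin-tossing game many times. The sketch below is illustrative: the seed is arbitrary and `play_once` is a name we made up.

```python
import random

random.seed(1005)  # arbitrary seed so the run is repeatable

def play_once():
    """Toss a fair coin twice: +2 for each Head, -2 for each Tail."""
    return sum(2 if random.random() < 0.5 else -2 for _ in range(2))

n_games = 4000
average = sum(play_once() for _ in range(n_games)) / n_games

print(average)  # should be close to 0, though rarely exactly 0
```

Any single game pays -$4, $0 or $4, yet the average over many games settles near $0, which is the sense in which $0 is "expected".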


51
Expected Value Of A Discrete Random Variable
• In short, E(X) is called the long run average
value of the random variable.

• We can calculate E(X) from a discrete
probability distribution by multiplying each
value of the random variable X by its
corresponding probability and summing the
resulting products.
\[ E(X) = \sum_{i} x_i \, P(x_i) \]

• E(X) is also seen as the mean of the discrete
random variable X.
52
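
The multiply-and-sum rule is short enough to write as a function. This is a minimal sketch, assuming the distribution is stored as a dictionary from values to probabilities; `expected_value` is our own name.

```python
from fractions import Fraction

def expected_value(pmf):
    """Multiply each value by its probability and sum the products."""
    return sum(x * p for x, p in pmf.items())

coin_game = {-4: Fraction(1, 4), 0: Fraction(2, 4), 4: Fraction(1, 4)}
print(expected_value(coin_game))
```

Applied to the coin-tossing game, this reproduces the $0 obtained from the 4000-game argument.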
Activity
• A random variable X assumes the values of –1, 0, 1
and 2 only. The probability that X assumes the value
x is given by
\[ P(X = x) = \frac{(x - 3)^2}{30} \]

• What kind of random variable is X?

• Show that the information above constitutes a
Probability Distribution.

• Find the mean value of X.
53


• X is a discrete random variable, since it takes discrete values
(-1, 0, 1, 2)
\[ P(X = x) = \frac{(x - 3)^2}{30} \]
• Therefore:
– P(X = -1) = 16/30
– P(X = 0) = 9/30
– P(X = 1) = 4/30
– P(X = 2) = 1/30

X          -1     0     1     2
P(X = x)   16/30  9/30  4/30  1/30

• Does this constitute a probability distribution? All
probabilities lie between 0 and 1, and they all sum to 1, so
YES.
54
X          -1     0     1     2
P(X = x)   16/30  9/30  4/30  1/30

• The mean or expected value of X is:
\[ E(X) = \sum_{i} x_i \, P(x_i) \]

• E(X) =
(-1)(16/30) + (0)(9/30) + (1)(4/30) + (2)(1/30)
= -16/30 + 0 + 4/30 + 2/30
= -10/30 = -1/3
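
The working for this activity can be verified with exact fractions. The sketch below builds the distribution from the formula (x - 3)^2 / 30; the variable names are ours.

```python
from fractions import Fraction

values = (-1, 0, 1, 2)
pmf = {x: Fraction((x - 3) ** 2, 30) for x in values}

total = sum(pmf.values())                  # should be 1 for a valid distribution
mean = sum(x * p for x, p in pmf.items())  # E(X)

print(total)
print(mean)
```

A negative mean is not a problem here: it reflects that most of the probability sits on the value -1.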
55
Expected Value Of A Continuous Random Variable
• Recall the fundamental difference in the
characterization of probability for a continuous
variable i.e. it is area under the probability
density curve.

• How then do we find and interpret E(X) for
continuous variables?

56
Expected Value Of A Continuous Random Variable
E(X) is computed using integration:
\[ E(X) = \int_{-\infty}^{+\infty} x \, f(x) \, dx \]
where f(x) is the probability density function
of the random variable X.
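
As a small worked illustration, take a hypothetical density that is not from the lecture: f(x) = 1/2 for 0 ≤ x ≤ 2 and f(x) = 0 elsewhere. The first line confirms that the area under f is 1; the second applies the formula for E(X).

```latex
\[ \int_{-\infty}^{+\infty} f(x)\,dx = \int_{0}^{2} \tfrac{1}{2}\,dx = 1 \]
\[ E(X) = \int_{-\infty}^{+\infty} x\,f(x)\,dx
        = \int_{0}^{2} \tfrac{x}{2}\,dx
        = \left[ \tfrac{x^{2}}{4} \right]_{0}^{2} = 1 \]
```

The result sits at the midpoint of the interval, as one would expect for a density that is flat over 0 to 2.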

We will come back to this in the next lecture.

57
Properties Of Expectation
Let X be any random variable. X can be discrete
or continuous.

• If a is a constant then E(a) = a

• If a is a constant then E(aX) = a E(X)

• If b is a constant then E(X + b) = E(X) + b

• If a & b are constants then E(aX + b) = a E(X) + b
58


Properties Of Expectation

• If X and Y are two distinct random variables
then E(X + Y) = E(X) + E(Y)

• If g(X) and h(X) are two distinct functions
defined on X then
E[g(X) + h(X)] = E[g(X)] + E[h(X)]
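
These properties can be spot-checked numerically on the coin-tossing distribution. The sketch below tests E(aX + b) = a E(X) + b for one arbitrary choice of constants (a = 3, b = 5, chosen by us); a check on one example illustrates the rule but does not prove it.

```python
from fractions import Fraction

pmf = {-4: Fraction(1, 4), 0: Fraction(2, 4), 4: Fraction(1, 4)}

def expectation(g):
    """E[g(X)] for the discrete distribution stored in pmf."""
    return sum(g(x) * p for x, p in pmf.items())

a, b = 3, 5  # arbitrary constants for the check
lhs = expectation(lambda x: a * x + b)
rhs = a * expectation(lambda x: x) + b

print(lhs, rhs)
```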

59
Variance Of Discrete Random Variables
• E(X) has been shown to be equal to the mean of the
random variable
• The Mean E(X) highlights where the probability
distribution of X is centred
• There should be an associated measure of dispersion
for the variable since the mean alone does not give
an adequate description of the shape of the
distribution.
• Unsurprisingly, variance is that measure
• It is a measure of how the values of the variable X
are spread out or dispersed from the mean
60
Variance Of Discrete Random Variables
Consider a discrete random variable X taking on
values \( x_1, x_2, x_3, \ldots, x_n \) with associated
probabilities \( p_1, p_2, p_3, \ldots, p_n \) respectively.

• Variance can be written as
\[ \mathrm{Var}(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2 \]
which simplifies to \( \sum_{i=1}^{n} p_i (x_i - \mu)^2 \), where \( \mu = E(X) \).

• Standard Deviation of X is the square root of the
Variance of X.
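
The variance formula can be sketched in code in the same way as the mean. The example applies it to the coin-tossing game from earlier in the lecture; the function names are our own.

```python
import math
from fractions import Fraction

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Probability-weighted average of squared deviations from the mean."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

coin_game = {-4: Fraction(1, 4), 0: Fraction(2, 4), 4: Fraction(1, 4)}
var = variance(coin_game)
sd = math.sqrt(var)  # standard deviation is the square root of the variance

print(var, sd)
```

The variance is in squared dollars, which is why the standard deviation is usually the easier figure to interpret.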
61
Variance Of Continuous Random Variables
• Consider a continuous random variable X with
density function f(x)

\[ \text{Variance} = \int_{-\infty}^{+\infty} f(x) \, (x - \mu)^2 \, dx \]

Standard Deviation of X is the square root of the
Variance of X.

62
Properties Of Variance
Let X be any random variable. X can be discrete
or continuous.

• If a is a constant then Var(a) = 0.
• If a is a constant then Var(aX) = a² Var(X).
• If b is a constant then Var(X + b) = Var(X).
• If a and b are constants, Var(aX + b) = a² Var(X).
• If X and Y are two independent random
variables then Var( X + Y) = Var(X) + Var(Y).
63
Class Activity
Find the mean and variance of the discrete
random variable X whose probability
distribution is given by
x P(x)
0 1/8
1 3/8
2 3/8
3 1/8

64
Activity: Mean
\[ \mu = \sum_{i=1}^{N} X_i \, P(X_i) \]

X P(X):   0   3/8   6/8   3/8

\[ E(X) = \sum_{i=1}^{N} x_i \, P(X = x_i) = \frac{12}{8} = 1\tfrac{1}{2} \]
65
Activity: Variance

X − μ:            -1 1/2   -1/2   1/2    1 1/2
(X − μ)²:         2 1/4    1/4    1/4    2 1/4
(X − μ)² P(X):    9/32     3/32   3/32   9/32

\[ \sigma^2 = \sum_{i=1}^{N} (X_i - \mu)^2 \, P(X_i) = \frac{9}{32} + \frac{3}{32} + \frac{3}{32} + \frac{9}{32} = \frac{3}{4} \]
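
The rows of working above can be reproduced step by step. This sketch uses exact fractions and the distribution given in the class activity; the variable names are ours.

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mu = sum(x * p for x, p in pmf.items())

# One row of working per value: deviation, squared deviation, weighted term.
terms = []
for x, p in pmf.items():
    deviation = x - mu
    terms.append(deviation ** 2 * p)
    print(x, deviation, deviation ** 2, terms[-1])

var = sum(terms)
print(mu, var)
```

Note that `Fraction` prints values in lowest terms, so 6/8 appears as 3/4 and 24/32 as 3/4.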
66
Activity: Try it at home
• Remember the example at the beginning of
the class:
• You are a guest at the party of some wealthy
man, and the following game is played: you
are invited to throw two dice, and you are
paid the score of the two dice.
• Calculate your expected winnings from this
game
• Calculate the variance of these winnings

67
Some Special Probability Distributions
• Next week we will look at some special
discrete and continuous probability
distributions

• Two special discrete distributions are the
Binomial Distribution and the Poisson
Distribution

• One special continuous distribution, probably
the most famous one in Statistics, is the
Normal Distribution
68
End of Lecture
• This material is covered in P. S. Mann,
Chapter 5

• Please ensure that you revise this material
before next week’s class. It is VITAL that you
do so!

69
