200052 INTRODUCTION TO ECONOMIC METHODS

SUMMARY NOTES - WEEK 4

Required Reading:
Ref. File 4: Sections 4.7 to 4.9
Ref. File 5: Introduction and Sections 5.1 to 5.4

4. PROBABILITY THEORY CONTINUED

4.9 Sampling With and Without Replacement

Definition (Random Sample from a Statistical Population)


A random sample of ‘n’ elements from a statistical
population is such that every possible combination of ‘n’
elements from the population has an equal probability of
being in the sample.

Many experiments involve taking a random sample from a
finite population. If we sample with replacement, we
effectively return each observation to the population
before making the next selection. In this way the
population from which we are sampling remains the same
from one selection to the next; provided sampling is
random, the successive outcomes will be independent.

If we sample without replacement from a finite
population, the outcome of any one selection will depend
on the outcomes of all previous selections; the population
is reduced with each selection.

Example 4.16:
Suppose that in a given street 50 residents voted in the last
election. Of these, 15 voted for party ‘A’, 30 voted for
party ‘B’ and 5 voted for neither party ‘A’ nor ‘B’.
Suppose that one evening a candidate for the next election
visits the residents of the street to introduce herself. What
is the probability that the first two eligible voters she
meets voted for party ‘A’ at the last election? (3/35)

Here sampling is without replacement. Define the following
events:

A1: first person voted for party ‘A’
A2: second person voted for party ‘A’

We require

P(A1 ∩ A2) = P(A1) P(A2 | A1) = (15/50)(14/49) = 3/35
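This calculation can be checked exactly in a few lines of code; the sketch below (using Python's fractions module, so there is no floating-point rounding) simply restates the multiplication rule:

```python
from fractions import Fraction

# P(A1): 15 of the 50 voters chose party 'A'
p_a1 = Fraction(15, 50)
# P(A2 | A1): one 'A' voter has already been met, so 14 'A' voters remain among 49
p_a2_given_a1 = Fraction(14, 49)

# Multiplication rule: P(A1 and A2) = P(A1) * P(A2 | A1)
p_both = p_a1 * p_a2_given_a1
print(p_both)  # 3/35
```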

Example 4.17:
Consider the experiment of successively drawing 2 cards
from a deck of 52 playing cards. Define the following
events:

A1: ace on first draw
A2: ace on second draw

What is the probability of selecting 2 aces if sampling
(drawing) is (i) without replacement, and (ii) with
replacement? (1/221, 1/169)

Without replacement:

P(A1 ∩ A2) = P(A1) P(A2 | A1) = (4/52)(3/51) = 12/2652 = 1/221

With replacement:

P(A1 ∩ A2) = P(A1) P(A2) = (4/52)(4/52) = 1/169
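The contrast between the two sampling schemes can be checked the same way; a minimal sketch in Python using exact fractions:

```python
from fractions import Fraction

# Without replacement: the second draw conditions on the first (4 aces, then 3 of 51)
p_without = Fraction(4, 52) * Fraction(3, 51)
# With replacement: the two draws are independent, so probabilities simply multiply
p_with = Fraction(4, 52) * Fraction(4, 52)

print(p_without, p_with)  # 1/221 1/169
```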

Note: If we simultaneously select a sample of ‘n’ elements,
we are effectively sampling without replacement.

4.10 Probability Trees

Tree diagrams can be a useful aid in calculating the
probabilities of intersections of events (i.e. joint
probabilities).

Example 4.18:
Greasy Mo’s take-away food store offers special $10 meal
deals consisting of a small pizza or a kebab, together with
a can of soft drink, a milkshake or a cup of fruit juice.
Past experience has shown that 60% of meal deal buyers
choose a pizza (‘P’), 40% choose kebabs (‘K’), 75% choose
soft drink (‘S’), 20% choose a milkshake (‘M’) and 5%
choose fruit juice (‘J’). Assume the events ‘P’ and ‘K’ are
independent of the events ‘S’, ‘M’ and ‘J’. What is the
probability that a meal deal customer (chosen at random)
will choose a pizza and fruit juice? (0.03)

The tree diagram for this example can be drawn as below.

            S: 0.75    P(P ∩ S) = 0.6(0.75) = 0.45
  P: 0.6    M: 0.2     P(P ∩ M) = 0.6(0.2) = 0.12
            J: 0.05    P(P ∩ J) = 0.6(0.05) = 0.03

            S: 0.75    P(K ∩ S) = 0.4(0.75) = 0.30
  K: 0.4    M: 0.2     P(K ∩ M) = 0.4(0.2) = 0.08
            J: 0.05    P(K ∩ J) = 0.4(0.05) = 0.02

Thus P(P ∩ J) = 0.03, etc.
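The tree calculation amounts to multiplying along each branch; a small sketch (the dictionaries and event labels are illustrative, and independence of meal and drink choices is assumed, as in the example):

```python
# Branch probabilities from the tree; meal and drink choices are assumed independent
meals = {'P': 0.6, 'K': 0.4}
drinks = {'S': 0.75, 'M': 0.2, 'J': 0.05}

# Each leaf of the tree is a joint probability: P(meal and drink) = P(meal) * P(drink)
joint = {(m, d): pm * pd for m, pm in meals.items() for d, pd in drinks.items()}

print(round(joint[('P', 'J')], 2))    # 0.03, the answer to Example 4.18
print(round(sum(joint.values()), 6))  # 1.0 -- the six leaves exhaust the sample space
```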



5. PROBABILITY DISTRIBUTIONS OF DISCRETE
RANDOM VARIABLES

5.1 Probability Distributions and Random Variables

A probability distribution can be considered a theoretical
model for a relative frequency distribution of data from a
real life population.

For example, the probability distribution normally used
for the experiment of tossing a fair coin once and noting
whether a head (‘H’) or tail (‘T’) results can be written

P(H) = 1/2
P(T) = 1/2

This can be interpreted as saying that if the coin tossing
experiment were repeated many times, we would expect the
relative frequency of each outcome to approach one half.

A probability distribution thus specifies the probabilities
associated with the various outcomes of a statistical
experiment. It can take the form of a table, a graph or
some formula.

From now on we shall be concerned with the
characteristics of probability distributions. However, to
facilitate our study we shall now represent simple events
and events associated with statistical experiments by
values of random variables.

Definition (Random Variable)


A random variable X is a rule that assigns to each simple
event of a statistical experiment a unique numerical value.

The above definition can also be expressed in the following
slightly more mathematical way.

Alternative Definition (Random Variable)


A random variable X is a real valued function for which
the domain is the sample space of a statistical experiment.

Remember that by the term random experiment we mean
an experiment which gives rise to random outcomes.

In most statistical experiments of interest, outcomes give
rise to quantitative data that can be considered values of
the random variable being studied.

For example, if the experiment consists of selecting a
household at random and noting the number of children
in the household, we would naturally define

X = random variable representing the number of children
in a household.

X could thus take the values 0, 1, 2, 3, ... corresponding to
possible outcomes of the experiment.

In experiments which give rise to categorical or qualitative
data, a random variable can normally also be defined.

Example 5.1:
Consider the experiment of selecting a person at random
and noting their hair colour.

Here we could define X to be the random variable
representing hair colour, where

X  1 if the person’s hair colour is blonde


X 2 “ “ brown
X 3 “ “ grey
X 4 “ “ black
X 5 “ “ white
X 6 “ “ red

There are two basic types of random variables.

Definition (Discrete Random Variable)

A discrete random variable can only assume a finite or
countably infinite number of values.

(By countably infinite we mean that the values can be
listed in order, although the list is infinitely long.)

Definition (Continuous Random Variable)


A continuous random variable can assume any value in an
interval (finite or infinite).
8

Some examples of discrete random variables:


the number of errors on a typed page
the number of cars owned by a household

Some examples of continuous random variables:


the length of time between bus arrivals at a bus stop
the weight of an individual

At this stage we will concentrate on discrete random
variables.

Definition (Discrete Probability Distribution)


A discrete probability distribution lists a probability for, or
provides a means (e.g. a rule or formula) of assigning a
probability to, each value a discrete random variable can
take.

Suppose our random variable is called X. Then P(X = x)
represents the probability that the random variable takes
on the particular value ‘x’ (as a result of the outcome of
an experiment).

Properties of the Discrete Probability Distribution of a
Random Variable X:

• 0 ≤ P(X = x) ≤ 1 for all values of ‘x’

• Σ P(X = x) = 1, summing over all x

Example 5.2:
Consider again the experiment of tossing a fair die once
and noting the number of dots on the upward facing side
(X).

We have

P(X = 1) = P(X = 2) = P(X = 3)
= P(X = 4) = P(X = 5) = P(X = 6) = 1/6

and Σ P(X = x) = 1, summing over all x

At this point we can also introduce the concept of a
cumulative distribution function (or simply distribution
function) of a random variable (discrete or continuous).

Definition (Cumulative Distribution Function)

The cumulative distribution function of a random variable
X, denoted F(x), is defined as

F(x) = P(X ≤ x)

where ‘x’ is any real number.

(In the above definition, ‘x’ represents any real number,
not just the values that the random variable can take.)

Thus a cumulative distribution function shows the
probability of the random variable taking on values less
than or equal to some value ‘x’.
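As an illustration, the cumulative distribution function of a discrete random variable can be sketched in a few lines; a fair die is assumed here as the example distribution:

```python
from fractions import Fraction

# Fair die: P(X = x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    """F(x) = P(X <= x), defined for ANY real x, not just values X can take."""
    return sum(p for value, p in pmf.items() if value <= x)

print(F(3))    # 1/2
print(F(2.5))  # 1/3 -- same as F(2), since no die face lies in (2, 2.5]
print(F(6))    # 1
```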

5.2 Expected Values of Random Variables

It is of interest to have a measure of the centre of the
probability distribution of a random variable X. This role
is filled by the expected value of X.

Definition (Expected Value of a Discrete Random
Variable)
The expected value of a discrete random variable X is
defined as

E(X) = Σ x P(X = x), summing over all x

(A weighted average of all the values X can take)
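For instance, the weighted-average formula gives the familiar expected value of a fair die; a quick sketch:

```python
from fractions import Fraction

# Fair die: E(X) = sum over all x of x * P(X = x)
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
expected = sum(x * p for x, p in pmf.items())
print(expected)  # 7/2, i.e. 3.5
```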

If the statistical experiment considered generates values of
the random variable that coincide with the values in the
population considered, and the theoretical probability
distribution of the random variable matches the population
relative frequency distribution, then the mean of the
theoretical distribution of X will be the same as the
population mean μ. That is, E(X) = μ.

We will generally assume that our model (i.e. the
probability distribution) is correct, so the above holds.

Example 5.3:
Suppose you buy a lottery ticket for $10. The sole prize in
the lottery is $100,000 and 100,000 tickets are sold. If the
lottery is fair (i.e. each ticket sold has an equal chance of
winning), what will be your expected net gain (or loss)
from buying the lottery ticket?

(See video for solution)

Theorem (Expected Value of a Function of a Discrete
Random Variable)
Consider a function g(X) of a discrete random variable X.
The expected value of this function, if it exists, is given by

E[g(X)] = Σ g(x) P(X = x), summing over all x

There are several important properties related to expected
values.

Theorem 5.2 (Various Properties of Expected Values)

• If ‘c’ is any constant then

E(c) = c

• If ‘c’ is any constant and g(X) is any function of a
discrete or continuous random variable X then

E[cg(X)] = c E[g(X)]

• If gi(X) (i = 1, ..., k) are ‘k’ functions of a discrete or
continuous random variable X then

E[g1(X) + ... + gk(X)] = E[g1(X)] + ... + E[gk(X)]

• If h(X) and g(X) are two functions of a discrete or
continuous random variable X such that h(X) ≤ g(X)
for all X, then

E[h(X)] ≤ E[g(X)]

For example, E(X + Y) = E(X) + E(Y)

Note:
Two discrete random variables X and Y are independent if

P(X = x | Y = y) = P(X = x) for all values of x and y

(or equivalently P(Y = y | X = x) = P(Y = y) for all values
of x and y)

5.3 The Variance of a Random Variable

To gauge the dispersion of a random variable X about its
expected value or mean we can calculate the expected
value of its squared distance (X - E(X))² from the mean.
This is called the variance of the random variable X,
denoted Var(X).

Definition (Variance of a Random Variable)

The variance of any random variable X (discrete or
continuous) is given by

Var(X) = E[(X - E(X))²]

If X is a discrete random variable that can take ‘n’
different values (x1, x2, ..., xn), the above definition
specializes to

Var(X) = Σ [xi - E(X)]² P(X = xi), summing over i = 1, ..., n

Definition (Standard Deviation of a Random Variable)

The standard deviation of any random variable X (discrete
or continuous) is given by

SD(X) = √Var(X) = √E[(X - E(X))²]

Again assuming the probability distribution of X is an
accurate representation of the population relative
frequency distribution of X, we can write Var(X) = σ²,
where σ² is the population variance.

An alternative way of writing (and calculating) Var(X) is

Var(X) = E(X²) - [E(X)]²
       = (Σ x² P(X = x)) - [E(X)]²   (if X is discrete, summing over all x)
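Both expressions for the variance can be evaluated side by side; a sketch for a fair die, using exact arithmetic so the equality is verified without rounding error:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die
mean = sum(x * p for x, p in pmf.items())       # E(X) = 7/2

# Definition: Var(X) = E[(X - E(X))^2]
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())
# Shortcut: Var(X) = E(X^2) - [E(X)]^2
var_alt = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

print(var_def, var_alt)  # 35/12 35/12 -- the two forms agree
```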

Example 5.4:
Suppose a lottery offers 3 prizes: $1,000, $2,000 and
$3,000. 10,000 tickets are sold and each ticket has an
equal chance of winning a prize. Calculate the variance
and standard deviation of the random variable X
representing the value of the prize won by a ticket.

(See video for solution)



If we wish to determine the variance of a linear function
Y = g(X) = a + bX of a random variable X, the following
rule can be used

Var(Y) = Var(a + bX) = b²Var(X)

5.4 The Binomial Distribution

The binomial distribution is a discrete probability
distribution based on ‘n’ repetitions of an experiment
whose outcomes are represented by a Bernoulli random
variable.

(a) Bernoulli Experiments

A Bernoulli experiment (or trial) is such that only 2
outcomes are possible. These outcomes can be denoted
success (‘S’) and failure (‘F’), with probabilities ‘p’ and
(1 - p), respectively.

A Bernoulli random variable Y is usually defined so that it
takes the value 1 if the outcome of a Bernoulli experiment
is a success, and the value 0 if the outcome is a failure.

Thus

P(Y = 1) = p
P(Y = 0) = 1 - p

The mean and variance of a Bernoulli random variable
defined in the above way are

E(Y) = p
Var(Y) = p(1 - p)

An example of a Bernoulli experiment is the tossing of a
fair coin, denoting a head as a success (Y = 1) and a tail as
a failure (Y = 0), with p = 1/2.

(b) Binomial Experiments

Definition (Binomial Experiment)


A binomial experiment fulfils the following requirements:

(i) There are ‘n’ repetitions or ‘trials’ of a Bernoulli
experiment for which there are only two
outcomes, ‘success’ or ‘failure’.
(ii) All trials are performed under identical
conditions.
(iii) The trials are independent.
(iv) The probability of success ‘p’ is the same for each
trial.
(v) The random variable of interest, say X, is the
number of successes observed in the ‘n’ trials.

Theorem (The Binomial Probability Function)

Let X represent the number of successes in a binomial
experiment consisting of ‘n’ trials and with a probability
‘p’ of success on each trial. The probability of ‘x’
successes in such an experiment is given by

P(X = x) = nCx p^x (1 - p)^(n-x)   for x = 0, 1, 2, 3, ..., n

(See reference file for proof if interested)
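The probability function translates directly into code; a sketch using math.comb for nCx, with the air conditioning numbers from Example 5.5 (n = 10, p = 0.7) as inputs:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example 5.5 setup: n = 10 units installed, p = 0.7 chance each needs servicing
n, p = 10, 0.7
print(round(binom_pmf(5, n, p), 4))   # P(exactly 5 require servicing)
print(round(binom_pmf(0, n, p), 6))   # P(none do) = 0.3^10
print(round(binom_pmf(10, n, p), 4))  # P(all do) = 0.7^10

# Sanity check: the probabilities over x = 0, ..., n sum to 1
print(round(sum(binom_pmf(x, n, p) for x in range(n + 1)), 6))  # 1.0
```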

Example 5.5:
A company that supplies reverse-cycle air conditioning
units has found from experience that 70% of the units it
installs require servicing within the first 6 weeks of
operation. In a given week the firm installs 10 air
conditioning units. Calculate the probability that, within 6
weeks
 5 of the units require servicing
 none of the units require servicing
 all of the units require servicing

(See video for solution)



(c) Cumulative Binomial Probabilities

The calculation of cumulative binomial probabilities of the
form P(X ≤ c) is often tedious, even using a calculator.
However, tables to determine such probabilities are
available. (See Reference files appendix Table 3)

(Extract of Appendix 3)
CUMULATIVE BINOMIAL PROBABILITIES: P(X ≤ x | p, n)

                                          p
n   x   0.05    0.10    0.15    0.20    0.25    0.30    0.35    0.40   ....  0.70
1   0   0.9500  0.9000  0.8500  0.8000  0.7500  0.7000  0.6500  0.6000 ....  0.3000
    1   1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000       1.0000

2   0   0.9025  0.8100  0.7225  0.6400  0.5625  0.4900  0.4225  0.3600       0.0900
    1   0.9975  0.9900  0.9775  0.9600  0.9375  0.9100  0.8775  0.8400 ....  0.5100
    2   1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000       1.0000

3   0   0.8574  0.7290  0.6141  0.5120  0.4219  0.3430  0.2746  0.2160       0.0270
    1   0.9928  0.9720  0.9393  0.8960  0.8438  0.7840  0.7183  0.6480 ....  0.2160
    2   0.9999  0.9990  0.9966  0.9920  0.9844  0.9730  0.9571  0.9360       0.6570
    3   1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000       1.0000

...

10  0   0.5987  0.3487  0.1969  0.1074  0.0563  0.0282  0.0135  0.0060       0.0000
    1   0.9139  0.7361  0.5443  0.3758  0.2440  0.1493  0.0860  0.0464       0.0001
    2   0.9885  0.9298  0.8202  0.6778  0.5256  0.3828  0.2616  0.1673       0.0016
    3   0.9990  0.9872  0.9500  0.8791  0.7759  0.6496  0.5138  0.3823       0.0106
    4   0.9999  0.9984  0.9901  0.9672  0.9219  0.8497  0.7515  0.6331       0.0473
    5   1.0000  0.9999  0.9986  0.9936  0.9803  0.9527  0.9051  0.8338 ....  0.1503
    6   1.0000  1.0000  0.9999  0.9991  0.9965  0.9894  0.9740  0.9452       0.3504
    7   1.0000  1.0000  1.0000  0.9999  0.9996  0.9984  0.9952  0.9877       0.6172
    8   1.0000  1.0000  1.0000  1.0000  1.0000  0.9999  0.9995  0.9983       0.8507
    9   1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  0.9999       0.9718
    10  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000       1.0000
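Entries of such a table can be reproduced by summing the binomial probability function; a sketch checking the n = 10, p = 0.70, x = 5 entry:

```python
from math import comb

def binom_cdf(c, n, p):
    """Cumulative probability P(X <= c) = sum of P(X = x) for x = 0, ..., c."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(c + 1))

# The n = 10, p = 0.70 row: P(X <= 5)
print(round(binom_cdf(5, 10, 0.70), 4))  # 0.1503
```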

Example 5.6:
Referring to the previous air conditioning unit example,
calculate the probability that within 6 weeks of installation

 less than 8 of the air conditioners require servicing.


 4 or more of the air conditioners require servicing.

(See video for solution)



Example 5.7:
Again referring to the previous air conditioning unit
example, use the cumulative binomial tables to calculate
the probability that within 6 weeks of installation

 5 units require servicing


 10 units require servicing

(See video for solution)



(d) Characteristics of the Binomial Distribution

Theorem (Mean and Variance of a Binomial Random
Variable)
Let X represent the number of successes in a binomial
experiment consisting of ‘n’ trials, and where the
probability of success on each trial is ‘p’. Then

E(X) = np
Var(X) = np(1 - p)

For example, the mean and variance of the binomial
distribution of the previous air conditioning unit example
are 10(0.7) = 7 and 10(0.7)(0.3) = 2.1, respectively.
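The shortcut formulas E(X) = np and Var(X) = np(1 - p) can be checked against the long way round, weighting each value by its probability; the air conditioning parameters are reused:

```python
from math import comb

n, p = 10, 0.7
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

# Direct computation from the full distribution
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(round(mean, 6))      # 7.0  (= np)
print(round(variance, 6))  # 2.1  (= np(1 - p))
```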

Each combination of ‘n’ and ‘p’ gives a particular
binomial distribution. We say ‘n’ and ‘p’ are the
parameters of the binomial distribution.

If p = 0.5, the binomial distribution is symmetric.

Example 5.8:
Suppose n = 5 and p = 0.5

(Probability histogram: bars at X = 0, 1, 2, 3, 4, 5 with
heights 0.0313, 0.1563, 0.3125, 0.3125, 0.1563 and 0.0313,
respectively)
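The histogram's heights are just the binomial probabilities for n = 5, p = 0.5; a one-line check (0.5^5 = 1/32 is exact in binary, so the printed values are exact):

```python
from math import comb

# Binomial probabilities for n = 5, p = 0.5
pmf = [comb(5, x) * 0.5 ** 5 for x in range(6)]
print(pmf)  # [0.03125, 0.15625, 0.3125, 0.3125, 0.15625, 0.03125]
```

The symmetry is visible directly: the list reads the same forwards and backwards.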

The binomial distribution will be skewed to the left (i.e.
‘negatively skewed’) if p > 0.5, and skewed to the right
(i.e. ‘positively skewed’) if p < 0.5. In either case the
tendency to be skewed diminishes as ‘n’ increases.

(See the diagrams in the reference file.) This is a
characteristic which is useful in approximating binomial
probabilities, as we shall see later.

MAIN POINTS

• If we sample without replacement from a finite
population, the outcome on any draw will depend on the
outcomes of all previous draws.

• Sampling with replacement from a finite population is
‘equivalent’ to sampling from an infinite population.

• Tree diagrams can facilitate the calculation of joint
probabilities (i.e. the probabilities of intersections of
events).

• A probability distribution can be interpreted as a model
for the relative frequency distribution of some real
statistical population. In any given situation, the model
may or may not represent the relative frequency
distribution exactly.

• It is convenient to associate the outcomes of a statistical
experiment with values of a random variable (e.g. X).
We can then think in terms of the probability
distribution of the random variable.

• The mean (expected value) and variance of a discrete
random variable are given by

E(X) = Σ x P(X = x)   (= μ)

Var(X) = E[(X - E(X))²]
       = Σ (x - E(X))² P(X = x)   (= σ²)

with sums taken over all x.

• The binomial distribution is a model for the relative
frequency (probability) distribution of numbers of
successes in ‘n’ trials of a Bernoulli experiment.

• The binomial distribution can be represented by the
probability function

P(X = x) = nCx p^x (1 - p)^(n-x)

where ‘n’ is the number of trials, ‘x’ the number of
successes and ‘p’ the probability of success at each trial.
