

Dougherty

Introduction to Econometrics,
5th edition
Review: Random Variables,
Sampling, Estimation, and
Inference

© Christopher Dougherty, 2016. All rights reserved.


THE DOUBLE STRUCTURE OF A SAMPLED RANDOM VARIABLE

Random variable X with unknown population mean $\mu_X$

[Figure: probability density function of X, centered on $\mu_X$]

Sample of n observations X1, X2, ..., Xn: potential distributions

[Figure: potential distributions of X1, X2, ..., Xn, each centered on $\mu_X$]

Suppose we have a random variable X and we wish to estimate its unknown population
mean $\mu_X$. Our first step is to take a sample of n observations {X1, …, Xn}.


Before we take the sample, while we are still at the planning stage, the Xi are random
quantities. We know that they will be generated randomly from the distribution for X, but
we do not know their values in advance.

So now we are thinking about random variables on two levels: the random variable X, and
the sample observations drawn randomly from its distribution.


Actual sample of n observations x1, x2, ..., xn: realization

[Figure: realized values x1, x2, ..., xn marked on the potential distributions of X1, X2, ..., Xn, each centered on $\mu_X$]

Once we have taken the sample we will have a set of numbers {x1, …, xn}. Statisticians
call this a realization. The lower case emphasizes that these are particular
numbers, not variables.

We base our plan on the potential distributions. Having generated a sample of n
observations {X1, …, Xn}, we plan to use them with a mathematical formula to estimate the
unknown population mean $\mu_X$.
Estimator: $\bar{X} = \frac{1}{n}(X_1 + \dots + X_n)$

This mathematical formula is known as an estimator. In this context, the standard (but not
only) estimator is the sample mean. An estimator is a random variable because it depends
on the random quantities {X1, …, Xn}.
Estimate: $\bar{x} = \frac{1}{n}(x_1 + \dots + x_n)$

The actual number that we obtain, given the realization {x1, …, xn}, is known as our
estimate.
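The distinction between estimator and estimate can be sketched in code. The following is only an illustration under assumptions that are not in the slides: the population is taken to be normal with a hypothetical mean of 10 and standard deviation of 2, and the names `sample_mean` and `draw_sample` are invented for the example.

```python
import random


def sample_mean(observations):
    """Estimator: the rule (1/n)(X1 + ... + Xn), written as a function."""
    return sum(observations) / len(observations)


def draw_sample(n, rng, mu=10.0, sigma=2.0):
    """Draw n independent observations from an assumed normal population.

    mu and sigma are hypothetical values chosen only for illustration.
    """
    return [rng.gauss(mu, sigma) for _ in range(n)]


rng = random.Random(0)

# One realization {x1, ..., xn} gives one number: the estimate.
realization = draw_sample(25, rng)
estimate = sample_mean(realization)

# Repeating the sampling gives a different estimate each time, which is
# why the estimator itself is treated as a random variable.
estimates = [sample_mean(draw_sample(25, rng)) for _ in range(5)]

print("estimate from one realization:", estimate)
print("estimates from five further samples:", estimates)
```

The function plays the role of the estimator, and each number it returns for a particular realization is an estimate.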

[Figure: two panels showing the probability density function of X and the probability density function of $\bar{X}$, each centered on $\mu_X$]

We will see why these distinctions are useful and important in a comparison of the
distributions of X and $\bar{X}$. We will start by showing that $\bar{X}$ has the same mean as X.


1  1
E  X   E   X 1  ...  X n    E  X 1  ...  X n 
n  n
1
  E  X 1   ...  E  X n  
n
1
   X  ...   X    X
n

We start by replacing $\bar{X}$ by its definition and then using expected value rule 2 to take 1/n out
of the expression as a common factor.


Next we use expected value rule 1 to replace the expectation of a sum with a sum of
expectations.


Now we come to the bit that requires thought. Start with X1. When we are still at the
planning stage, before we draw a particular sample, X1 is a random variable and we do not
know what its value will be.

All we know is that it will be generated randomly from the distribution of X. The expected
value of X1, as a beforehand concept, will therefore be $\mu_X$. The same is true for all the other
sample components, thinking about them beforehand. Hence each expectation in the sum can be
replaced by $\mu_X$, giving the third line of the derivation.

Thus we have shown that the mean of the distribution of $\bar{X}$ is $\mu_X$.
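This result can be checked informally by simulation. The sketch below is not part of the original slides. It assumes a normal population with a hypothetical mean of 10 and standard deviation of 2, draws many independent samples, and averages the resulting sample means.

```python
import random
import statistics

MU, SIGMA = 10.0, 2.0   # hypothetical population mean and standard deviation
N = 25                  # sample size
REPLICATIONS = 20_000   # number of simulated samples

rng = random.Random(1)

# Each replication draws a fresh sample and records its sample mean.
sample_means = []
for _ in range(REPLICATIONS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(sum(sample) / N)

# The average of the simulated sample means approximates E(X-bar).
average_of_sample_means = statistics.fmean(sample_means)

print("population mean:        ", MU)
print("average of sample means:", average_of_sample_means)
```

With this many replications the average of the sample means should lie very close to the assumed population mean, although a simulation can only illustrate the algebraic result, not prove it.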


[Figure: probability density functions of X and $\bar{X}$, both centered on $\mu_X$, with the distribution of $\bar{X}$ the narrower of the two]

We will next demonstrate that the variance of the distribution of $\bar{X}$ is smaller than that of X,
as depicted in the diagram.


1  1
  var   X 1  ...  X n    2 var X 1  ...  X n 
2
X
n  n
1
 2  var X 1   ...  var X n  
n
 X2
 2  X  ...   X   2  n X  
1 2 2 1 2

n n n
Sample of n observations X1, X2, ..., Xn: potential distributions
variance  X2 variance  X2 variance  X2

mX X1 mX X2 mX Xn

We start by replacing $\bar{X}$ by its definition and then using variance rule 2 to take 1/n out of the
expression as a common factor. Note that the factor is squared on the way out, so it becomes 1/n².


Next we use variance rule 1 to replace the variance of a sum with a sum of variances. In
principle there are many covariance terms as well, but they are zero if we assume that the
sample values are generated independently.
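The role of the covariance terms can be illustrated for the simplest case of two observations. The sketch below is an illustration only, not taken from the slides. It assumes two independently drawn normal variables with hypothetical parameters and uses a small helper, `covariance`, written for this example. It checks that the variance of the sum equals the sum of the variances plus twice the covariance, and that the sample covariance is close to zero when the draws are independent.

```python
import random
import statistics

rng = random.Random(2)
R = 20_000  # number of simulated (X1, X2) pairs

# Two observations generated independently from the same assumed population.
x1 = [rng.gauss(10.0, 2.0) for _ in range(R)]
x2 = [rng.gauss(10.0, 2.0) for _ in range(R)]


def covariance(a, b):
    """Population-style covariance of two equally long lists."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return statistics.fmean((u - mean_a) * (v - mean_b) for u, v in zip(a, b))


total = [u + v for u, v in zip(x1, x2)]

var_of_sum = statistics.pvariance(total)
sum_of_vars = statistics.pvariance(x1) + statistics.pvariance(x2)
cov = covariance(x1, x2)

print("var(X1 + X2):                  ", var_of_sum)
print("var(X1) + var(X2):             ", sum_of_vars)
print("cov(X1, X2):                   ", cov)
print("var(X1) + var(X2) + 2 cov:     ", sum_of_vars + 2 * cov)
```

In the simulated data the covariance is small but not exactly zero, so the two variances differ slightly. In the population, under independence, the covariance is exactly zero and the covariance terms drop out.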

Now we come to the bit that requires thought. Start with X1. When we are still at the
planning stage, we do not know what the value of X1 will be.


All we know is that it will be generated randomly from the distribution of X. The variance of
X1, as a beforehand concept, will therefore be $\sigma_X^2$. The same is true for all the other sample
components, thinking about them beforehand. Hence each variance in the sum can be replaced by
$\sigma_X^2$, giving the third line of the derivation.

Thus we have demonstrated that the variance of the sample mean is equal to the variance
of X divided by n, a result with which you will be familiar from your statistics course.
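A simulation can make the 1/n scaling visible. The sketch below is an illustration under assumptions not found in the slides: a normal population with a hypothetical mean of 10 and standard deviation of 2, independent draws, and three arbitrary sample sizes. For each sample size it compares the simulated variance of the sample mean with the population variance divided by n.

```python
import random
import statistics

SIGMA = 2.0            # hypothetical population standard deviation
REPLICATIONS = 20_000  # simulated samples per sample size

rng = random.Random(3)


def variance_of_sample_mean(n):
    """Simulate many samples of size n and return the variance of their means."""
    means = [
        statistics.fmean(rng.gauss(10.0, SIGMA) for _ in range(n))
        for _ in range(REPLICATIONS)
    ]
    return statistics.pvariance(means)


results = {n: variance_of_sample_mean(n) for n in (4, 16, 64)}

for n, simulated in results.items():
    print(f"n = {n:2d}  simulated = {simulated:.4f}  sigma^2 / n = {SIGMA**2 / n:.4f}")
```

Quadrupling the sample size should cut the variance of the sample mean to roughly a quarter, in line with the formula.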

Copyright Christopher Dougherty 2016

These slideshows may be downloaded by anyone, anywhere for personal use.


Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.

The content of this slideshow comes from Section R.5 of C. Dougherty,
Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oxfordtextbooks.co.uk/orc/dougherty5e//.

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2015.12.17
