Introduction
What is econometrics?
Econometrics is the application of statistics and economic theory to
data in order to test economic hypotheses.
Economic theory describes relationships between economic variables.
For example, the law of demand tells us that as prices go down, the
quantity demanded will go up.
However, as the owner of a firm or as a policymaker, we are often
interested in the magnitude of the relationship between two variables.
For example, if cigarette taxes increase, the quantity demanded falls.
By how much? What will be the impact on tax revenues?
To answer these questions, we need to know something about the
empirical relationship between cigarette prices and cigarette demand.
Years of Education
Example
In 1973, the Indonesian government decided that it was important to
provide equity across the country's provinces.
Indonesia undertook a massive school building program in which
over 61,000 primary schools were built within the next six years.
The intent of the program was to target new schools in areas where
enrollments were previously low which was likely due, in part, to the
long distances students had to travel to attend school.
Between 1973 and 1978, the school enrollment rates of 7 to 12 year old
Indonesians rose from 69 percent to 83 percent.
From the perspective of whether or not the program increased
education levels in Indonesia, it appears to have been successful.
In addition, we can also use economic theory to think about how the
program might impact those who were not directly affected by it.
An increase in the supply of educated workers will shift the labor
supply curve and therefore lead to a new lower equilibrium wage which
will indirectly affect those born before the school building program.
Duflo (2004) examines the impact of the school building program
on those born before the program took effect in their province.
She finds that the increase in educated workers due to the program
reduces the wages of workers in older age cohorts by 4 to 10 percent.
By thinking through the economic theory for how an increased supply
of workers will affect the economy overall, we can find implications
for how those who do not participate in a program may be affected.
I. Statistical Review
For this course, we will assume that everyone understands basic
probability and statistics.
However, we will spend the first two or three classes reviewing these
concepts for two reasons.
First, we want to be certain that everyone has seen the same topics
presented in a similar manner before moving on to econometrics.
Second, many of the statistical concepts you have previously seen will
be applied and extended in econometrics.
Reviewing these concepts will make it much easier to see the parallels
between what you already know and how it is applied in econometrics.
Random Variables and Probability Density Functions
The pdf of a discrete random variable X that takes on the values x_j, j = 1, 2, ..., k, is

f(x_j) = P(X = x_j),

where f(x_j) ≥ 0 and f(x_1) + f(x_2) + ... + f(x_k) = 1.

For two discrete random variables X and Y, the joint pdf is f(x, y) = P(X = x, Y = y), and the marginal pdfs of X and Y are f_X(x) = Σ_y f(x, y) and f_Y(y) = Σ_x f(x, y), respectively.

Conditional Probability
The pdf of Y given X is defined as

f(y | x) = f(x, y) / f_X(x).

Notice that f(y | x) is only defined if f_X(x) > 0.

Expected Value
If X is a discrete random variable and f is the pdf of X, the expected value of X is

E(X) = Σ_j x_j f(x_j).

We write E(X) = μ, or sometimes μ_X, and refer to μ as the population mean.

Properties of E(X):
1) If c is a constant, then E(c) = c.
2) If a and b are constants, then E(aX + b) = aE(X) + b.
3) If X_1, ..., X_n are random variables and a_1, ..., a_n are constants, then E(a_1 X_1 + ... + a_n X_n) = a_1 E(X_1) + ... + a_n E(X_n).
Variance
The variance measures the dispersion of a pdf:

Var(X) = E[(X - μ)²] = E(X²) - μ².

Properties of Var(X):
1) If c is a constant, then Var(c) = 0.
2) If a and b are constants, then Var(aX + b) = a² Var(X).

One issue with using the variance is that its units are the
square of the units of the random variable.
For example, if the random variable X is measured in feet, then
Var(X) is measured in feet squared.
In some instances it is useful to work with the positive square
root of the variance, which is known as the standard deviation
and is denoted sd(X) = σ.
Covariance
The covariance measures how two random variables move together:

Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)] = E(XY) - μ_X μ_Y.

Notice that if X tends to be above its mean whenever Y is above its
mean, then Cov(X, Y) > 0.

Properties of Cov(X, Y):
1) If X and Y are independent, then Cov(X, Y) = 0.
2) If a_1, a_2, b_1, b_2 are constants, then Cov(a_1 X + b_1, a_2 Y + b_2) = a_1 a_2 Cov(X, Y).

When X and Y are independent, E(XY) = E(X)E(Y).
Thus, Cov(X, Y) = E(XY) - μ_X μ_Y = 0.
Correlation Coefficient
The correlation coefficient offers an advantage over the
covariance since it is on a rather intuitive scale:

Corr(X, Y) = Cov(X, Y) / [sd(X) sd(Y)].

Notice that Corr(X, Y) has the same sign as Cov(X, Y).
In addition, -1 ≤ Corr(X, Y) ≤ 1.
Whereas Cov(X, Y) can take on any real value, Corr(X, Y)
allows us to scale the degree to which two variables co-vary:
+1 means X and Y are perfectly positively correlated.
-1 means X and Y are perfectly negatively correlated.

Variance of Sums of Random Variables
Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y).
If a = 1 and b = 1, then Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y),
while if a = 1 and b = -1, then Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y).
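The scale-free property of the correlation coefficient is easy to check numerically. The sketch below is illustrative only — the data and the scaling factor are made up — and uses NumPy to show that multiplying X by a positive constant rescales the covariance but leaves the correlation unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 0.5 * x + rng.normal(size=10_000)  # y built to co-vary with x

cov_xy = np.cov(x, y, ddof=0)[0, 1]    # Cov(X, Y)
corr_xy = np.corrcoef(x, y)[0, 1]      # Corr(X, Y), always in [-1, 1]

# Rescale X by a positive constant: the covariance scales by the
# same constant, but the correlation is unchanged.
a = 10.0
cov_ax = np.cov(a * x, y, ddof=0)[0, 1]
corr_ax = np.corrcoef(a * x, y)[0, 1]

print(cov_xy, cov_ax, corr_xy, corr_ax)
```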
Conditional Expectation
While the covariance and correlation treat the relationship between
X and Y symmetrically, in many instances we will be interested in
explaining a variable in terms of another variable.
For example, we may be interested in knowing whether earnings
depend upon an individual's level of education.
One set of statistics we might compute is the expected amount of
earnings for people conditional on their levels of education.
The conditional expectation for a discrete random variable Y given X = x is

E(Y | x) = Σ_j y_j f(y_j | x),

where Y takes on m different values y_1, y_2, ..., y_m and f(y | x) is the conditional pdf of Y given X.

Properties of Conditional Expectation
1) E[c(X) | X] = c(X) for any function c(X).
2) For functions a(X) and b(X), E[a(X)Y + b(X) | X] = a(X) E(Y | X) + b(X).
3) If X and Y are independent, then E(Y | X) = E(Y).
4) E[E(Y | X)] = E(Y).
This property is known as the law of iterated expectations.
We can first compute E(Y | X = x) for each value of x and then average
over the distribution of X.
5) If E(Y | X) = E(Y), then:
i. Cov(X, Y) = 0 (and so Corr(X, Y) = 0);
ii. in fact, every function of X is uncorrelated with Y.
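The law of iterated expectations can be verified directly on a small discrete joint distribution. This is an illustrative sketch; the joint probabilities below are made up.

```python
# Joint pmf f(x, y) for a toy discrete distribution (made-up numbers).
joint = {
    (0, 10): 0.2, (0, 20): 0.1,
    (1, 10): 0.3, (1, 20): 0.4,
}

# Marginal pmf of X: f_X(x) = sum over y of f(x, y)
fx = {}
for (x, y), p in joint.items():
    fx[x] = fx.get(x, 0.0) + p

# Conditional expectation E(Y | X = x) = sum_y y * f(x, y) / f_X(x)
def cond_exp_y(x):
    return sum(y * p for (xx, y), p in joint.items() if xx == x) / fx[x]

# E(Y) computed directly ...
ey_direct = sum(y * p for (_, y), p in joint.items())
# ... and via iterated expectations: E[E(Y | X)] = sum_x E(Y | X = x) f_X(x)
ey_iterated = sum(cond_exp_y(x) * px for x, px in fx.items())

print(ey_direct, ey_iterated)
```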
The Normal Distribution
The symbol ~ is used to denote the fact that X is normally
distributed with mean μ and variance σ²:

X ~ N(μ, σ²).

If X ~ N(μ, σ²), then Z = (X - μ)/σ ~ N(0, 1).
If X ~ N(μ, σ²), then aX + b ~ N(aμ + b, a²σ²).

Furthermore, any linear combination of independent, normally
distributed random variables is normally distributed.

[Figure: Probability density of the standard normal distribution, N(0, 1).]

The Chi-Square Distribution
Let X = Z_1² + Z_2² + ... + Z_n², where Z_1, Z_2, ..., Z_n are independent
N(0, 1) random variables. Then X has a chi-square distribution with n
degrees of freedom, written X ~ χ²_n.
[Figure: Chi-square probability density functions with 3, 5, and 7 degrees of freedom.]
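The chi-square construction can be checked by simulation: a sum of n squared independent standard normals should have mean n and variance 2n, the mean and variance of χ²_n. A sketch (illustrative; the seed and replication count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 5, 200_000

# Each row: the sum of n squared standard normals, i.e. a chi2_n draw.
z = rng.standard_normal(size=(reps, n))
x = (z ** 2).sum(axis=1)

print(x.mean())   # should be close to n = 5
print(x.var())    # should be close to 2n = 10
```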
The F Distribution
Suppose U and V are independent chi-square random variables
with n and m degrees of freedom, respectively.
A random variable of the form

F = (U/n) / (V/m)

is said to have an F distribution with (n, m) degrees of freedom, written F ~ F_{n,m}.
[Figure: F distribution probability density functions for (m=5, n=10), (m=10, n=10), and (m=20, n=10).]
The t Distribution
Let Z ~ N(0, 1) and let X ~ χ²_n, with Z and X independent. Then

T = Z / √(X/n)

has a t distribution with n degrees of freedom, which approaches
the standard normal distribution as n grows large.
[Figure: Probability densities of the Z ratio (standard normal) and of t ratios with 4 and 10 degrees of freedom.]
Sampling
In many instances, we will be interested in knowing the value
of one or more population parameters.
For example, if we want to know about the degree of income
inequality in society, we would be curious to know about the
expected value and variance of the population income
distribution.
If we have a Census, then we would be able to learn the true
characteristics of the income distribution.
However, interviewing everyone in the population is a very
costly exercise in terms of both time and money.
Random Sampling
Instead, we will observe a sample of the population and use
the sample to generate our best guess as to what the true
characteristics of the population distribution actually are.
Suppose that Y is a random variable with a probability density
function f(y; θ), where θ is an unknown parameter.
A random sample from f(y; θ) is n observations, {Y_1, Y_2, ..., Y_n},
that are drawn independently from the pdf f(y; θ).
We sometimes refer to the random sample Y_1, Y_2, ..., Y_n as
independent, identically distributed (i.i.d.) random variables.
Estimators
Given a random sample {Y_1, Y_2, ..., Y_n}, an estimator of the
parameter θ is a rule, W = h(Y_1, Y_2, ..., Y_n), that assigns a value
to each possible outcome of the sample.
After we collect the actual data, {y_1, y_2, ..., y_n}, and we
compute the estimator by using the values that we measure in
the sample, the resulting value is known as an estimate.

Example
The sample average, Ȳ = (1/n)(Y_1 + Y_2 + ... + Y_n), is an estimator
of the population mean, μ.
Unbiasedness
An estimator W of θ is said to be unbiased if E(W) = θ.
The bias of an estimator is Bias(W) = E(W) - θ.
Example
For the sample average, Ȳ, we have already seen that

E(Ȳ) = E[(1/n)(Y_1 + ... + Y_n)] = (1/n)[E(Y_1) + ... + E(Y_n)] = (1/n)(nμ) = μ.

The bias of the sample average is E(Ȳ) - μ = 0, which
means, as we have already seen, that Ȳ is unbiased.
Sampling Variance
The sampling variance of an estimator W is simply its variance, Var(W).

Example
We can compute the sampling variance of the sample average, Ȳ:

Var(Ȳ) = Var[(1/n)(Y_1 + ... + Y_n)] = (1/n²)[Var(Y_1) + ... + Var(Y_n)] = (1/n²)(nσ²) = σ²/n.
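Both results — E(Ȳ) = μ and Var(Ȳ) = σ²/n — can be checked by Monte Carlo. A sketch (illustrative; the distribution, parameter values, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n, reps = 5.0, 2.0, 25, 200_000

# Draw `reps` random samples of size n and compute Ybar for each one.
samples = rng.normal(mu, sigma, size=(reps, n))
ybar = samples.mean(axis=1)

# Unbiasedness: mean of Ybar across samples is close to mu = 5.
# Sampling variance: variance of Ybar is close to sigma^2/n = 4/25 = 0.16.
print(ybar.mean())
print(ybar.var())
```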
Efficiency
It is possible that we may encounter multiple estimators for
the same parameter.

Example
We have already seen that the sample average is an
unbiased estimator of the population mean, μ.
An alternative estimator is to only use the first
observation of the random sample, Y_1, as an estimator for
the population mean, μ.
Notice that this alternative estimator is also an unbiased
estimator since E(Y_1) = μ.

When two estimators of θ are both unbiased, we prefer the one with
the smaller sampling variance: W_1 is efficient relative to W_2 if
Var(W_1) ≤ Var(W_2).

Example
Var(Ȳ) = σ²/n, while Var(Y_1) = σ².
Therefore, Ȳ is more efficient than Y_1 since σ²/n ≤ σ² for all n ≥ 1.
Consistency
One useful property for an estimator is that, as the sample
grows infinitely large, the estimator collapses to the true parameter.
Formally, if W_n is an estimator of θ with a sample size n, then
W_n is a consistent estimator of θ if for every ε > 0,

lim_{n→∞} P(|W_n - θ| > ε) = 0.

If W_n does not converge to θ in this sense, then we say it is inconsistent.
In addition, if W_n is consistent, then we say that θ is the
probability limit of W_n, which is written as plim(W_n) = θ.
Example
Notice that as n → ∞, Var(Ȳ) = σ²/n → 0.
Therefore, since Ȳ is unbiased for μ and, in addition, Var(Ȳ) → 0,
Ȳ is a consistent estimator of μ.

Example
The estimator (1/n)Σ(Y_i - Ȳ)² of σ² has a bias of -σ²/n, which → 0
as n → ∞, so it is a consistent (but biased) estimator of σ².
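Consistency can be seen by simulation: as n grows, P(|Ȳ - μ| > ε) shrinks toward zero. The sketch below (illustrative; the distribution, ε, and seed are arbitrary) estimates this probability for increasing sample sizes.

```python
import numpy as np

rng = np.random.default_rng(7)
mu, sigma, eps, reps = 0.0, 1.0, 0.1, 20_000

# Estimate P(|Ybar - mu| > eps) for growing n; it should shrink to 0.
ps = []
for n in (10, 100, 1000):
    ybar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    ps.append(np.mean(np.abs(ybar - mu) > eps))

print(ps)
```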
Asymptotic Normality
In order to draw inferences, we need to know not only the
estimator, but we also need to know information about the
sampling distribution of the estimator.
Many econometric estimators are approximated by the normal
distribution as the sample size gets large.
Asymptotic standard normality is defined as follows.
Let {Z_n : n = 1, 2, ...} be a sequence of random variables such
that for all numbers z,

P(Z_n ≤ z) → Φ(z) as n → ∞,

where Φ is the standard normal cdf. We then say that Z_n is
asymptotically distributed N(0, 1).
Central Limit Theorem
If {Y_1, Y_2, ..., Y_n} is a random sample with mean μ and variance σ², then

Z_n = (Ȳ - μ) / (σ/√n)

is asymptotically distributed N(0, 1).

Example
Suppose that X_1, X_2, ..., X_n are independent random
variables, each of which is distributed N(μ, σ²).
We have already seen that linear combinations of independent
normals are normal, so in this case

(X̄ - μ) / (σ/√n) ~ N(0, 1)

holds exactly for every sample size, not just asymptotically.
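For non-normal data the result is only asymptotic, and a simulation makes this concrete. The sketch below (illustrative; the distribution, n, and seed are arbitrary) standardizes sample means of a heavily skewed distribution and checks that the result behaves like a standard normal.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 100_000

# Exponential(1) has mean 1 and variance 1 but is heavily skewed.
draws = rng.exponential(1.0, size=(reps, n))
z = (draws.mean(axis=1) - 1.0) / (1.0 / np.sqrt(n))  # (Ybar - mu)/(sigma/sqrt(n))

# If the CLT is at work, z has roughly mean 0, sd 1, and about 95%
# of its draws land inside [-1.96, 1.96].
coverage = np.mean(np.abs(z) <= 1.96)
print(z.mean(), z.std(), coverage)
```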
Confidence Intervals
[Figure: Standard normal density with area 0.95 between -z_{0.025} and z_{0.025}.]

We want the value z_{0.025} such that P(-z_{0.025} ≤ Z ≤ z_{0.025}) = 0.95.
From the standard normal table, z_{0.025} = 1.96, so

P(-1.96 ≤ Z ≤ 1.96) = 0.95.
[Table: Standard normal cdf, Φ(z), left tail (z from -3.0 to -1.8 in steps of 0.01).]

From the table, Φ(-1.96) = P(Z ≤ -1.96) = 0.025.
[Table: Standard normal cdf, Φ(z), right tail (z from 1.8 to 3.0 in steps of 0.01).]

From the table, Φ(1.96) = P(Z ≤ 1.96) = 0.975. Therefore,

P(-1.96 ≤ Z ≤ 1.96) = Φ(1.96) - Φ(-1.96) = 0.975 - 0.025 = 0.95.
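These tabulated values can be cross-checked with Python's standard library; a quick sketch:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, sd 1

lower = Z.cdf(-1.96)      # left-tail area, about 0.025
upper = Z.cdf(1.96)       # about 0.975
middle = upper - lower    # about 0.95
crit = Z.inv_cdf(0.975)   # about 1.96

print(lower, upper, middle, crit)
```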
Since (Ȳ - μ)/(σ/√n) ~ N(0, 1),

P(-1.96 ≤ (Ȳ - μ)/(σ/√n) ≤ 1.96) = 0.95.

Rearranging the inequalities to isolate μ gives

P(Ȳ - 1.96·σ/√n ≤ μ ≤ Ȳ + 1.96·σ/√n) = 0.95.

The interval from Ȳ - 1.96·σ/√n to Ȳ + 1.96·σ/√n is a 95%
confidence interval for μ.
Example
The height of white females who registered to vote in
Allegheny County, PA during the 1960s is normally
distributed with a variance of 6.25 (inches squared).
If a random sample of 9 women is selected and the
sample average height is Ȳ = 65.5, construct a 95%
confidence interval for the true average height, μ.

Noting that σ = √6.25 = 2.5, the 95% confidence interval for μ is

65.5 - 1.96·(2.5/√9) ≤ μ ≤ 65.5 + 1.96·(2.5/√9),

or 63.87 ≤ μ ≤ 67.13.
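This calculation is easy to replicate; a sketch using the numbers from the example:

```python
import math

ybar, sigma, n = 65.5, math.sqrt(6.25), 9
z = 1.96  # z_{0.025} from the standard normal table

half_width = z * sigma / math.sqrt(n)  # 1.96 * 2.5 / 3
lo, hi = ybar - half_width, ybar + half_width
print(round(lo, 2), round(hi, 2))      # 63.87 67.13
```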
In practice, the population variance σ² is rarely known, so we
replace σ with the sample standard deviation, S. The ratio
(Ȳ - μ)/(S/√n) then follows a t distribution with n - 1 degrees of
freedom, and the 95% confidence interval uses the critical value
t_{n-1, 0.025} in place of 1.96.
Critical values of the t distribution, by upper-tail probability:

 d.f.     0.10     0.05    0.025     0.01    0.005
   1     3.078    6.314   12.706   31.821   63.657
   2     1.886    2.920    4.303    6.965    9.925
   3     1.638    2.353    3.182    4.541    5.841
   4     1.533    2.132    2.776    3.747    4.604
   5     1.476    2.015    2.571    3.365    4.032
   6     1.440    1.943    2.447    3.143    3.707
  30     1.310    1.697    2.042    2.457    2.750
   ∞     1.282    1.645    1.960    2.326    2.576
By the same argument as in the known-variance case,

P(Ȳ - t_{n-1, 0.025}·S/√n ≤ μ ≤ Ȳ + t_{n-1, 0.025}·S/√n) = 0.95,

so the interval from Ȳ - t_{n-1, 0.025}·S/√n to Ȳ + t_{n-1, 0.025}·S/√n
is a 95% confidence interval for μ when σ is unknown.
Example
Returning to the height example of white females who
registered to vote in Allegheny County, PA during the
1960s, where height is normally distributed.
Suppose that for the random sample of n = 9 women,
we compute the sample average height Ȳ = 65.5 and the
sample variance S² = 8.5.

With 8 d.f., t_{8, 0.025} = 2.306 (for reference, t_{8, 0.01} = 2.896
and t_{8, 0.005} = 3.355). The 95% confidence interval is

65.5 - 2.306·√(8.5/9) ≤ μ ≤ 65.5 + 2.306·√(8.5/9),

or 63.26 ≤ μ ≤ 67.74.
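The t-based interval can be reproduced the same way; a sketch using the example's numbers (n = 9, Ȳ = 65.5, S² = 8.5), with the critical value read from the t table:

```python
import math

n, ybar, s2 = 9, 65.5, 8.5
tcrit = 2.306  # t_{8, 0.025}, read from the t table

half_width = tcrit * math.sqrt(s2 / n)
lo, hi = ybar - half_width, ybar + half_width
print(round(lo, 2), round(hi, 2))  # 63.26 67.74
```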
Hypothesis Testing
In hypothesis testing, we formulate a null hypothesis, H0, about a
population parameter along with an alternative hypothesis, H1, and
then use the sample to decide whether to reject H0 at a chosen
level of significance.
Example
In our height example, we may formulate a null
hypothesis that the true mean height of white female
registered voters in Allegheny County is 63 inches.
We would write

H0: μ = 63.

The alternative H1: μ ≠ 63 is a two-sided alternative hypothesis,
while H1: μ > 63 and H1: μ < 63
are called one-sided alternative hypotheses.
Confidence Interval Approach
Suppose n = 9, Ȳ = 65.5, and S = 2.92.
We will test H0: μ = 63 against H1: μ ≠ 63.

One approach is to check whether the hypothesized value, μ = 63, is
contained in the 95% confidence interval.
The 95% confidence interval is 63.26 ≤ μ ≤ 67.74. Since 63 lies
outside of this interval, we reject H0 at the 0.05 level of significance.
Test of Significance Approach
Alternatively, we can compute the t statistic under the null
hypothesis and reject H0 if |t| > t_{8, 0.025} = 2.306:

t = (65.5 - 63) / (2.92/√9) = 2.57.

Since |2.57| > 2.306, we reject H0 at the 0.05 level of significance.
[Figure: t density with the one-sided rejection region to the right of t_{0.05}, for testing H0: μ = 63 against H1: μ > 63.]
For the one-sided alternative H1: μ > 63, the critical value with
8 d.f. is t_{8, 0.05} = 1.860 (for reference, t_{8, 0.01} = 2.896 and
t_{8, 0.005} = 3.355), and we reject H0 if t > 1.860.
Since t = 2.57 > 1.860, we again reject H0 at the 0.05 level.
P-values
Our hypothesis testing proceeds by finding a critical value
and then testing whether either the sample average lies within
the confidence interval or the t-statistic exceeds a threshold.
Another approach is to ask: how likely is it that we would observe
the sample mean, Ȳ, that we find in our sample if the population
mean is really μ0?
In our height example, we would ask how likely it is that we
would observe Ȳ = 65.5 in our sample if the true population
mean is μ = 63 (the null hypothesis).
We proceed as before when we were using the test of
significance approach:

t = (65.5 - 63) / (2.92/√9) = 2.57.
Since t = 2.57 falls between t_{8, 0.025} = 2.306 and t_{8, 0.01} = 2.896
(and well below t_{8, 0.005} = 3.355), the p-value for the two-sided
test lies between 0.02 and 0.05.
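Since Python's standard library has no t distribution, the sketch below (illustrative) recomputes the t statistic and brackets the two-sided p-value using the critical values from the t table.

```python
t_stat = (65.5 - 63) / (2.92 / 9 ** 0.5)  # the t statistic from the example

# From the t table with 8 d.f.: t_{8,0.025} = 2.306 and t_{8,0.01} = 2.896.
# Since 2.306 < |t| < 2.896, the one-sided tail area is between 0.01 and
# 0.025, so the two-sided p-value lies between 0.02 and 0.05.
print(round(t_stat, 2))  # 2.57
```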
Testing the Equality of Two Means
Suppose we want to test whether the means of two populations are
equal: H0: μ1 - μ2 = 0 against H1: μ1 - μ2 ≠ 0.
We use a test that allows both the sample sizes and the population
variances to differ between the two samples.
To test this null hypothesis, suppose that we draw large
(n>30), independent samples from each population.
We make this assumption since the degrees of freedom
calculation is slightly complicated.
However, if the sample is large enough we can simply
approximate the sample distribution of the test statistic with
the standard normal distribution by application of the Central
Limit Theorem.
The test statistic is

t = (Ȳ_1 - Ȳ_2) / √(S_1²/n_1 + S_2²/n_2).

Notice the similarity between the t ratio for the test of equality
of two means and the t ratio that we used previously to test a
hypothesis about a population mean.
Example
We can test whether mean height differs between male and
female voters in Allegheny County.
We draw a new sample of 36 male and 36 female voters, where
the means of the sample are:

. summarize height sex

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      height |        72    67.29861    3.850791       59.5         75
         sex |         0
To test the null hypothesis that the mean heights of men and
women are equal, we need to compute the means and standard
deviations separately for men and women.

. summarize height if sex=="F"

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      height |        36    64.52778    2.850926       59.5         72

. summarize height if sex=="M"

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      height |        36    70.06944    2.481799         64         75
Women: Ȳ_W = 64.5, S_W = 2.85, and S_W² = 8.12.
Men:   Ȳ_M = 70.1, S_M = 2.48, and S_M² = 6.15.

Since our null hypothesis is H0: μ_W - μ_M = 0 while our
alternative is H1: μ_W - μ_M ≠ 0, we will perform a two-sided test.
At the 0.05 level of significance, the critical t value is 1.96,
so our decision rule is to reject H0 if |t| > 1.96.

t = (64.5 - 70.1) / √(8.12/36 + 6.15/36) = -8.89

Since |t| = |-8.89| > 1.96, we reject H0 and conclude that mean
height differs between male and female voters.
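The two-sample calculation can be reproduced directly; a sketch assuming the rounded summary statistics above:

```python
import math

# Summary statistics from the Stata output (rounded as in the slides).
ybar_w, s2_w, n_w = 64.5, 8.12, 36
ybar_m, s2_m, n_m = 70.1, 6.15, 36

se = math.sqrt(s2_w / n_w + s2_m / n_m)  # standard error of the difference
t_stat = (ybar_w - ybar_m) / se

print(round(t_stat, 2))      # -8.89
print(abs(t_stat) > 1.96)    # reject H0 at the 0.05 level
```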