Overview
Textbook
➢ Joshua D. Angrist, Jörn-Steffen Pischke: Mostly
Harmless Econometrics: An Empiricist's Companion
Reference Book
➢ Introductory:
1. Peter Kennedy: A Guide to Econometrics (Baby book)
2. Stock and Watson: Introduction to Econometrics
3. Michael P. Murray: Econometrics: A Modern Introduction
4. Philip H. Franses: Enjoyable Econometrics
5. Joshua D. Angrist, Jörn-Steffen Pischke: Mastering 'Metrics: The Path
from Cause to Effect
➢ Intermediate:
1. Jeffrey Wooldridge: Introductory Econometrics: A Modern Approach
➢ Advanced:
1. Jeffrey Wooldridge: Econometric Analysis of Cross Section and Panel
Data
2. Bruce Hansen: Econometrics
Syllabus--Grading
➢ Class Participation and Homework (15%), Data
Visualization Exercise (15%), Two Reading
Reports (20%), Final Exam (50%).
➢ All assignments and exams must be submitted
on time. Late work receives no points. If
there is a verifiable medical reason and
arrangements are made before the exam, an
adjustment may be made. Please take
academic integrity seriously.
What is Econometrics: Examples
Quantitative Features of Modern Economics
➢ Features:
-mathematical modeling for economic theory
-empirical analysis for economic phenomena.
➢ General methodology of modern economic research:
1. Data collection and summary of empirical stylized facts.
2. Development of economic theories/models
3. Empirical verification of economic models.
4. Applications: to test economic theory or hypotheses, to
forecast future evolution of the economy, and to make
policy recommendations.
What is Econometrics: Definition
Frisch (1933):
Econometrics is by no means the same as economic
statistics. Nor is it identical with what we call general
economic theory, although a considerable portion of this
theory has a definitely quantitative character. Nor should
econometrics be taken as synonymous with the application
of mathematics to economics. Experience has shown that
each of these three viewpoints, that of statistics, economic
theory, and mathematics, is a necessary, but not by itself a
sufficient condition for a real understanding of the
quantitative relations in modern economic life. It is the
unification of all three that is powerful. And it is this
unification that constitutes econometrics.
Limitation of Econometrics
➢ Econometrics is the analysis of the "average behavior" of
a large number of realizations. However, economic data
are not produced by a large number of repeated random
experiments, because an economy is not a controlled
experiment:
1. An economic theory or model can capture only the main or
most important factors.
2. An economy is an irreversible, non-repeatable system.
3. Economic relationships often change over time.
4. Data quality is often limited.
Background for Learning this Course
APPENDIX B: Probability and Distribution Theory
➢ Random Variables
➢ Expectations of a Random Variable
➢ Some Specific Probability Distributions
➢ The Distribution of a Function of a Random Variable
➢ Representations of a Probability Distribution
➢ Joint Distributions
➢ Conditioning in a Bivariate Distribution
➢ The Bivariate Normal Distribution
➢ Multivariate Distributions
➢ Moments
➢ The Multivariate Normal Distribution
Causal Effects and Idealized Experiments
Correlation or Causation?
(By Vali Chandrasekaran)
http://www.businessweek.com/magazine/correlation-or-causation-12012011-gfx.html
Extended Reading
Data: Source and Types
➢ There are several different kinds of economic data
sets:
➢ Cross-sectional data
➢ Time series data
➢ Pooled cross sections
➢ Panel/Longitudinal data
➢ Econometric methods depend on the nature of the
data used. Use of inappropriate methods may lead
to misleading results.
Types of Data – Cross-sectional Data
➢ Cross-sectional data are typically obtained as a random sample
from the population at a given point in time.
➢ The fact that the ordering of the data does not matter for
econometric analysis is a key feature of cross-sectional
data sets obtained from random sampling.
Types of Data – Time Series
➢ This includes observations of a variable or several variables
over time.
➢ Typical applications include applied macroeconomics and
finance. Examples include stock prices, money supply,
consumer price index, gross domestic product, annual
homicide rates, automobile sales, and so on.
➢ Time series observations are typically serially correlated.
➢ Ordering of observations conveys important information.
➢ Data frequency may include daily, weekly, monthly,
quarterly, annually, and so on.
➢ Typical features of time series include trends and
seasonality.
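Serial correlation and the role of ordering can be illustrated with a sample first-order autocorrelation. The sketch below is only an illustration: the series values are hypothetical, and `lag1_autocorrelation` is a helper name introduced here, not something from the course materials.

```python
def lag1_autocorrelation(series):
    """Sample first-order autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total variation around the sample mean.
    denom = sum((v - mean) ** 2 for v in series)
    # Numerator: co-movement between each observation and its predecessor.
    numer = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    return numer / denom

# Hypothetical trending series (made-up numbers, for illustration only).
trending = [100, 102, 105, 107, 110, 114, 117, 121]
# The same values in a different order: the time ordering is destroyed.
scrambled = [110, 100, 121, 105, 117, 102, 114, 107]

r_trend = lag1_autocorrelation(trending)
r_scrambled = lag1_autocorrelation(scrambled)
print(round(r_trend, 3), round(r_scrambled, 3))
```

The two series contain identical values, yet the statistic differs between them, which is one way to see that the ordering of time series observations carries information.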
Types of Data – Pooled Cross Sections
➢ Two or more cross sections are combined in one data set.
➢ Cross sections are drawn independently of each other.
➢ Pooled cross sections are often used to evaluate policy
changes.
➢ Example: Evaluating the effect of a change in property taxes on
house prices:
--Random sample of house prices for the year 1993.
--A new random sample of house prices for the year 1995.
--Compare before and after (1993: before the reform, 1995: after the
reform).
This is NOT a panel data set!
Types of Data – Panel or Longitudinal Data
➢ The same cross-sectional units are followed over time.
➢ Panel data have a cross-sectional and a time series
dimension.
➢ Panel data can be used to account for time-invariant
unobservables.
➢ Panel data can be used to model lagged responses.
➢ Example: City crime statistics; each city is observed in two
years.
--Time-invariant unobserved city characteristics may be modeled.
--The effect of police on crime rates may exhibit a time lag.
This IS a panel data set!
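One common way to account for time-invariant unobservables in a two-period panel is first differencing. The sketch below assumes a hypothetical, noise-free data set in which crime equals a city-specific effect minus 2 times police. All numbers and names (`city_effect`, `ols_slope`) are made up for illustration.

```python
def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx

# Hypothetical unobserved, time-invariant city characteristics.
city_effect = {"A": 50.0, "B": 80.0, "C": 65.0}
# Hypothetical police levels, keyed by (city, year).
police = {("A", 1): 2.0, ("A", 2): 3.0,
          ("B", 1): 6.0, ("B", 2): 6.5,
          ("C", 1): 4.0, ("C", 2): 3.0}
# Assumed data-generating rule: crime = city effect - 2 * police.
crime = {key: city_effect[key[0]] - 2.0 * p for key, p in police.items()}

# Pooled regression that ignores the city effects.
keys = sorted(police)
pooled_slope = ols_slope([police[k] for k in keys], [crime[k] for k in keys])

# First differences within each city: the city effect cancels out.
d_police = [police[(c, 2)] - police[(c, 1)] for c in city_effect]
d_crime = [crime[(c, 2)] - crime[(c, 1)] for c in city_effect]
fd_slope = ols_slope(d_police, d_crime)

print(pooled_slope, fd_slope)
```

In this constructed example the pooled slope is positive, because cities with high unobserved crime levels also have more police, while the first-difference slope recovers the assumed effect of -2.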
Classical Linear Regression Model
Linear Regression Model
$y = \beta_0 + \beta_1 x + \varepsilon$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$
A Simple Example: Reed Auto Sales
Example: Reed Auto Sales
Sample?
Observation?
Dependent Variable?
Independent Variable?
Model?
Example: Reed Auto Sales
➢ Scatter Diagram
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4)]
Question: Do districts with smaller classes (lower student-teacher ratio, STR) have
higher test scores?
[Figure: scatter plot of Test score against STR]
Estimation Process
1. Regression Model: $y = \beta_0 + \beta_1 x + \varepsilon$
   Regression Equation: $E(y) = \beta_0 + \beta_1 x$
   Unknown Parameters: $\beta_0, \beta_1$
2. Sample Data: $(x_1, y_1), \ldots, (x_n, y_n)$
3. Estimated Regression Equation: $\hat{y} = b_0 + b_1 x$
   Sample Statistics: $b_0, b_1$
4. $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.
Matrix Form: $Y = X\beta + \varepsilon$
Assumptions of
the Classical Linear Regression Model
➢ A2. Full rank: There is no exact linear relationship
among any of the independent variables in the model.
➢ A3. Exogeneity of the independent variables:
$E[\varepsilon \mid X] = 0$
➢ A4. Homoscedasticity and nonautocorrelation:
$\mathrm{Var}[\varepsilon \mid X] = \sigma^2 I$
➢ A5. Data generation
➢ A6. Normal distribution: The disturbances are
normally distributed.
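Assumption A2 (full rank) can be checked numerically by comparing the rank of the regressor matrix with its number of columns. The sketch below uses a hypothetical design matrix in which two dummy columns add up to the constant column. The variable names and data are illustrative assumptions.

```python
import numpy as np

# Hypothetical regressors: a constant, a dummy, and the dummy's complement.
constant = np.ones(6)
female = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
male = 1.0 - female

# female + male equals the constant column, an exact linear relationship.
X_bad = np.column_stack([constant, female, male])
# Dropping one dummy removes the exact linear dependence.
X_ok = np.column_stack([constant, female])

rank_bad = np.linalg.matrix_rank(X_bad)
rank_ok = np.linalg.matrix_rank(X_ok)

print("X_bad: rank", rank_bad, "columns", X_bad.shape[1])
print("X_ok:  rank", rank_ok, "columns", X_ok.shape[1])
```

When the rank is smaller than the number of columns, A2 fails for that regressor matrix.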
3.2 Least Squares Regression
$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$
Deriving OLS Estimates: Least Squares
Estimating the Coefficients
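➢ Objective: $\min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$
➢ Take derivatives:
$n^{-1} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$
$n^{-1} \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$
➢ Solution: $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$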
Example: Reed Auto Sales
➢Regression equation
[Figure: scatter diagram of Cars Sold (vertical axis) against TV Ads (horizontal axis) with the fitted regression line $\hat{y} = 10 + 5x$]
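The slope formula can be applied by hand to a small data set. The slides do not list the Reed Auto observations, so the five data points below are an assumption. They are the values commonly quoted for this textbook example, and they reproduce the fitted line $\hat{y} = 10 + 5x$. The intercept uses $\bar{y} - \text{slope} \cdot \bar{x}$, which follows from the first of the two first-order conditions.

```python
# Assumed data for the Reed Auto example (not shown in the slides):
# number of TV ads (x) and number of cars sold (y) in five weeks.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

n = len(tv_ads)
x_bar = sum(tv_ads) / n
y_bar = sum(cars_sold) / n

# Slope: sum of cross-deviations divided by sum of squared x-deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_ads, cars_sold))
s_xx = sum((x - x_bar) ** 2 for x in tv_ads)
slope = s_xy / s_xx

# Intercept: the fitted line passes through the point of sample means.
intercept = y_bar - slope * x_bar

print(intercept, slope)  # 10.0 5.0
```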
Matrix Form
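Def: $b = \arg\min_{b_0} (Y - Xb_0)'(Y - Xb_0)$
$= \arg\min_{b_0} (Y'Y - b_0'X'Y - Y'Xb_0 + b_0'X'Xb_0)$, where $b_0'X'Y$ and $Y'Xb_0$ are scalars and therefore equal
$= \arg\min_{b_0} (Y'Y - 2Y'Xb_0 + b_0'X'Xb_0)$
FOC: $\dfrac{\partial Q}{\partial b_0} = -2X'Y + 2X'Xb_0 = 0$, where $Q$ denotes the objective function being minimized
$\Rightarrow b = (X'X)^{-1}X'Y$ ☆ LS estimator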
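The matrix formula can be evaluated with NumPy. The sketch below solves the normal equations $(X'X)b = X'Y$ directly instead of forming the inverse, a numerical choice made here and not taken from the slides. The data are the same assumed Reed Auto values used for illustration and are not part of the slides.

```python
import numpy as np

# Assumed illustrative data (not given in the slides).
x = np.array([1.0, 3.0, 2.0, 1.0, 3.0])
Y = np.array([14.0, 24.0, 18.0, 17.0, 27.0])

# Regressor matrix with a constant in the first column.
X = np.column_stack([np.ones_like(x), x])

# Solve (X'X) b = X'Y for b.
b = np.linalg.solve(X.T @ X, X.T @ Y)

# The first-order condition -2X'Y + 2X'Xb should be zero at the solution.
foc = -2 * X.T @ Y + 2 * X.T @ X @ b

print(b)
print(foc)
```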
Projection
$\hat{Y} = Xb$ (fitted value)
$= X(X'X)^{-1}X'Y$
$e = Y - \hat{Y} = Y - Xb$ (residual; note: $\varepsilon = Y - X\beta$ is the error)
$= Y - X(X'X)^{-1}X'Y$
$= [I - X(X'X)^{-1}X']Y$
Projection
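Define $P = X(X'X)^{-1}X'$ (projection matrix) and $M = I - X(X'X)^{-1}X' = I - P$ (residual maker).
$P$ and $M$ are symmetric and idempotent.
Note: if $X = (1\ 1\ \cdots\ 1)'$, then $M^0 = I - \dfrac{1}{n}\begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}_{n \times n}$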
Projection
① $PM = MP = 0$
② $PX = X$, $MX = 0$, $X'e = X'MY = 0$
③ $Y = PY + MY = Xb + e$
④ $Y'Y = Y'P'PY + Y'M'MY = \hat{Y}'\hat{Y} + e'e$
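The projection identities can be verified numerically. The sketch below builds $P$ and $M$ from simulated data, so the seed, dimensions, and variable names are arbitrary assumptions, and checks each property up to floating-point tolerance.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed; data are simulated
n, k = 8, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T  # projection matrix
M = np.eye(n) - P                     # residual maker
Y_hat = P @ Y                         # fitted values
e = M @ Y                             # residuals

checks = {
    "P symmetric": np.allclose(P, P.T),
    "P idempotent": np.allclose(P @ P, P),
    "M symmetric": np.allclose(M, M.T),
    "M idempotent": np.allclose(M @ M, M),
    "PM = 0": np.allclose(P @ M, 0.0),
    "PX = X": np.allclose(P @ X, X),
    "MX = 0": np.allclose(M @ X, 0.0),
    "X'e = 0": np.allclose(X.T @ e, 0.0),
    "Y'Y = Yhat'Yhat + e'e": np.isclose(Y @ Y, Y_hat @ Y_hat + e @ e),
}
for name, ok in checks.items():
    print(name, ok)
```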
3.3 Partitioned Regression and Partial Regression
Suppose that the regression involves two sets of variables, $X_1$
and $X_2$. Then $y = Xb + e = X_1 b_1 + X_2 b_2 + e$.
The normal equations are
$(X'X)b = X'y$
or:
$\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix}$
Then:
$X_1'X_1 b_1 + X_1'X_2 b_2 = X_1'y$ (1)
$X_2'X_1 b_1 + X_2'X_2 b_2 = X_2'y$ (2)
From (2):
$b_2 = (X_2'X_2)^{-1}X_2'(y - X_1 b_1)$
Similarly, $b_1 = (X_1'X_1)^{-1}X_1'(y - X_2 b_2)$ (3-18)
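The two partitioned expressions can be checked against the full least squares solution. The sketch below uses simulated data, so the seed, sample size, and column counts are arbitrary assumptions. It confirms that each block of coefficients satisfies its equation given the other block.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed; data are simulated
n = 50
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])  # first set of variables
X2 = rng.normal(size=(n, 2))                            # second set of variables
y = rng.normal(size=n)

# Full regression of y on [X1, X2].
X = np.column_stack([X1, X2])
b = np.linalg.solve(X.T @ X, X.T @ y)
b1, b2 = b[:2], b[2:]

# Equation (2) solved for b2, taking b1 as given.
b2_check = np.linalg.solve(X2.T @ X2, X2.T @ (y - X1 @ b1))
# Equation (3-18): b1, taking b2 as given.
b1_check = np.linalg.solve(X1.T @ X1, X1.T @ (y - X2 @ b2))

print(np.allclose(b2, b2_check), np.allclose(b1, b1_check))
```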
3.4 Partial Regression and Partial Correlation Coefficients
3.5 Goodness of Fit and the Analysis of Variance
$SST = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$, $SSR = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$, $SSE = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$
$SST = SSR + SSE$
Analysis of Variance
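$SST = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 = Y'M^0Y$
$= Y'M^0Xb + Y'M^0e$
$= (Xb + e)'M^0Xb + (Xb + e)'M^0e$
$= b'X'M^0Xb + e'M^0Xb + b'X'M^0e + e'M^0e$
$= b'X'M^0Xb + e'M^0e$ (the cross terms vanish because $M^0e = e$ when $X$ contains a constant, and $X'e = 0$)
$= \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$
$= SSR + SSE$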
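The decomposition can be confirmed numerically with the matrix $M^0 = I - \frac{1}{n}\iota\iota'$, where $\iota$ denotes a column of ones (notation assumed here). The sketch below uses simulated data with a constant term; the seed and coefficients are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed; data are simulated
n = 30
x = rng.normal(size=n)
Y = 1.0 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # includes a constant term

b = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ b
e = Y - Y_hat

# M0 turns a vector into deviations from its mean.
M0 = np.eye(n) - np.ones((n, n)) / n

sst = Y @ M0 @ Y            # total sum of squares
ssr = b @ X.T @ M0 @ X @ b  # regression sum of squares
sse = e @ e                 # error sum of squares

print(sst, ssr + sse)
```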
Goodness of Fit
$R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$
$R^2$ is bounded by zero and one only if:
(a) There is a constant term in X and
(b) The line is computed by linear least squares.
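Condition (a) can be illustrated by computing $1 - SSE/SST$ with and without a constant term. The data below are hypothetical and chosen so that a line through the origin fits worse than the sample mean. `r_squared` is a helper name introduced here.

```python
import numpy as np

def r_squared(X, y):
    """Compute 1 - SSE/SST for a least squares fit of y on the columns of X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - (e @ e) / sst

# Hypothetical data: y hovers around 10 and barely depends on x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.2, 9.9, 10.1, 9.8, 10.0])

r2_with_constant = r_squared(np.column_stack([np.ones(5), x]), y)
r2_without_constant = r_squared(x.reshape(-1, 1), y)

print(r2_with_constant, r2_without_constant)
```

With the constant, the value lies between zero and one. Without it, the value is negative for these data.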
3.6 Linearly Transformed Regression
➢ Define $Z = XP$ for a $K \times K$ nonsingular matrix $P$, a linear
transformation of the regressors. How does the transformation affect the results
of least squares?
➢ The transformation does affect the "estimates":
Based on X, $b = (X'X)^{-1}X'y$.
Based on Z, $c = (Z'Z)^{-1}Z'y = (P'X'XP)^{-1}P'X'y$
$= P^{-1}(X'X)^{-1}(P')^{-1}P'X'y = P^{-1}b$
➢ The transformation does not affect the fit of a model to a body of
data:
➢ The "fitted value" is $Zc = (XP)(P^{-1}b) = Xb$. The same!!
➢ Residuals from using Z are $y - Zc = y - Xb$ (we just proved this).
The same!!
➢ The sum of squared residuals must be identical, as $y - Xb = e = y - Zc$.
➢ $R^2$ must also be identical, as $R^2 = 1 - e'e/y'M^0y$ (!!).
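These invariance results can be checked on simulated data. In the sketch below, the matrix `P` is a hypothetical nonsingular transformation that rescales one regressor and mixes two columns. The seed and dimensions are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed; data are simulated
n, K = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = rng.normal(size=n)

# Hypothetical nonsingular K x K transformation (determinant is 100).
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 100.0, 0.0],
              [0.0, 1.0, 1.0]])
Z = X @ P

b = np.linalg.solve(X.T @ X, X.T @ y)  # coefficients based on X
c = np.linalg.solve(Z.T @ Z, Z.T @ y)  # coefficients based on Z

coefficients_match = np.allclose(c, np.linalg.solve(P, b))  # c = P^{-1} b
fitted_values_match = np.allclose(Z @ c, X @ b)             # Zc = Xb
residuals_match = np.allclose(y - Z @ c, y - X @ b)

print(coefficients_match, fitted_values_match, residuals_match)
```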