EC3303 Econometrics I
Department of Economics, NUS, Spring 2010
Lecturer: JongHoon Kim, AS2, 04-40 (contact via email, please)
Class: Thu, 14.00–16.00, LT11
Tutorials: Mon–Fri, 10.00–12.00, AS4, 01-17 (start on Week 3, Jan 25–29)
Office Hours: Mon, 13.00–15.00
Assessment: Final Exam 60% + Continuous Assessment 40%
(Problem sets 20% [2–3 sets] + Midterm test 20% [after the Midterm break])
Textbook:
Stock, J.H. and M.W. Watson (2006): Introduction to Econometrics, Second edition. Boston: Pearson Addison Wesley. (HB 139 Sto 2006, CL, HSSML)
Supplementary reading:
Wooldridge, J.M. (2005): Introductory Econometrics: A Modern Approach, Third edition.
Gujarati, Damodar N. (2003): Basic Econometrics, Fourth edition. NY: McGraw-Hill.
Any statistics textbook…
1. Overview: What is Econometrics? (SW Ch1)
Observed and stylized facts:
- Global warming / the Chinese economy in 10 years' time?
- Effectiveness of "caning" in the SG penal system?
- High time to buy a car or an HDB flat?
- A medical case in SG, 2008: an upsurge (150 or so over 5 months) of low blood pressure shock cases; 7 in coma, 4 deaths… why?
+ statistical tools/methods
Economics + Metric (Measure) = Econometrics
Definitive/quantitative questions with definitive/quantitative answers
Examples:
(a) Does class size affect test scores?
    test score_i ← class size_i, for student i
    A meaningful effect? How large? A pure (distinguishable) effect?
(b) How does the sales price of cigarettes affect consumption?
    cigarette consumption_i ← cigarette sales price_i
    The price elasticity? Other factors? Reverse "causality"?
(c) How much can Apple price-gouge SG customers on its 4G iPhones?
    iPhone sales_i ← iPhone retail price_i
Benefits of the Casinos/Universal Theme Park at Sentosa/Marina Bay? How many will survive EC3303 through to the Final Exam?
(d) Explaining the abrupt crime drop in the 1990s in the US (Levitt, Freakonomics…)
1. Innovative policing strategy
2. Increased reliance on prisons
3. Changes in crack and other drug markets
4. Aging of the population
5. Tough gun-control laws
6. Strong economy
7. Increased number of police
8. All other explanations (increased use of capital punishment, gun buybacks, etc.)
… Legalization of abortion: 1973, US Supreme Court ruling on Roe v. Wade
(d’) Seeking determinants of crime rates (Levitt (1996))
    crime rate_t ← incarceration rate_t, for year t
    Other factors? Reverse "causality"?
(e) Understanding global warming (the effect of CO2 emission)
    Vol NorPoleIce_t ←? CO2 emission_t
    Reversed "causality"? The true scale of the effect? … a "global warming hoax"?
(f) And many, many more interesting issues awaiting… "H1N1 Flu pandemic hoax (scam)", "Renewal of the contract hosting the F1 race in SG", …
Econometrics = Economics (theory) + Metric (data)
Sources of data: (controlled) experiment vs. observation
Typical economic dataset: individuals (persons, firms,…), localities (cities, states,…),…
- Cross-sectional data: multiple entities at a given point in time
- Time series data: a single entity over multiple periods in time
- Panel data (longitudinal data): multiple entities over multiple periods in time
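The three dataset shapes can be made concrete with toy data. The following is a minimal Python sketch (not from the slides; the firms, years, and sales figures are invented for illustration):

```python
# Hypothetical toy data illustrating the three dataset shapes.
# Each row is (entity, period, variable), e.g. (firm, year, sales).

# Cross-sectional: multiple entities at a single point in time.
cross_section = [("A", 2009, 120), ("B", 2009, 95), ("C", 2009, 310)]

# Time series: a single entity over multiple periods.
time_series = [("A", 2007, 100), ("A", 2008, 110), ("A", 2009, 120)]

# Panel (longitudinal): multiple entities over multiple periods.
panel = [("A", 2008, 110), ("A", 2009, 120), ("B", 2008, 90), ("B", 2009, 95)]

# A cross-section varies only across entities; a time series only across
# periods; a panel varies in both dimensions.
entities = {row[0] for row in panel}
periods = {row[1] for row in panel}
print(len(entities), len(periods))  # 2 entities observed in 2 periods each
```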
"Devils are in the detail(s)."
"Data is the least deceiving window toward truth."
(provided you know how to tease them without bungling)
"Why? … Why? Why?"
2. Review of Probabilities (SW Ch2): why do we care?
2.1 Probability Space
Probability space (sample space), Ω: the (imaginary) collection of the whole of the "outcomes" in life
Outcome, ω: a specific happening (realization)
Event, E: a collection of certain outcomes, i.e., a subset of Ω (the basic unit to assign probability!)
Examples: i) the event E of tossing a coin to "head"; ii) the event E of finishing today's lecture at 3.35pm sharp; iii) the event E of the STI index "gaining" tomorrow
Events are the subsets resulting from introducing a division ("partition") of Ω. Here, for example, E and E^c.
More than two events occur when we cut finer partitions:
(i) a single partition with multiple cuts: disjoint E_1, E_2, E_3, … (or F_1, F_2, …, F_n) covering Ω
An example: rolling a die into {1,2}, {3,4}, {5,6}
Another example: … ?
Probability: a relative measure of the likelihoods of events, satisfying
(a) 0 ≤ P(E) ≤ 1 for any E ⊆ Ω,
(b) P(Ω) = 1, and
(c) P(E_1 ∪ E_2 ∪ …) = P(E_1) + P(E_2) + … for disjoint E_1, E_2, …,
    i.e., P(∪_{i=1}^∞ E_i) = ∑_{i=1}^∞ P(E_i)
Useful consequences: E ⊆ F ⟹ P(E) ≤ P(F); P(E^c) = 1 – P(E); P(E ∪ F) = P(E) + P(F) – P(E∩F), …
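These axioms and consequences can be checked numerically. A minimal Python sketch (not from the slides), using the fair die with partition {1,2}, {3,4}, {5,6} as the toy Ω:

```python
from fractions import Fraction

# Fair die: Ω = {1,...,6}, each outcome with probability 1/6.
omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))  # P(E) = |E| / |Ω|

E1, E2, E3 = {1, 2}, {3, 4}, {5, 6}  # the partition {1,2}, {3,4}, {5,6}

# (c) additivity over disjoint events, and (b) P(Ω) = 1:
assert P(E1 | E2 | E3) == P(E1) + P(E2) + P(E3) == 1

# complement rule: P(E^c) = 1 - P(E)
assert P(omega - E1) == 1 - P(E1)

# inclusion-exclusion: P(E ∪ F) = P(E) + P(F) - P(E ∩ F)
F = {1, 2, 3}
assert P(E1 | F) == P(E1) + P(F) - P(E1 & F)
print("axioms hold on the die example")
```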
(ii) multiple (overlapping) partitions (each with multiple cuts)
From the joint probabilities P(E_1∩F_1), P(E_1∩F_2), P(E_2∩F_1), P(E_2∩F_2),
we can recover the marginals P(E_1) and P(E_2) (or likewise, P(F_1), P(F_2)).
Conditional probability:
P(E_1|F_1) = P(E_1∩F_1) / P(F_1)
(likewise, P(E_2|F_1), P(E_1|F_2), P(E_2|F_2))
Statistical independence between the two partitions: "how likely E_1 is to happen is oblivious of F_1"
P(E_1) = P(E_1|F_1) (⟺ P(E_1∩F_1) = P(E_1)P(F_1)),
and likewise P(E_2) = P(E_2|F_1), P(E_1) = P(E_1|F_2), P(E_2) = P(E_2|F_2)
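Conditioning and the independence check can be illustrated on the fair die. A minimal Python sketch (not from the slides; the "low/high" and "even/odd" partitions are my own toy choices):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event), len(omega))

def cond(E, F):
    """P(E|F) = P(E ∩ F) / P(F)."""
    return P(E & F) / P(F)

low3 = {1, 2, 3}   # from the partition {low, high}
even = {2, 4, 6}   # from the partition {even, odd}

# P(low3|even) = (1/6)/(1/2) = 1/3 differs from P(low3) = 1/2:
# these two partitions are NOT independent.
assert cond(low3, even) != P(low3)

# But {1,2} vs {even, odd} IS independent:
# P({1,2}|even) = 1/3 = P({1,2}), equivalently P({1,2} ∩ even) = P({1,2})P(even).
low2 = {1, 2}
assert cond(low2, even) == P(low2)
assert P(low2 & even) == P(low2) * P(even)
```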
In general, with much finer partitions…?
How useful is it?
An example (from a German biostatistics text):
A recently found contagious (and deadly) disease!
You were tested positive (and diagnosed as such). Am I really infected? The probability?
The test is known to detect:
- 99 out of 100 true infected cases: P(E_1|F_1) = 0.99
- 98 out of 100 true uninfected cases: P(E_2|F_2) = 0.98
- There is a 1/1000 chance of getting infected: P(F_1) = 0.001
P(F_1|E_1) = ?
If in a population of 100,000:

                   E_1 (positive)   E_2 (negative)   total
F_1 infected            99                1             100
F_2 not infected      1,998           97,902          99,900

P(F_1|E_1) = P(E_1∩F_1) / P(E_1) = 99 / 2,097 ≈ 0.047
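The same answer follows from Bayes' rule without the population table. A minimal Python sketch (not from the slides) using the slide's three numbers:

```python
from fractions import Fraction

# The slide's numbers:
p_pos_given_inf = Fraction(99, 100)   # P(E1|F1): test detects 99/100 infected
p_neg_given_not = Fraction(98, 100)   # P(E2|F2): test clears 98/100 uninfected
p_inf = Fraction(1, 1000)             # P(F1): chance of being infected

# Bayes' rule:
# P(F1|E1) = P(E1|F1)P(F1) / [ P(E1|F1)P(F1) + P(E1|F2)P(F2) ]
p_pos = p_pos_given_inf * p_inf + (1 - p_neg_given_not) * (1 - p_inf)
p_inf_given_pos = p_pos_given_inf * p_inf / p_pos

print(p_inf_given_pos)         # 99/2097, in lowest terms 11/233
print(float(p_inf_given_pos))  # ≈ 0.047
```

Despite the accurate test, a positive result means only a 4.7% chance of infection, because the disease is so rare that false positives swamp true ones.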
2.2 Random Variables and Probability Distributions
A "random variable", Y: "a numerical summary of a random outcome" (SW)
i.e., a collection of (possibly infinitely many) numbers, of which Y takes on ("realizes to") one when a certain event happens.
A "partition" of Ω induces a random variable:
- with a two-cut partition: Y takes one value if E happens, another if E^c happens
- with a finer partition F_1, …, F_n: Y = y_1 with F_1, …, Y = y_n with F_n
Examples: rolling a die (n = 6); the daily SGD vs. USD rate (n = ∞)
The "probability distribution of Y", P_Y:
the list(ing) of all probabilities attached to all possible outcomes of Y
(≈ the whole of the probabilities of the events induced by Y)
(≈ the full knowledge of Pr{a ≤ Y ≤ b} for any a, b)
An example: Y ~ Bernoulli(p) with p = P(E):
Pr{Y = 1} = p = P(E), Pr{Y = 0} = 1 – p = P(E^c)
Discrete random variable: finitely (or countably) many values ("events") (a discrete dist'n of a r.v.)
Continuous random variable: uncountably infinitely many values ("events") (a continuous dist'n of a r.v.)
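The Bernoulli(p) example can be simulated. A minimal Python sketch (not from the slides; p = 0.3 and the seed are arbitrary choices):

```python
import random

def bernoulli(p, rng):
    """One draw of Y ~ Bernoulli(p): Y = 1 if the event E happens (prob p), else 0."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)  # fixed seed for reproducibility
p = 0.3
draws = [bernoulli(p, rng) for _ in range(100_000)]

# The empirical frequency of {Y = 1} should be close to p = P(E).
freq = sum(draws) / len(draws)
print(round(freq, 3))
```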
Expressing/describing a prob. distribution (of a r.v. Y):
(i) tabulation: feasible only in finite cases!
(ii) p.m.f. (probability mass function) / p.d.f. (probability density function): "pointwise probability (expression)"
    p.m.f.: for each possible value x of Y, f_Y(x) = Pr{Y = x}; only for discrete Y!
    p.d.f.: the "continuous version of pointwise probability"; for each possible value x of Y, f_Y(x) (≠ Pr{Y = x})
    The height of the pdf ≠ prob. Why? Rather, Pr{a ≤ Y ≤ b} = ∫_a^b f_Y(x)dx
    (e.g., the pdf of N(μ,1))
(iii) c.d.f. (cumulative distribution function): "range-wise probability (expression)"
    For each possible value x of Y: F_Y(x) = Pr{Y ≤ x}
    = ∑_{y ≤ x} f_Y(y) (= f_Y(x) + f_Y(x – 1) + …, for integer-valued Y) in the discrete case
    = ∫_{–∞}^{x} f_Y(y)dy in the continuous case
    (e.g., the cdf of Bernoulli(p): 0 for x < 0, 1 – p for 0 ≤ x < 1, and 1 for x ≥ 1)
    Less intuitive than the pmf/pdf, but more convenient because it is always well-defined.
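Building the cdf from a pmf makes the "range-wise" idea concrete. A minimal Python sketch (not from the slides), again with the fair die:

```python
from fractions import Fraction

# Fair-die pmf: f_Y(y) = 1/6 for y = 1,...,6.
pmf = {y: Fraction(1, 6) for y in range(1, 7)}

def cdf(x):
    """F_Y(x) = Pr{Y <= x} = sum of f_Y(y) over y <= x."""
    return sum((p for y, p in pmf.items() if y <= x), Fraction(0))

assert cdf(0) == 0               # below the support: well-defined, equals 0
assert cdf(3) == Fraction(1, 2)  # Pr{Y <= 3}
assert cdf(6) == 1               # F_Y always climbs to 1

# Range-wise probability from the cdf: Pr{a < Y <= b} = F_Y(b) - F_Y(a)
assert cdf(4) - cdf(2) == Fraction(1, 3)  # Pr{Y in {3, 4}}
```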
Expectations/Moments (of a r.v. Y)
Often, we focus only on certain characteristics of the dist'n P_Y, e.g.,
"the middle value of all Y outcomes", "the most likely value of Y",
"how scattered the range of all probable Y values is",…

μ_Y, the mean of Y (= the expected value of Y):
a measure of the center(ing) (counting in "prob") of the dist'n P_Y
μ_Y = EY = y_1 f_Y(y_1) + y_2 f_Y(y_2) + … + y_k f_Y(y_k) (discrete Y with k outcomes; prob.'s as proper weights)
    = ∑_y y f_Y(y) (≈ ∫_{–∞}^{∞} y f_Y(y)dy, continuous version)

σ²_Y, the variance of Y (= the expected value of the "squared deviations of Y from μ_Y"):
a measure of the dispersion (counting in "prob") of the dist'n P_Y
σ²_Y = Var(Y) = (y_1 – μ_Y)² f_Y(y_1) + … + (y_k – μ_Y)² f_Y(y_k) (discrete Y with k outcomes)
    = ∑_y (y – μ_Y)² f_Y(y) (≈ ∫_{–∞}^{∞} (y – μ_Y)² f_Y(y)dy, continuous version)
    = E(Y – μ_Y)²

σ_Y = √Var(Y), the standard deviation of Y
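The weighted-sum formulas for μ_Y and σ²_Y can be computed directly. A minimal Python sketch (not from the slides), for the fair die:

```python
from fractions import Fraction

# Fair-die distribution: f_Y(y) = 1/6 on {1,...,6}.
pmf = {y: Fraction(1, 6) for y in range(1, 7)}

# mu_Y = sum_y y f_Y(y): probabilities as weights.
mu = sum(y * p for y, p in pmf.items())

# sigma^2_Y = sum_y (y - mu_Y)^2 f_Y(y) = E(Y - mu_Y)^2.
var = sum((y - mu) ** 2 * p for y, p in pmf.items())

print(mu)   # 7/2
print(var)  # 35/12
```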
Recall! (Important properties of μ_Y and σ²_Y): given a r.v. Y with μ_Y and σ²_Y,
if X = aY + b for any nonrandom numbers a, b,
(a) EX = E(aY + b) = aEY + b = aμ_Y + b
(b) Var(X) = Var(aY + b) = a² Var(Y) = a²σ²_Y

Other useful (higher) moments (of a r.v. Y):
the skewness of Y, E(Y – μ_Y)³ / σ³: a measure of the asymmetry (inclination) of the dist'n P_Y
the kurtosis of Y, E(Y – μ_Y)⁴ / σ⁴: a measure of the tail thickness of the dist'n P_Y
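Properties (a) and (b), and the higher moments, can be verified on the die example. A minimal Python sketch (not from the slides; a = 2, b = 5 are arbitrary nonrandom numbers):

```python
import math
from fractions import Fraction

# Y ~ fair die; X = aY + b with nonrandom a, b.
pmf = {y: Fraction(1, 6) for y in range(1, 7)}
mu = sum(y * p for y, p in pmf.items())
var = sum((y - mu) ** 2 * p for y, p in pmf.items())

a, b = Fraction(2), Fraction(5)
pmf_X = {a * y + b: p for y, p in pmf.items()}  # distribution of X = 2Y + 5
mu_X = sum(x * p for x, p in pmf_X.items())
var_X = sum((x - mu_X) ** 2 * p for x, p in pmf_X.items())

assert mu_X == a * mu + b      # (a) E(aY + b) = a EY + b
assert var_X == a ** 2 * var   # (b) Var(aY + b) = a^2 Var(Y)

# Higher moments of Y: skewness E(Y - mu)^3 / sigma^3, kurtosis E(Y - mu)^4 / sigma^4.
sd = math.sqrt(float(var))
skew = sum(float((y - mu) ** 3 * p) for y, p in pmf.items()) / sd ** 3
kurt = sum(float((y - mu) ** 4 * p) for y, p in pmf.items()) / sd ** 4
print(round(skew, 3), round(kurt, 3))  # the die is symmetric, so skewness = 0
```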
2.3 Multiple Random Variables ("More than one r.v.?")
Remember! A "random variable", Y: a "partition" of Ω induces a random variable
- with a two-cut partition: Y takes one value if E happens, another if E^c happens
- with a finer partition F_1, …, F_n: Y = y_1 with F_1, …, Y = y_n with F_n
Examples: rolling a die (n = 6); the daily SGD vs. USD rate (n = ∞)