
EC3303 Econometrics I

Department of Economics, NUS. Spring 2010. JongHoon Kim

Lecturer: JongHoon KIM, AS2, 04-40
Class: Thu, 14.00 – 16.00, LT11

Contact via email, please…

Starts in Week 3 (Jan 25–29)

Tutorials: Mon – Fri, 10.00 – 12.00, AS4, 01-17
Office Hours: Mon, 13.00 – 15.00

Assessment: Final Exam 60% + Continuous Assessment 40%
(Continuous Assessment: Problem sets 20% + Midterm test 20%; midterm test after the Midterm break)

Textbook:

Stock, J.H. and M.W. Watson (2006): Introduction to Econometrics, Second edition. Boston: Pearson Addison Wesley. (HB 139 Sto 2006, CL, HSSML)

Supplementary reading:

Wooldridge, J.M. (2005): Introductory Econometrics: A Modern Approach, Third edition.
Gujarati, Damodar N. (2003): Basic Econometrics, Fourth edition. NY: McGraw-Hill.
Any statistics textbook…

Chapter I. Introduction

1. Overview - What is Econometrics? (S-W Ch1)

Reasoning and conjecture

    Global warming / the Chinese economy in 10 years' time?
    Effectiveness of "caning" in the SG penal system?
    High time to buy a car or an HDB flat?

Observed and stylized facts

    A medical case in SG, 2008: an upsurge (150 or so over 5 months) of low-blood-pressure shock cases; 7 in coma, 4 deaths…?

data + statistical tools/methods
Theory (Model)

Economics + Metric(Measure) = Econometrics


Definitive/quantitative questions with definitive/quantitative answers

Examples:

(a) Effect of reducing class size on elementary school education

    test score_i vs. class size_i, for each student i

    meaningful effect? How large?
    pure (distinguishable) effect?

(b) Effect of cigarette taxes on reducing smoking

    cigarette consumption_i vs. cigarette sales price_i

    price elasticity?
    other factors?
    reverse "causality"?

    How much can Apple price-gouge SG customers on its 4G iPhones?

    i-phone sales_i vs. i-phone retail price_i

(c) Forecasting future inflation rates – SG's inflation rate in 2010?

    Benefits of the Casinos/Universal Theme Park at Sentosa/Marina Bay?
    How many will survive EC3303 through to the Final Exam?

(d) Explaining the abrupt crime drop in the 1990s in the US (Levitt, Freakonomics…)

    1. Innovative policing strategy
    2. Increased reliance on prisons
    3. Changes in crack and other drug markets
    4. Aging of the population
    5. Tough gun-control laws
    6. Strong economy
    7. Increased number of police
    8. All other explanations (increased use of capital punishment, gun buybacks, etc.)

    Legalization of abortion – the 1973 US Supreme Court ruling in Roe v. Wade

(d') Seeking determinants of crime rates (Levitt (1996))

    crime rate_t vs. incarceration rate_t, for each year t

    other factors?
    reverse "causality"?

(e) Understanding global warming (the effect of CO2 emissions)

    Vol NorPoleIce_t vs. CO2 emission_t

    reversed "causality"?
    true scale of the effect?
    … "global warming hoax"?

(f) And many, many more interesting issues awaiting… "H1N1 Flu pandemic hoax (scam)", "Renewal of the contract hosting the F1 race in SG", …

Econometrics = Economics + Metric (theory + data)

Sources of Data: (controlled) experiment; observation

Typical Economic Dataset: individuals (person, firm,…), localities (city, states,…),…

    cross-sectional data – multiple entities at a given point in time
    time series data – a single entity over multiple periods in time
    panel data (longitudinal data) – multiple entities over multiple periods in time

“Devils are in the detail(s).”

“Data is the least deceiving window toward truth.”

(provided you know how to tease them without bungling)

“Why? …Why? Why?”

...

2. Review of Probabilities (S-W Ch2) why do we care?

2.1 Probability Space

Probability Space (Sample Space), Ω: the (imaginary) collection of the whole "outcomes" in life
Outcome, ω: a specific happening (realization)

Event, E: a collection of certain outcomes, i.e., a subset of Ω (the basic unit to assign probability!)

Examples:
    i) the event E of tossing a coin to "head"
    ii) the event E of finishing today's lecture at 3.35pm sharp
    iii) the event E of the STI index "gaining" tomorrow

Events are the subsets resulting from introducing a division ("partition") of Ω. Here, for example, E and E^c.

More than two events occur when…

(i) a partition w/ multiple cuts (into mutually exclusive, "disjoint" events), e.g., E_1, E_2, E_3, … or F_1, F_2, F_3, …, F_n of Ω

    An example: rolling a die into {1,2}, {3,4}, {5,6}

Probability: A relative measure of the likelihoods of events, satisfying

    (a) 0 ≤ P(E) ≤ 1 for any E ⊆ Ω,
    (b) P(Ω) = 1, and
    (c) P(E_1 ∪ E_2 ∪ …) = P(E_1) + P(E_2) + …, i.e., P(∪_{i=1}^∞ E_i) = ∑_{i=1}^∞ P(E_i), for disjoint E_1, E_2, …

Useful consequences: if E ⊆ F then P(E) ≤ P(F); P(E^c) = 1 – P(E); P(E ∪ F) = P(E) + P(F) – P(E ∩ F), …
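These rules can be checked concretely on a small sample space. A minimal sketch using a fair die (the events E and F below are illustrative choices, not from the slides):

```python
from fractions import Fraction

# Fair die: sample space Ω = {1,...,6}, each outcome equally likely.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event (a subset of Ω) under the uniform die."""
    return Fraction(len(event & omega), len(omega))

E = {1, 2}        # the event "rolling 1 or 2"
F = {2, 4, 6}     # the event "rolling an even number"

# Axioms: (a) 0 <= P(E) <= 1, (b) P(Ω) = 1,
# (c) additivity over the disjoint partition {1,2}, {3,4}, {5,6}
assert 0 <= P(E) <= 1
assert P(omega) == 1
assert P({1, 2}) + P({3, 4}) + P({5, 6}) == 1

# Consequences: complement rule and inclusion-exclusion
assert P(omega - E) == 1 - P(E)
assert P(E | F) == P(E) + P(F) - P(E & F)
print(P(E | F))   # 2/3
```

Using exact `Fraction` arithmetic avoids any floating-point slack in the equalities.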

(ii) multiple (overlapping) partitions (each w/ multiple cuts), e.g., E_1, E_2 and F_1, F_2 of Ω

From these…

Joint probability: P(E_1 ∩ F_1), P(E_1 ∩ F_2), P(E_2 ∩ F_1), P(E_2 ∩ F_2)

Marginal probability: P(E_1) and P(E_2) (or likewise, P(F_1), P(F_2))

Conditional probability: P(E_1 | F_1) = P(E_1 ∩ F_1) / P(F_1)
(likewise, P(E_2 | F_1), P(E_1 | F_2), P(E_2 | F_2))

Statistical Independence between E_1 and F_1 – "how likely E_1 is to happen is oblivious of F_1":

    P(E_1) = P(E_1 | F_1)   (equivalently, P(E_1 ∩ F_1) = P(E_1)P(F_1))

Statistical Independence between the two partitions:

    P(E_2) = P(E_2 | F_1), P(E_1) = P(E_1 | F_2), P(E_2) = P(E_2 | F_2)

In general, with much finer partitions…?
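The conditional-probability formula and the independence criterion can be illustrated on the same die sample space (again, the particular events are illustrative choices):

```python
from fractions import Fraction

# Fair die: each of the six faces equally likely.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event & omega), len(omega))

def cond(A, B):
    """Conditional probability P(A | B) = P(A ∩ B) / P(B)."""
    return P(A & B) / P(B)

E1 = {2, 4, 6}      # "even"
F1 = {1, 2, 3, 4}   # "at most 4"

print(cond(E1, F1))   # 1/2: within F1, exactly {2, 4} are even

# Independence: P(E1) = P(E1 | F1), equivalently P(E1 ∩ F1) = P(E1)P(F1)
print(P(E1) == cond(E1, F1))        # True: "even" is oblivious of "at most 4"
print(P(E1 & F1) == P(E1) * P(F1))  # True: the equivalent product form
```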

How useful is it?

An example (from a German biostatistics text):

A recently found contagious (and deadly) disease! You were tested positive (and diagnosed as such). Am I really infected? The prob.?

Let F_1 = infected, F_2 = not infected; E_1 = tested positive, E_2 = tested negative.

The test is known to detect
    99 out of 100 true infected cases:   P(E_1 | F_1) = 0.99
    98 out of 100 true uninfected cases: P(E_2 | F_2) = 0.98
There is a 1/1000 chance of getting infected: P(F_1) = 0.001.

P(F_1 | E_1)?

If in a population of 100,000…

                       E_1 (positive)   E_2 (negative)    total
    F_1 (infected)             99                1          100
    F_2 (not infected)      1,998           97,902       99,900

P(F_1 | E_1) = P(E_1 ∩ F_1) / P(E_1) = 99 / 2,097 ≈ 0.047
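The 0.047 above is exactly what Bayes' rule gives; a minimal sketch reproducing the slide's numbers:

```python
# Known test characteristics and prior, taken from the slide
p_pos_given_inf = 0.99    # P(E1 | F1): detects 99 of 100 infected
p_neg_given_not = 0.98    # P(E2 | F2): clears 98 of 100 uninfected
p_inf = 0.001             # P(F1): 1/1000 chance of being infected

# Law of total probability: P(E1) = P(E1|F1)P(F1) + P(E1|F2)P(F2)
p_pos = p_pos_given_inf * p_inf + (1 - p_neg_given_not) * (1 - p_inf)

# Bayes' rule: P(F1 | E1) = P(E1 | F1) P(F1) / P(E1)
p_inf_given_pos = p_pos_given_inf * p_inf / p_pos

print(round(p_inf_given_pos, 3))   # 0.047, matching 99/2,097 from the table
```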

2.2 Random Variables and Probability Distributions

A "random variable", Y: "a numerical summary of a random outcome" (S-W)

    a collection of (possibly infinitely many) numbers, one of which Y takes on ("realizes to") when a certain event happens

A "partition" ↦ a random variable:

    The partition {E, E^c} of Ω ↦ Y = 1 if E happens, 0 if E^c happens;
    then Y ~ Bernoulli(p), where p = P(E).

    The partition {F_1, F_2, F_3, …, F_n} of Ω ↦ Y = y_1 with F_1, …, y_n with F_n.

    Rolling a die: n = 6.   Daily SGD vs. USD: n = …

The "probability distribution of Y," P_Y:

    the list(ing) of all probabilities attached to all possible outcomes of Y
    (≈ the whole of all probabilities of the events induced by Y)
    (≈ the full knowledge of Pr{a ≤ Y ≤ b} for any a, b)

An example: Y ~ Bernoulli(p) with p = P(E):

    Y        0            1
    prob.    1 – p        p
             = P(E^c)     = P(E)

Discrete random variable – finite (or countably many) values ("events") → discrete dist'n of a r.v.
Continuous random variable – uncountably infinitely many values ("events") → continuous dist'n of a r.v.
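A discrete r.v. like Bernoulli(p) is easy to simulate: draw a uniform number and check whether the underlying event "happens". A sketch (p = 0.3 and the sample size are arbitrary choices):

```python
import random

def bernoulli(p, rng):
    """One draw of Y ~ Bernoulli(p): 1 if the event E happens, else 0."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)   # fixed seed, so the run is reproducible
p = 0.3
draws = [bernoulli(p, rng) for _ in range(100_000)]

# The empirical frequency of {Y = 1} should be close to P(E) = p
freq = sum(draws) / len(draws)
print(freq)   # ≈ 0.3
```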
(Continuous dist n of a r.v.)
Expressing/Describing a prob. distribution (of a r.v. Y):

(i) tabulation – feasible only in finite cases!

(ii) p.m.f. (probability mass function) – "pointwise probability (expression)"

    For each possible value x of Y: f_Y(x) = Pr{Y = x}   – only for discrete Y!
    (e.g., the pmf of Bernoulli(p): f_Y(0) = 1 – p, f_Y(1) = p)

    p.d.f. (probability density function) – "continuous version of pointwise probability"

    For each possible value x of Y: f_Y(x) (≠ Pr{Y = x})
    The height of the pdf ≠ prob. – why?
    Rather, Pr{a ≤ Y ≤ b} = ∫_a^b f_Y(x)dx
    (e.g., the pdf of N(μ,1): a bell curve centered at μ)

(iii) c.d.f. (cumulative distribution function) – "range-wise probability (expression)"

    For each possible value x of Y:

    F_Y(x) = Pr{Y ≤ x}
           = ∑_{y ≤ x} f_Y(y)  (= f_Y(x) + f_Y(x – 1) + …)   – discrete Y case
           = ∫_{–∞}^x f_Y(y)dy                               – continuous Y case

    (e.g., the cdf of Bernoulli(p): a step function equal to 0 for x < 0, 1 – p for 0 ≤ x < 1, and 1 for x ≥ 1)

    less intuitive than the pmf/pdf, but more convenient b/c always well-defined
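For a discrete Y the cdf is just a running sum of the pmf, and range-wise probabilities come out as differences of the cdf. A sketch for a fair die:

```python
from fractions import Fraction

# pmf of a fair die: f_Y(y) = 1/6 for y = 1,...,6
pmf = {y: Fraction(1, 6) for y in range(1, 7)}

def cdf(x):
    """F_Y(x) = Pr{Y <= x} = sum of f_Y(y) over all y <= x."""
    return sum((p for y, p in pmf.items() if y <= x), Fraction(0))

print(cdf(0))   # 0: below the support
print(cdf(3))   # 1/2
print(cdf(6))   # 1: the whole support

# Pr{a <= Y <= b} = F_Y(b) - F_Y(a - 1) for an integer-valued Y
a, b = 2, 5
print(cdf(b) - cdf(a - 1))   # 2/3
```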
Expectations/Moments (of a r.v. Y)

Often, we focus only on certain characteristics of the dist'n P_Y, e.g., "the middle value of all Y outcomes", "the most likely value of Y", "how scattered the range of all Y's probable values is",…

μ_Y – the mean of Y (= the expected value of Y)

    A measure of the center(ing) (counting in "prob") of the dist'n P_Y

    μ_Y = EY = y_1 f_Y(y_1) + y_2 f_Y(y_2) + … + y_k f_Y(y_k)   – discrete Y with k outcomes (prob.'s as proper weights)
             = ∑_y y f_Y(y)   ( ∫_{–∞}^∞ y f_Y(y)dy – continuous version)

    (e.g., for Y ~ Bernoulli(p): μ_Y = 0·(1 – p) + 1·p = p)

σ²_Y – the variance of Y (= the expected value of the squared deviations of Y from μ_Y)

    A measure of the dispersion (counting in "prob") of the dist'n P_Y

    σ²_Y = Var(Y) = (y_1 – μ_Y)² f_Y(y_1) + … + (y_k – μ_Y)² f_Y(y_k)   – discrete Y with k outcomes
                  = ∑_y (y – μ_Y)² f_Y(y)   ( ∫_{–∞}^∞ (y – μ_Y)² f_Y(y)dy – continuous version)
                  = E(Y – μ_Y)²

σ_Y = √Var(Y) – the standard deviation of Y
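The discrete formulas above translate line by line into code. A sketch computing μ_Y, σ²_Y and σ_Y exactly for a fair die:

```python
from fractions import Fraction
from math import sqrt

# Fair die: outcomes 1..6, each with probability 1/6
pmf = {y: Fraction(1, 6) for y in range(1, 7)}

# mu_Y = EY = sum of y * f_Y(y), probabilities as weights
mu = sum(y * p for y, p in pmf.items())

# sigma^2_Y = Var(Y) = E(Y - mu)^2 = sum of (y - mu)^2 * f_Y(y)
var = sum((y - mu) ** 2 * p for y, p in pmf.items())

# sigma_Y = sqrt(Var(Y)), the standard deviation
sd = sqrt(var)

print(mu)    # 7/2
print(var)   # 35/12
print(sd)    # ≈ 1.708
```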

Recall! (Important properties of μ_Y and σ²_Y): Given a r.v. Y with μ_Y and σ²_Y,

if X = aY + b for any non-random numbers a, b,

    (a) EX = E(aY + b) = aEY + b = aμ_Y + b
    (b) Var(X) = Var(aY + b) = a² Var(Y) = a²σ²_Y

Other useful (higher) moments (of a r.v. Y):

    the skewness of Y = E(Y – μ_Y)³ / σ³_Y – a measure of the asymmetry (inclination) of the dist'n P_Y

    the kurtosis of Y = E(Y – μ_Y)⁴ / σ⁴_Y – a measure of the tail thickness of the dist'n P_Y
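Both the linear-transformation rules and the higher moments can be checked exactly for a Bernoulli(p) (p = 0.2 and the constants a, b are arbitrary illustrative choices):

```python
# Bernoulli(p) with p = 0.2, an arbitrary illustrative choice
p = 0.2
pmf = {0: 1 - p, 1: p}

def E(g):
    """Expectation of g(Y): sum of g(y) * f_Y(y)."""
    return sum(g(y) * w for y, w in pmf.items())

mu = E(lambda y: y)                 # = p
var = E(lambda y: (y - mu) ** 2)    # = p(1 - p)

# Property (a): E(aY + b) = a*mu + b;  (b): Var(aY + b) = a^2 * var
a, b = 3.0, 7.0
assert abs(E(lambda y: a * y + b) - (a * mu + b)) < 1e-12
assert abs(E(lambda y: (a * y + b - (a * mu + b)) ** 2) - a ** 2 * var) < 1e-12

# Skewness (asymmetry) and kurtosis (tail thickness): standardized moments
skew = E(lambda y: (y - mu) ** 3) / var ** 1.5
kurt = E(lambda y: (y - mu) ** 4) / var ** 2

print(skew)   # ≈ 1.5: mass piled at 0 and a long right tail, so positive skew
print(kurt)   # ≈ 3.25
```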

2.3 Multiple Random Variables ("More than one r.v.?")

Remember! A "random variable", Y – a "partition" ↦ a random variable:

    The partition {E, E^c} of Ω ↦ Y = 1 if E happens, 0 if E^c happens;
    then Y ~ Bernoulli(p), where p = P(E).

    The partition {F_1, F_2, F_3, …, F_n} of Ω ↦ Y = y_1 with F_1, …, y_n with F_n.

    Rolling a die: n = 6.   Daily SGD vs. USD: n = …