
MFIN6201 Lecture 1

An Overview of Econometrics

Raphael Park
May 30, 2021

1
Who’s talking?

• Raphael Park, jonghyeon.park@unsw.edu.au
• Research areas include corporate finance, nonprofits, and fund
management.
• Consultation by appointment. Please email me if you would like to arrange
a consultation.
• I am more flexible with appointment times before the final exams.

2
Course Admins

• Questions about the course content should be posted on the Moodle Discussion
Forum; I will not reply to emails about course content.
• Questions about course admin should be posted on the Moodle "Questions
about Course Admin" forum. Email me only if the matter is personal (such as
special consideration).
• Group formation is also done via Moodle; you have to select a group
before Week 3. A group must have 5-8 people, no fewer and no more.
• You will need UNSW myAccess to complete this course, since we use
non-free software called Stata (more on this later).
• Lecture recordings will be provided after each lecture.
• The textbook is Stock and Watson (SW), but it is for reference only. Lecture
slides plus homework will be sufficient for you to succeed in the exams.

3
Course Schedule

• Week 2 - Programming in Stata


• Week 3 - Statistics, Probability theory, Mathematics and intro to OLS
• Week 4 - OLS regression I
• Week 5 - OLS regression II
• Week 6 - Flexibility week - we will have a discussion lecture
• Week 7 - IV regressions
• Week 8 - Panel Data and Fixed Effect Regressions
• Week 9 - Experiments
• Week 10 - Machine Learning

4
Assessment

The course outline is a bit outdated due to the virus situation; please refer
to the slides and materials on Moodle.

• Participation (10%) - based on finishing the homework each week. There will
be 8 homework assignments in total, each worth 1.25%. Each homework should be
submitted via the link under the relevant week's tab.
• Written assignment (10%) - due in Week 4.
• Group project (20%) - written group report due in Week 10.
• Peer evaluations (10%) - the average of how your teammates grade you.
• Final exam (50%) - held during the university exam period.

5
Overview of today’s lecture

Reading: SW Chapters 1, 2, and 3. Topics:

• The probability framework for statistical inference


• Estimation
• Hypothesis testing
• Confidence interval

6
What is Econometrics?

The science and art of using economic theory and statistical
techniques to analyze economic data.

• Statistics + economics
• Standard assumptions in statistics
• Nature of economic data
• Financial econometrics = econometrics + finance
• Use econometric techniques to study a variety of problems
from finance
• Focus on hypothesis testing, causal inference and forecasting

7
Types of data in finance

• Studies varying across entities (e.g., firms, individuals, etc.):
cross-sectional data
• Studies varying across time: time-series data
• Study variations both across firms and through time using
panel data

Can you visualize the data structure?


Cross-sectional vs. time series vs. panel data
Answer: See Stock and Watson Chapter 1.3

8
Brief overview of the first part of the course

Economic and financial theories suggest important relations, often
with policy implications, but virtually never suggest the quantitative
magnitudes of causal effects.
• What is the quantitative effect of independent board members
on firm performance?
• What is the quantitative effect of high frequency trading on
market quality?
• What is the quantitative effect on asset prices of a 1
percentage point increase in interest rates by the Fed?
• What is the magnitude for the price of risk across different
financial assets?
We need to carry out empirical analysis to answer these questions
with rigor!
9
How to carry out empirical analysis in finance?

Example: medical research

• Treatment group: given the new drug
• Control group: given a placebo (usually a vitamin)
• Treatment effect: the difference between the two groups

Example: financial research

• Firms with high R&D expenses vs. firms with low R&D expenses
• Do firms that invest in social responsibility perform better?

10
How to carry out empirical analysis in finance?

But almost always we only have observational (non-experimental)
data

• Independent board of directors and algorithmic traders
• Monetary policy

Part of the course deals with difficulties arising from using
observational data to estimate causal effects

• confounding effects (omitted factors)


• simultaneous causality
• “correlation does not imply causation”

11
An empirical example: class size and education output

• Question: What is the effect on test scores (or some other
outcome measure) of reducing class size by one student per
class? By 8 students per class?
• We must use data to find out (is there any way to answer this
without data?)

12
Using data for empirical analysis

The California Test Score Data Set - see SW chapter 1.3

• California school districts (n = 420) in 1999


• Variables:
• fifth-grade test scores (combined math and reading)
• district average student-teacher ratio (STR) = number of students
in the district divided by the number of full-time equivalent teachers

13
Initial look at the data

What is the relationship between test scores and the STR?

14
Initial look at the data

What does this figure show? Answer: Eyeball econometrics

15
Eyeball econometrics is not rigorous enough!

We need some numerical evidence on whether districts with low STRs have
higher test scores. But how?

• Compare average test scores in districts with low STRs to those with high
STRs ("estimation").
Point estimation uses sample data to calculate a single value (known as a
statistic), which serves as the "best estimate" of an unknown (fixed or
random) population parameter.
• Test the null hypothesis that the mean test scores in the two types of
districts are the same, against the "alternative" hypothesis that they
differ ("hypothesis testing").
Hypothesis testing evaluates a claim about the population on the basis of
observed data modeled via a set of random variables.
• Estimate an interval for the difference in the mean test scores, high vs.
low STR districts ("confidence interval"). A confidence interval is a
type of interval estimate of a population parameter.
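
As a rough illustration of all three procedures, here is a minimal Python
sketch on simulated data (the course itself uses Stata, and the numbers below
are made up rather than taken from the California data set):

import numpy as np
from scipy import stats

# Simulated test scores for low-STR and high-STR districts (illustrative numbers only)
rng = np.random.default_rng(0)
low_str = rng.normal(657, 19, size=200)    # districts with small classes
high_str = rng.normal(650, 18, size=220)   # districts with large classes

# Estimation: the difference in sample means is the point estimate
diff = low_str.mean() - high_str.mean()

# Hypothesis testing: the null hypothesis is that the two population means are equal
t_stat, p_value = stats.ttest_ind(low_str, high_str, equal_var=False)

# Confidence interval: an approximate 95% interval for the difference in means
se = np.sqrt(low_str.var(ddof=1) / len(low_str) + high_str.var(ddof=1) / len(high_str))
ci = (diff - 1.96 * se, diff + 1.96 * se)

print(diff, t_stat, p_value, ci)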

16
What is next?

• Or, you can use (linear) regression to measure the relation (i.e., the
slope) between student-to-teacher ratios and average test scores.
• Why linear? - it is easy to implement, easy to interpret, and it gives the
best linear approximation even when the relationship is not linear.
• Why regression? - the estimate comes from a rigorous mathematical procedure
(hard to beat with alternatives like eyeballing).
• Before turning to regression, however, we will review some of the
underlying theory of estimation, hypothesis testing, and confidence
intervals:
• Why do these procedures work, and why use these rather than
others?
• We will review the intellectual foundations of statistics and
econometrics

17
Review of Statistical Theory

• Probability framework for statistical inference


• Estimation
• Testing
• Confidence Intervals

18
Probability framework for statistical inference

• Population, random variable, and distribution


• Moments of a distribution (mean, variance, standard
deviation, covariance, correlation)
• Conditional distributions and conditional means
• Distribution of a sample of data drawn randomly from a
population: X_1, ..., X_n

19
Population, random variable, and distribution

Population
• The group or collection of ALL possible entities of interest
(school districts)
• We will think of populations as infinitely large
• e.g. the population average height of all human beings? (only
God knows)
Random variable
• Random variable is a variable whose value is subject to
variations due to chance
• As a result, it can take on a set of possible different values
each with an associated probability
• Numerical summary of a random outcome (district average
test score, district STR)
20
Discrete random variable

Example

X = 1 with probability 0.2
    2 with probability 0.3
    3 with probability 0.3
    4 with probability 0.2

21
Discrete random variable: pdf

• pdf (probability density function)

f (x) ≡ P(X = x)

- The probability for a random variable X to be equal to a given
constant, x

22
Discrete random variable: cdf

• cdf (cumulative distribution function)

F(x) ≡ P(X ≤ x)

- The probability for a random variable X to be less than or equal
to a given constant, x

F(1) = P(X ≤ 1) = 0.2
F(2) = P(X ≤ 2) = 0.2 + 0.3 = 0.5
F(3) = P(X ≤ 3) = 0.2 + 0.3 + 0.3 = 0.8
F(4) = P(X ≤ 4) = 0.2 + 0.3 + 0.3 + 0.2 = 1

23
Discrete random variable: cdf

24
Relation between pdf and cdf

• cdf is a sum of pdf

F (x) = f (x) + f (x − ∆) + · · ·
= f (x) + F (x − ∆)

• pdf is a difference of cdf

f (x) = F (x) − F (x − ∆)

25
Continuous random variable

For example,
• Normal distribution (bell-shaped curve)

X ∼ N(µ, σ²)
: X is drawn from a normal distribution with mean µ and variance σ²
• pdf of the normal distribution
26
Continuous random variable

• The pdf of a continuous distribution is somewhat different from the
pdf of a discrete distribution because P(X = x) = 0
• Let's begin with the cdf

F(x) = P(X ≤ x)

27
pdf and cdf of continuous distribution

cdf is an integration of pdf

F(x) = ∫_{−∞}^{x} f(t) dt

pdf is a differentiation of cdf

f(x) = d/dx F(x)

28
pdf and cdf

cdf

• Discrete: F(x) = Σ_{t ≤ x} f(t)
• Continuous: F(x) = ∫_{−∞}^{x} f(t) dt

pdf

• Discrete: f(x) = F(x) − F(x − ∆)
• Continuous: f(x) = d/dx F(x)
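
These relations can be checked on the discrete example from earlier; a minimal
Python sketch (the course software is Stata, Python is used here purely for
illustration):

import numpy as np

# Discrete example from earlier: values 1..4 with the stated probabilities
values = np.array([1, 2, 3, 4])
pdf = np.array([0.2, 0.3, 0.3, 0.2])

cdf = np.cumsum(pdf)                               # the cdf is the running sum of the pdf
print(cdf)                                         # [0.2 0.5 0.8 1. ]

pdf_back = np.diff(np.concatenate(([0.0], cdf)))   # the pdf is the difference of the cdf
print(pdf_back)                                    # [0.2 0.3 0.3 0.2]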

29
Probability density of stock returns

30
Moments: mean

• Mean (µ, average, expectation, expected value, 1st moment)

E[X] ≡ Σ_{i=1}^{n} x_i p_i           (discrete)
E[X] ≡ ∫_{−∞}^{∞} x f(x) dx          (continuous)

• Example

X = 1 with probability 0.2
    2 with probability 0.3
    3 with probability 0.3
    4 with probability 0.2

E[X] = 1 · 0.2 + 2 · 0.3 + 3 · 0.3 + 4 · 0.2
     = 2.5

31
Moments: variance

• Variance (σ², 2nd moment)

Var[X] ≡ E[(X − µ)²]
       = Σ_{i=1}^{n} (x_i − µ)² p_i          (discrete)
       = ∫_{−∞}^{∞} (x − µ)² f(x) dx         (continuous)

• Standard deviation

std(X) = √Var[X]

• Example

Var(X) = (1 − 2.5)² · 0.2 + (2 − 2.5)² · 0.3
       + (3 − 2.5)² · 0.3 + (4 − 2.5)² · 0.2
       = 1.05

32
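
The mean and variance computed in the example above can be verified
numerically; a minimal Python sketch (illustrative only):

import numpy as np

x = np.array([1, 2, 3, 4])
p = np.array([0.2, 0.3, 0.3, 0.2])

mean = np.sum(x * p)                 # E[X] = sum of x_i * p_i
var = np.sum((x - mean) ** 2 * p)    # Var[X] = sum of (x_i - mean)^2 * p_i
std = np.sqrt(var)

print(mean, var, std)                # 2.5, 1.05, about 1.025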
Useful formulas

• Expectation

E[a + bX] = a + bE[X]
E[X + Y] = E[X] + E[Y]
Combined,
E[aX + bY] = aE[X] + bE[Y]

• Variance

Var[a + bX] = b² Var[X]
Var[X + Y] = Var[X] + Var[Y] + 2Cov[X, Y]
Combined,
Var[aX + bY] = a² Var[X] + b² Var[Y] + 2ab Cov[X, Y]

33
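
The two "combined" formulas above can also be checked by simulation; a Python
sketch with assumed, purely illustrative parameters (Var[X] = 1, Var[Y] = 2,
Cov[X, Y] = 0.5):

import numpy as np

rng = np.random.default_rng(0)
X, Y = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 2.0]], size=1_000_000).T
a, b = 2.0, -3.0

lhs = np.var(a * X + b * Y)
rhs = a**2 * np.var(X) + b**2 * np.var(Y) + 2 * a * b * np.cov(X, Y)[0, 1]
print(lhs, rhs)   # the two values agree up to sampling error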
Moments: skewness, kurtosis

• Skewness (3rd moment) = E[(X − µ)³] / σ³
• measure of asymmetry of a distribution
• skewness = 0: distribution is symmetric
• skewness > (<) 0: distribution has a long right (left) tail

• Kurtosis (4th moment) = E[(X − µ)⁴] / σ⁴
• measure of mass in the tails
• measure of the probability of large values
• kurtosis = 3: normal distribution
• kurtosis > 3: heavy tails (leptokurtic)
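
As an illustration, the sample skewness and kurtosis of simulated normal data
can be computed with scipy (note that scipy's kurtosis function returns excess
kurtosis unless fisher=False is passed):

import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=1_000_000)

print(skew(x))                    # close to 0: the normal distribution is symmetric
print(kurtosis(x, fisher=False))  # close to 3: fisher=False gives raw (not excess) kurtosis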

34
Skewness and Kurtosis

35
Covariance

• Random variables X and Y have a joint distribution
• The covariance between X and Y is
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = σ_XY
• The covariance is a measure of the linear association between
X and Y
• Cov(X, Y) > 0 means a positive relation between X and Y
• If X and Y are independently distributed, then Cov(X, Y) = 0
(but not vice versa! - consider X ∼ N(0, 1) and Y = X²)
• The covariance of a random variable with itself is its variance

36
Correlation

• Correlation is defined as

Corr(X, Y) = Cov(X, Y) / √(Var(X)Var(Y)) = σ_XY / (σ_X σ_Y) = ρ_XY

• −1 ≤ Corr(X, Y) ≤ 1
• Cov(X, Y) > 0 means a positive relation between X and Y
• Corr(X, Y) = 1 means perfect positive linear association
• Corr(X, Y) = −1 means perfect negative linear association
• Corr(X, Y) = 0 means no linear association

37
Correlation examples

• Two Bernoulli random variables (think of flipping coins)

X, Y = 1 with probability p
       0 with probability 1 − p

• Mean

E[X] = p · 1 + (1 − p) · 0 = p

• Variance

Var[X] = E[(X − µ)²]
       = p · (1 − p)² + (1 − p) · (0 − p)²
       = p(1 − p)
38
Correlation examples

• Example 1: independent distributions

Joint probabilities (Y in rows, X in columns):
          X = 1         X = 0
Y = 1     p²            p(1 − p)
Y = 0     p(1 − p)      (1 − p)²

• Covariance

Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
          = p² · (1 − p)(1 − p) + p(1 − p) · (1 − p)(0 − p)
          + p(1 − p) · (0 − p)(1 − p) + (1 − p)² · (0 − p)(0 − p)
          = 0
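
A quick numerical check of this zero covariance, with an assumed illustrative
value p = 0.3 (any p works):

import numpy as np

p = 0.3
x = np.array([1, 1, 0, 0])      # X value in each cell of the joint table
y = np.array([1, 0, 1, 0])      # Y value in each cell
prob = np.array([p * p, p * (1 - p), (1 - p) * p, (1 - p) * (1 - p)])  # independent joint probabilities

mean_x = np.sum(x * prob)
mean_y = np.sum(y * prob)
cov = np.sum((x - mean_x) * (y - mean_y) * prob)
print(cov)                      # 0 (up to floating-point error)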

39
Correlation examples

• Correlation

Corr(X, Y) = Cov(X, Y) / √(Var(X)Var(Y)) = 0

• Thus, an independent distribution implies zero
covariance/correlation, but not vice versa.

40
Correlation examples

• Example 2: perfect correlation

Joint probabilities (Y in rows, X in columns):
          X = 1     X = 0
Y = 1     p         0
Y = 0     0         1 − p

• Covariance

Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
          = p · (1 − p)(1 − p) + 0 · (1 − p)(0 − p)
          + 0 · (0 − p)(1 − p) + (1 − p) · (0 − p)(0 − p)
          = p(1 − p)
          = Var(X)

Thus, perfect correlation implies that the covariance is equal to the
variable's own variance.

41
Correlation examples

• Correlation

Corr(X, Y) = Cov(X, Y) / √(Var(X)Var(Y)) = p(1 − p) / p(1 − p) = 1

• Thus, perfect correlation implies that the correlation is equal to
one

42
Correlation

43
Correlation

• Corr(X, Y) ∈ (0, 1)
: when X is high, Y is likely, but not certain, to be high
• Corr(X, Y) ∈ (−1, 0)
: when X is high, Y is likely, but not certain, to be low
• Note: correlation does not imply causality
• Example: more education is related to a higher salary.
• But does that mean getting more education will definitely
increase your salary? No! Perhaps a higher salary gives you the
money to get more education, or more education reflects a better
family background!

44
Bayes’ theorem

• The distribution of Y, given value(s) of some other random
variable, X

P(A|B) = P(A ∩ B) / P(B)
       = P(B|A)P(A) / P(B)

45
Conditional probability: example

• Example: what is the probability that the first child is a son
if at least one of the two children is known to be a son?

P(first is a son | at least one is a son)
= P(first is a son & at least one is a son) / P(at least one is a son)
= (1/2) / (3/4)
= 2/3
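
The same answer can be obtained by brute-force enumeration; a tiny Python
sketch:

from itertools import product

# The four equally likely (first child, second child) combinations
outcomes = list(product(["son", "daughter"], repeat=2))

at_least_one_son = [o for o in outcomes if "son" in o]
first_is_son = [o for o in at_least_one_son if o[0] == "son"]

print(len(first_is_son), "/", len(at_least_one_son))   # 2 / 3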

46
Textbook example: Table 2.2

• Joint distribution

47
Textbook example: Table 2.2

• Marginal distribution

P(Y = y) = Σ_{i=1}^{n} P(X = x_i, Y = y)

• From the example,

P(rain) = 0.15 + 0.15 = 0.30
P(no rain) = 0.07 + 0.63 = 0.70
P(long commute) = 0.15 + 0.07 = 0.22
P(short commute) = 0.15 + 0.63 = 0.78

48
Textbook example: Table 2.2

• Conditional distribution

P(Y = y | X = x) = P(X = x and Y = y) / P(X = x)

• From the example,

P(long commute | rain) = 0.15 / 0.30 = 0.50
P(rain | long commute) = 0.15 / 0.22 = 0.68
P(short commute | no rain) = 0.63 / 0.70 = 0.90
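
The marginal and conditional probabilities above can be reproduced directly
from the joint distribution in Table 2.2; a minimal Python sketch:

import numpy as np

# Joint distribution from SW Table 2.2 (rows: rain, no rain; columns: long, short commute)
joint = np.array([[0.15, 0.15],
                  [0.07, 0.63]])

p_rain = joint[0].sum()               # marginal P(rain) = 0.30
p_long = joint[:, 0].sum()            # marginal P(long commute) = 0.22

print(joint[0, 0] / p_rain)           # P(long commute | rain) = 0.50
print(joint[0, 0] / p_long)           # P(rain | long commute) ≈ 0.68
print(joint[1, 1] / joint[1].sum())   # P(short commute | no rain) = 0.90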

49
Conditional probability example: AIDS testing

• Question
The probability that a patient has HIV is 0.001 and the
diagnostic test for HIV can detect the virus with a probability
of 0.98. Given that the chance of a false positive is 6%, what
is the probability that a patient who has already tested
positive really has HIV?

50
Conditional probability example: AIDS testing

• The following information is given in the question,

P(HIV ) = 0.001
P(positive | HIV ) = 0.98
P(positive | not HIV ) = 0.06

• The question asks

P(HIV | positive) =???

51
Conditional probability example: AIDS testing

• Marginal distribution & Bayes' theorem

P(positive) = P(positive & HIV) + P(positive & not HIV)
            = P(positive | HIV) · P(HIV) + P(positive | not HIV) · P(not HIV)
            = 0.98 × 0.001 + 0.06 × 0.999
            = 0.06092

52
Conditional probability example: AIDS testing

• Bayes' theorem

P(HIV | positive) = P(positive | HIV) · P(HIV) / P(positive)
                  = 0.98 × 0.001 / 0.06092
                  ≈ 0.0161
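
The same calculation in a few lines of Python (illustrative only):

p_hiv = 0.001
p_pos_given_hiv = 0.98
p_pos_given_not_hiv = 0.06

# Marginal probability of testing positive (law of total probability)
p_pos = p_pos_given_hiv * p_hiv + p_pos_given_not_hiv * (1 - p_hiv)

# Bayes' theorem
p_hiv_given_pos = p_pos_given_hiv * p_hiv / p_pos

print(p_pos)             # 0.06092
print(p_hiv_given_pos)   # about 0.0161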

53
Conditional means

• Conditional expectations and conditional moments

E (Y |X = x)

• Example: E (test scores|STR < 20) = the mean of test scores


among districts with small class sizes
• Conditional variance: variance of conditional distribution

54
Conditional means

• Do you remember the classroom-size example?

∆ = E (test scores|STR < 20) − E (test scores|STR ≥ 20)

Other examples of conditional means:

• Wages of all female workers (Y = wages, X = gender)


• Mortality rate of those given an experimental treatment
(Y =live/die; X = treated/not treated)
• If E (X |Z ) = const, then corr(X,Z) = 0 (not necessarily vice
versa however)

The conditional mean is a (possibly new) term for the familiar idea of the
group mean
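
A sketch of computing such a group (conditional) mean difference on simulated
district data (illustrative only; the lecture's empirical example uses the
California data set, and the course software is Stata):

import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Simulated districts with an assumed negative relation between STR and scores
str_ratio = rng.uniform(14, 26, size=420)
score = 700 - 2.0 * str_ratio + rng.normal(0, 15, size=420)
df = pd.DataFrame({"str": str_ratio, "score": score})

small = df.loc[df["str"] < 20, "score"].mean()    # E(test scores | STR < 20)
large = df.loc[df["str"] >= 20, "score"].mean()   # E(test scores | STR >= 20)
print(small - large)                              # the estimated difference in group means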
55
Homework!

• Try to summarize an academic paper that you are interested in.
• The main point is to illustrate the research question, the empirical
challenges, and how the authors overcame those challenges.
• Choose an article from one of the following journals: Journal of Finance,
Journal of Financial Economics, Review of Financial Studies.
• Due on the day before the next lecture.
• Effort is more important than getting the correct answer!

56
