
Applied Financial Econometrics using Stata

3. Linear Factor Models

Stan Hurn

Queensland University of Technology

Hurn (QUT) Applied Financial Econometrics using Stata 1 / 40


Introduction to .do Files



The Problem

One of the most common problems in empirical asset pricing concerns the
estimation and evaluation of linear factor models. There is a large
literature on the econometric techniques used to estimate and evaluate these
models, which addresses the following questions:
- how to estimate the parameters
- how to calculate standard errors of the pricing errors
- how to test the model



The Data

The data are monthly percentage returns for the period July 1926 to
December 2013 (T = 1050) on 25 portfolios (r1 to r25) sorted by
size and book-to-market value, together with the risk-free rate (US Treasury
bill rate) and the return on the market (S&P500 index). The data are
freely available from Ken French's website:
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

The series in the file fama_french.dta are:

r1-r25 = monthly returns to the portfolios
rm_rf = excess market return
rf = risk-free rate of return

I will upload another file, famafrench.dta, which will contain some
additional series for you to play around with.



A First .do File

set more off
version 13
clear *

// set up a log file
capture log close capm
log using capm, name(capm) replace

// set current working directory
cd ~/Dropbox/Teaching/Singapore/do

// load the data and add a few labels
use fama_french.dta, clear
label variable rm "Market Return"
label variable rf "Risk Free Rate"

// format date variable and set data set as time series
format %td dateid01
tsset dateid01



When Run ...

. set more off


. version 13
. clear *
.
. // set up a log file
. capture log close capm
. log using capm, name(capm) replace

name: capm
log: /Users/stanhurn/Dropbox/TEACHING/SIngapore/do/capm.smcl
log type: smcl
opened on: 13 Mar 2014, 17:24:36
.
. // set current working directory
. cd ~/Dropbox/Teaching/Singapore/do
/Users/stanhurn/Dropbox/TEACHING/SIngapore/do
.
. // load the data and add a few labels
. use fama_french.dta, clear
. label variable rm "Market Return"
. label variable rf "Risk Free Rate"
.



Some plots

// format date variable and set data set as time series
format %td dateid01
tsset dateid01

// plot return on market and rf on same graph using same y-axes
twoway (tsline rm) (tsline rf), name(factors0, replace) ///
    tlabel(,angle(forty_five) format(%tdCCYY)) xtitle("")

graph export "../factors0.pdf", as(pdf) replace

// plot return on market and rf on same graph using different y-axes
twoway (tsline rm, yaxis(1)) (tsline rf, yaxis(2)), name(factors, replace) ///
    tlabel(,angle(forty_five) format(%tdCCYY)) xtitle("")

graph export "../factors.pdf", as(pdf) replace



Plot of the Market Return and Risk Free Rate
[Figure: time-series plot of the Market Return and the Risk Free Rate on a common y-axis (-40 to 40); x-axis labelled by year, 1930 to 2010.]



Plot of the Market Return and Risk Free Rate

[Figure: time-series plot of the Market Return (left y-axis, -40 to 40) and the Risk Free Rate (right y-axis, 0 to 1.5); x-axis labelled by year, 1930 to 2010.]



Estimating a Simple CAPM



One Factor Pricing Model

Define the excess returns $z_{it} = r_{it} - r_f$. If the pricing factor, $f_t$, is also an
excess return then the fundamental pricing model states that the excess
returns are linear in the betas

$$E(z_{it}) = \beta_i E(f_t).$$

This model is usually evaluated in the form of a time-series linear regression

$$z_{it} = \alpha_i + \beta_i f_t + u_{it}.$$

Comparing the model and the expectation of the time-series regression, it
follows that all the regression intercepts $\alpha_i$ should be zero. In other words,
the regression intercepts are equal to the pricing errors.
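
A minimal sketch of this time-series regression in plain Python rather than Stata, so that the arithmetic is visible. The function name ols_alpha_beta and the numbers are made up for illustration: z is built as exactly 0.5 + 1.2 f, so the fitted intercept (the pricing error) is 0.5 and the fitted beta is 1.2.

```python
from statistics import fmean


def ols_alpha_beta(z, f):
    """Return (alpha, beta) from regressing z_t on a constant and f_t."""
    z_bar, f_bar = fmean(z), fmean(f)
    # Slope: cross-product of deviations divided by the sum of squared factor deviations.
    s_fz = sum((ft - f_bar) * (zt - z_bar) for ft, zt in zip(f, z))
    s_ff = sum((ft - f_bar) ** 2 for ft in f)
    beta = s_fz / s_ff
    # Intercept: the part of the mean excess return not explained by beta times the mean factor.
    alpha = z_bar - beta * f_bar
    return alpha, beta


# Illustrative (invented) factor and excess-return series.
f = [1.0, -2.0, 3.0, 0.5]
z = [0.5 + 1.2 * ft for ft in f]
alpha, beta = ols_alpha_beta(z, f)
print(alpha, beta)
```

Under the pricing model the intercept should be zero, so a non-zero alpha such as the one built into this toy series is read as a pricing error.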



When Run ...
. // estimate CAPM for first portfolio and test alpha = 0 and beta = 1
. reg z1 rm_rf
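      Source |       SS       df       MS              Number of obs =    1021
-------------+------------------------------           F(  1,  1019) = 1103.83
       Model |  79600.6792     1  79600.6792           Prob > F      =  0.0000
    Residual |  73483.5518  1019  72.1133973           R-squared     =  0.5200
-------------+------------------------------           Adj R-squared =  0.5195
       Total |  153084.231  1020  150.082579           Root MSE      =   8.492

------------------------------------------------------------------------------
          z1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       rm_rf |    1.63492   .0492092    33.22   0.000     1.538357    1.731483
       _cons |  -.5916513   .2676044    -2.21   0.027     -1.11677   -.0665325
------------------------------------------------------------------------------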

.
. // test the model
. test _cons
( 1) _cons = 0
F( 1, 1019) = 4.89
Prob > F = 0.0273
. test rm_rf=1
( 1) rm_rf = 1
F( 1, 1019) = 166.47
Prob > F = 0.0000
. test (_cons=0) (rm_rf=1)
( 1) _cons = 0
( 2) rm_rf = 1
F( 2, 1019) = 83.49
Prob > F = 0.0000



Estimation of the CAPM

There are at least four ways to estimate the simple CAPM for all 25
portfolios in Stata:
1 Equation-by-equation OLS. Loop over the excess returns and estimate
each equation.
2 Use the mvreg command, which performs equation-by-equation OLS
automatically.
3 Use the sureg command, which performs seemingly unrelated
regressions.
4 Reshape the data to long format and use the statsby prefix.



The Commands

// generate excess returns
local N = 25
forvalues i = 1/`N' {
    qui gen z`i' = r`i' - rf
}

drop r1-r9 // note use of hyphen: r10 comes right after r1 in the data set

// at least four ways to do this estimation

// 1. equation-by-equation OLS
forvalues i = 1/`N' {
    qui regress z`i' rm_rf
}

// 2. multivariate regression
qui mvreg z* = rm_rf

// 3. seemingly unrelated regressions
qui sureg (z* = rm_rf)

// 4. reshape to long format and use the statsby prefix
qui reshape long z, i(dateid01) j(portfolio)
qui statsby _b _se, by(portfolio) saving(simplecapm, replace): reg z rm_rf



The Reshape Command

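. reshape long z, i(dateid01) j(portfolio)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1021   ->   25525
Number of variables                  29   ->       6
j variable (25 values)                    ->   portfolio
xij variables:
                          z1 z2 ... z25   ->   z
-----------------------------------------------------------------------------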
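
To see what the reshape does to the data, here is a small Python sketch (not Stata), assuming the wide data are held as a list of dictionaries with one dictionary per date. The helpers reshape_long and is_stub_column are hypothetical and the values are invented; only the variable names dateid01, rm_rf, z1, z2 and portfolio follow the slides.

```python
def is_stub_column(name, stub):
    """True for columns named <stub><integer>, e.g. z1, z2, ..., z25."""
    suffix = name[len(stub):]
    return name.startswith(stub) and suffix.isdigit()


def reshape_long(wide_rows, stub, j_name):
    """Turn columns stub1, stub2, ... into one row per (original row, j) pair."""
    long_rows = []
    for row in wide_rows:
        # Variables that are not stub columns are repeated on every long row.
        fixed = {k: v for k, v in row.items() if not is_stub_column(k, stub)}
        for k, v in row.items():
            if is_stub_column(k, stub):
                long_rows.append({**fixed, j_name: int(k[len(stub):]), stub: v})
    return long_rows


# Two dates and two portfolios of invented data.
wide = [
    {"dateid01": 1, "rm_rf": 0.8, "z1": 1.5, "z2": -0.3},
    {"dateid01": 2, "rm_rf": -1.1, "z1": -2.0, "z2": 0.4},
]
long_rows = reshape_long(wide, stub="z", j_name="portfolio")
for r in long_rows:
    print(r)
```

Each wide row becomes as many long rows as there are portfolios, which is why 1021 observations become 25525 in the Stata output.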



The Data in Long Format



Testing Pricing Errors



Basic Results

Recall from the results of the classical two-variable regression model

$$\hat{\alpha} \sim N\left(0,\ \frac{s^2}{T}\left[1 + \frac{\bar{f}^2}{\mathrm{var}(f)}\right]\right)$$

where $s^2$ is the variance of the residuals of the regression.

The Wald test of the restriction that $\alpha_i$ is zero (no pricing error in the $i$th
equation) is then given by dividing the coefficient estimate squared by its
variance

$$J = T\left[1 + \frac{\bar{f}^2}{\mathrm{var}(f)}\right]^{-1} \frac{\hat{\alpha}_i^2}{s^2} \sim \chi^2_1$$
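
A hedged Python sketch of this single-equation Wald statistic, written out by hand instead of using Stata's test command. The function name wald_alpha and the data are invented. Two choices are assumptions of this sketch, not statements from the slides: var(f) is computed with divisor T, and the residual variance uses the usual T - 2 degrees-of-freedom correction, in which case J coincides with the squared t-statistic on the constant.

```python
from statistics import fmean


def wald_alpha(z, f):
    """Wald statistic for alpha = 0 in the regression z_t = alpha + beta f_t + u_t."""
    T = len(z)
    z_bar, f_bar = fmean(z), fmean(f)
    var_f = sum((x - f_bar) ** 2 for x in f) / T          # assumption: divisor T
    beta = sum((x - f_bar) * (y - z_bar) for x, y in zip(f, z)) / (T * var_f)
    alpha = z_bar - beta * f_bar
    resid = [y - alpha - beta * x for x, y in zip(f, z)]
    s2 = sum(u * u for u in resid) / (T - 2)              # assumption: divisor T - 2
    # J = T [1 + f_bar^2 / var(f)]^(-1) alpha^2 / s2
    return T * alpha ** 2 / (s2 * (1 + f_bar ** 2 / var_f))


# Invented data: alpha = 0.5, beta = 1, residuals (1, 1, -1, -1).
f = [0.0, 2.0, 0.0, 2.0]
z = [1.5, 3.5, -0.5, 1.5]
print(wald_alpha(z, f))
```

In this toy example the factor mean and variance are both 1 and the residual variance is 2, so the statistic is 4 x 0.25 / (2 x 2) = 0.25.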



Joint Wald Test

We also want to know if all the pricing errors are jointly equal to zero. We
now have to think of the time-series regressions as a panel regression with
correlated errors, $E(u_{it} u_{jt}) \neq 0$. The classic form of the test assumes no
autocorrelation or heteroskedasticity, so the Wald test of the joint
restrictions is given by

$$J = T\left[1 + \frac{\bar{f}^2}{\mathrm{var}(f)}\right]^{-1} \hat{\alpha}' \hat{\Sigma}^{-1} \hat{\alpha} = a_T\, \hat{\alpha}' \hat{\Sigma}^{-1} \hat{\alpha} \overset{a}{\sim} \chi^2_N$$

with $\hat{\alpha} = [\hat{\alpha}_1\ \hat{\alpha}_2\ \cdots\ \hat{\alpha}_N]'$ and $\hat{\Sigma}$ the residual covariance matrix. For
convenience, the test is often written just with a positive scaling constant
$a_T$ which depends on the sample size and the factor.
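
The quadratic form can be sketched in plain Python as follows. This is an illustration only: the helper names scaling_constant, solve and joint_wald are hypothetical, the inputs are invented, and the linear system is solved by a small Gaussian elimination routine instead of a library call so that the block has no dependencies.

```python
def scaling_constant(T, f_bar, var_f):
    """a_T = T [1 + f_bar^2 / var(f)]^(-1)."""
    return T / (1 + f_bar ** 2 / var_f)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def joint_wald(alpha, sigma, a_T):
    """J = a_T * alpha' Sigma^(-1) alpha, computed via Sigma x = alpha."""
    x = solve(sigma, alpha)
    return a_T * sum(a * xi for a, xi in zip(alpha, x))


# Invented two-asset example with correlated residuals.
alpha = [1.0, 1.0]
sigma = [[2.0, 1.0], [1.0, 2.0]]
print(joint_wald(alpha, sigma, a_T=3.0))
```

Solving the system Sigma x = alpha avoids forming the inverse explicitly; for the 25 portfolios in the slides the same idea applies with a 25 x 25 matrix.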



Gibbons, Ross, Shanken Test

The Wald test is asymptotically valid. A finite-sample F test is also
available, known as the Gibbons, Ross, Shanken or GRS test, given by

$$GRS = \frac{T-N-1}{N}\left[1 + \frac{\bar{f}^2}{\mathrm{var}(f)}\right]^{-1} \hat{\alpha}' \hat{\Sigma}^{-1} \hat{\alpha} \sim F_{N,\,T-N-1}$$

The F distribution recognises the sample variation in the estimation of $\hat{\Sigma}$,
which is not accounted for in the asymptotic Wald version. This distribution
requires that the errors are normally distributed as well as uncorrelated and
homoskedastic.
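
Since the Wald and F versions share the same quadratic form, the GRS statistic is just a rescaling of the Wald statistic: GRS = (T - N - 1)/N x J/T. A one-line Python sketch of that conversion follows; the function name grs_from_wald is made up, and the inputs are the Wald value, number of observations and number of portfolios reported in the Stata output in these slides.

```python
def grs_from_wald(J, T, N):
    """Convert the Wald statistic J into the finite-sample GRS F statistic."""
    return (T - N - 1) / N * (J / T)


# Wald statistic 96.631577 with T = 1021 observations and N = 25 portfolios.
print(grs_from_wald(96.631577, T=1021, N=25))
```

The p-value additionally needs the upper tail of the F(N, T - N - 1) distribution, which this sketch leaves to Stata's Ftail() function.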



Multiple Factors

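The test does generalise to the case of multiple factors. Assuming normal
iid errors, the test statistic is

$$GRS = \frac{T-N-K}{N}\left[1 + \bar{f}' \hat{\Omega}^{-1} \bar{f}\right]^{-1} \hat{\alpha}' \hat{\Sigma}^{-1} \hat{\alpha} \sim F_{N,\,T-N-K}$$

in which

N = number of assets
K = number of factors
$\bar{f} = E_T(f_t)$
$\hat{\Omega} = \frac{1}{T}\sum_{t=1}^{T} (f_t - \bar{f})(f_t - \bar{f})'$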
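
As a consistency check, a short worked step showing that the multi-factor statistic collapses to the single-factor GRS test when K = 1. It writes $\bar{f}$ for the factor mean and $\hat{\Omega}$ for the factor covariance matrix, and assumes var(f) is computed with divisor T.

$$K = 1:\qquad \hat{\Omega} = \frac{1}{T}\sum_{t=1}^{T}(f_t - \bar{f})^2 = \mathrm{var}(f), \qquad \bar{f}'\hat{\Omega}^{-1}\bar{f} = \frac{\bar{f}^2}{\mathrm{var}(f)}$$

$$GRS = \frac{T-N-1}{N}\left[1 + \frac{\bar{f}^2}{\mathrm{var}(f)}\right]^{-1}\hat{\alpha}'\hat{\Sigma}^{-1}\hat{\alpha} \sim F_{N,\,T-N-1}$$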



GRS in Stata (Wald Version)

. // Gibbons Ross Shanken test (using seemingly unrelated regression estimator)


. qui sureg (z* = rm_rf)
.
. // Wald version
. qui test _cons
. qui sca grsW = r(chi2)
. qui sca pval = r(p)
.
. di as text "Degrees of freedom = " as res r(df)
Degrees of freedom = 25
. di as text "Gibbons Ross Shanken test (Wald Version) = " as res grsW
Gibbons Ross Shanken test (Wald Version) = 96.631577
. di as text "p-value = " as res pval
p-value = 2.302e-10



GRS in Stata (F Version)

. // F version
. sca tmp0 = (`T'-`N'-1)/`N'
. sca tmp1 = grsW/`T'
. sca grsF = tmp0 * tmp1
. sca pvF = Ftail(`N',`T'-`N'-1,grsF)
.
. di as text "Gibbons Ross Shanken test (F Version) = " as res grsF
Gibbons Ross Shanken test (F Version) = 3.7668333
. di as text "p-value = " as res pvF
p-value = 1.958e-09



GRS in Mata (Wald Version)

. // Estimate seemingly unrelated regression model (cheating!)


. qui sureg (z* = rm_rf)
.
. // now call mata: e(b) is 1x50, so it must be reshaped
. // to have 25 rows using rowshape()
. mata:
mata (type end to exit)
: aT = st_numscalar("aT")
: sigma = st_matrix("e(Sigma)")
: nf = strtoreal(st_local("N"))
: b = st_matrix("e(b)")
: bmat = (1::nf),rowshape(b,nf)
: st_matrix("suregb", bmat)
: end



GRS in Mata (Wald Version)

. // need to drop variables so matrix does not take dimensions of data set in memory
. drop *
.
. // stata view of the matrix has 3 columns (company #, slope and constant)
. // name them and use the names to break the matrix into variables
. mat colnames suregb = company beta alpha
. qui svmat suregb, names(col)
. mata
mata (type end to exit)
: st_view(alpha=.,.,"alpha")
: J = aT * alpha' * invsym(sigma) * alpha
: J
96.6328742
: end

The value returned for the GRS test is 96.6328742, which is (almost) identical
to the 96.631577 obtained previously.



Cross Section Regressions



Price of Risk

The central question of interest is why average returns vary across assets.
The answer is that expected returns should be high if the asset has a
high exposure to the factors that carry large risk premia. Recall the
fundamental pricing model with a single factor in which the excess returns
are linear in the betas

$$E(z_{it}) = \beta_i E(f_t).$$

Since the factor, $f_t$, is also an excess return, the model applies to the
factor as well

$$E(f_t) = 1 \times \lambda$$

where $\lambda$ is the price of risk (risk premium) associated with the factor, so
that

$$E(z_{it}) = \beta_i \lambda.$$



Two-pass Regression

A natural idea is then to store estimates of $\beta_i$ from the time-series
regressions and then estimate the factor risk premium $\lambda$ from a
cross-sectional regression of average returns on the $\beta_i$

$$E_T(z_{it}) = \beta_i \lambda + \alpha_i.$$

The cross-sectional regression residuals $\alpha_i$ are the pricing errors.
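
A minimal Python sketch of the second pass, assuming the first-pass betas and the average excess returns are already available as lists. The function name second_pass and the numbers are invented. The regression has no constant, so the slope is the sum of beta times average return divided by the sum of squared betas.

```python
def second_pass(z_bar, betas):
    """No-constant OLS of average excess returns on first-pass betas.

    Returns the estimated price of risk and the list of pricing errors.
    """
    lam = sum(b * z for b, z in zip(betas, z_bar)) / sum(b * b for b in betas)
    pricing_errors = [z - lam * b for b, z in zip(betas, z_bar)]
    return lam, pricing_errors


# Invented cross-section of three portfolios.
betas = [0.8, 1.0, 1.5]
z_bar = [0.5, 0.9, 1.0]
lam, errors = second_pass(z_bar, betas)
print(lam, errors)
```

The residuals returned alongside the slope are the cross-sectional pricing errors, one per portfolio.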



Using collapse

A powerful Stata command which can be used to implement the two-pass


estimator is

collapse clist [if] [in] [weight] [, options]

collapse converts the dataset in memory into a dataset of means, sums,


medians, etc. or any summary statistic contained in clist which must refer
to numeric variables exclusively.
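
What collapse (mean) z, by(portfolio) does to long-format data can be sketched in Python as a group-by mean. This is an illustration under the assumption that the data are a list of dictionaries; collapse_mean is a hypothetical helper and the values are invented.

```python
from collections import defaultdict
from statistics import fmean


def collapse_mean(rows, value, by):
    """Replace the data with one row per group holding the group mean of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[value])
    return [{by: key, value: fmean(vals)} for key, vals in sorted(groups.items())]


# Long-format data: two dates for portfolio 1, one date for portfolio 2.
rows = [
    {"dateid01": 1, "portfolio": 1, "z": 1.0},
    {"dateid01": 2, "portfolio": 1, "z": 3.0},
    {"dateid01": 1, "portfolio": 2, "z": 2.0},
]
print(collapse_mean(rows, value="z", by="portfolio"))
```

As in Stata, the result keeps only the grouping variable and the summary statistic, so all other variables (here dateid01) disappear from the collapsed data.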



First Pass
. // reshape the data to long form
. reshape long z, i(dateid01) j(portfolio)
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25)
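Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1021   ->   25525
Number of variables                  30   ->       7
j variable (25 values)                    ->   portfolio
xij variables:
                          z1 z2 ... z25   ->   z
-----------------------------------------------------------------------------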

. save "./working/famafrenchlong.dta", replace


file ./working/famafrenchlong.dta saved
.
. // first pass regression
. statsby _b, by(portfolio) saving("./working/firstpass",replace) nodots: reg z rm_rf
command: regress z rm_rf
by: portfolio
.
. // keep estimated betas
. use "./working/firstpass.dta", clear
(statsby: regress)
. ren _b_rm_rf betas
. drop _b_cons
. save "./working/coefs.dta", replace
file ./working/coefs.dta saved



Using Collapse
. // now collapse the data to a pure cross section
. use "./working/famafrenchlong.dta", clear
. collapse (mean) z, by(portfolio)
.
. merge 1:1 portfolio using "./working/coefs"
    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                                25  (_merge==3)
    -----------------------------------------

. drop _merge
.
. // second pass regression
. reg z betas, noconstant
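      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  1,    24) =   31.28
       Model |  4.82079646     1  4.82079646           Prob > F      =  0.0000
    Residual |  3.69828055    24  .154095023           R-squared     =  0.5659
-------------+------------------------------           Adj R-squared =  0.5478
       Total |  8.51907702    25  .340763081           Root MSE      =  .39255

------------------------------------------------------------------------------
           z |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       betas |   1.555446   .2780928     5.59   0.000     .9814903    2.129401
------------------------------------------------------------------------------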



Fama-MacBeth Regressions

The Fama-MacBeth (1973) approach estimates cross-section
regressions for each time period

$$z_{it} = \lambda_t \beta_i + u_{it}.$$

Having obtained these estimates, the Fama-MacBeth procedure then
computes

$$\hat{\lambda} = \frac{1}{T}\sum_{t=1}^{T} \hat{\lambda}_t$$

as the estimated price of risk.
The standard error of this estimate is based on the sample standard
deviation of the cross-sectional estimates, defined as

$$\hat{\sigma}^2(\hat{\lambda}) = \frac{1}{T}\left[\frac{1}{T}\sum_{t=1}^{T} (\hat{\lambda}_t - \hat{\lambda})^2\right] = \frac{1}{T^2}\sum_{t=1}^{T} (\hat{\lambda}_t - \hat{\lambda})^2.$$
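
The procedure can be sketched in a few lines of Python, assuming the betas come from a first-pass regression and that each period's cross-sectional regression has no constant. The function name fama_macbeth and the data are invented for illustration.

```python
from math import sqrt
from statistics import fmean


def fama_macbeth(z_by_period, betas):
    """Return (lambda_hat, standard error) from period-by-period cross-section regressions.

    z_by_period[t][i] is the excess return of asset i in period t.
    """
    sum_b2 = sum(b * b for b in betas)
    # One no-constant cross-sectional slope per time period.
    lambdas = [sum(b * z for b, z in zip(betas, z_t)) / sum_b2 for z_t in z_by_period]
    T = len(lambdas)
    lam = fmean(lambdas)
    # Variance of the mean: (1 / T^2) * sum of squared deviations.
    var_lam = sum((l - lam) ** 2 for l in lambdas) / T ** 2
    return lam, sqrt(var_lam)


# Two assets, two periods: the per-period slopes are 1 and 3.
betas = [1.0, 2.0]
z_by_period = [[1.0, 2.0], [3.0, 6.0]]
print(fama_macbeth(z_by_period, betas))
```

With per-period slopes of 1 and 3 the estimated price of risk is 2 and its variance is (1 + 1)/4 = 0.5.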



Second Pass Regression

. // this time merge the estimated betas into the long data set
. merge m:1 portfolio using "./working/coefs.dta"
    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                            25,525  (_merge==3)
    -----------------------------------------

. drop _merge
. save "./working/famafrenchlong.dta", replace
file ./working/famafrenchlong.dta saved
.
. // run the regressions for each time period
. statsby _b, by(dateid01) saving("./working/famamacbeth",replace) nodots: reg z betas, nocon
command: regress z betas, noconstant
by: dateid01
.
. use "./working/famamacbeth.dta", replace
(statsby: regress)
. sum _b_betas
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    _b_betas |       1021    1.555446    11.53178  -36.97755   143.2732



Interesting New Developments



Large Numbers of Assets

The general Wald form of the Gibbons, Ross, Shanken test for zero pricing
errors ($\alpha = 0$) in a linear factor model is

$$J = a_T\, \hat{\alpha}' \hat{\Sigma}^{-1} \hat{\alpha} \sim \chi^2_N$$

in which $\hat{\Sigma}$ is the estimated covariance matrix of the errors, $a_T$ is a
positive scaling constant and $N$ is the number of assets being tested.

This test is applicable, however, only when the number of assets $N$ is
much smaller than the length of the time series $T$. When $N > T$ the
sample covariance matrix $\hat{\Sigma}$ becomes degenerate. In practice, one typically picks a
testing period of $T = 60$ monthly observations and does not increase the testing
period any further, because the factor pricing model is technically a
one-period model whose factor loadings can be time-varying. If you are
looking at lots of assets this constitutes a problem.
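
The degeneracy is easy to see in a toy example. The Python sketch below (helper names sample_cov and det are hypothetical, residuals are invented) builds the sample covariance matrix of N = 3 residual series from only T = 2 observations and shows that its determinant is exactly zero, so it cannot be inverted. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def sample_cov(U):
    """U[t][i] is the residual of asset i in period t; returns (1/T) * sum_t u_t u_t'."""
    T, N = len(U), len(U[0])
    return [[sum(Fraction(U[t][i]) * Fraction(U[t][j]) for t in range(T)) / T
             for j in range(N)] for i in range(N)]


def det(A):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j, a in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * a * det(minor)
    return total


U_short = [[1, 2, 0], [0, 1, 3]]               # T = 2 observations, N = 3 assets
U_long = U_short + [[2, 0, 1], [1, 1, 1]]      # T = 4 observations, N = 3 assets

print(det(sample_cov(U_short)))   # singular: the matrix has rank at most T
print(det(sample_cov(U_long)))    # non-singular once T exceeds N
```

The rank of the sample covariance matrix cannot exceed the number of observations, which is why the quadratic form in J breaks down when N > T.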



Pesaran and Yamagata (2012)
To overcome the difficulty, Pesaran and Yamagata (2012, PY test) suggest
ignoring the correlations among assets and constructing a test statistic
under working independence by setting $V = \mathrm{diag}(\hat{\Sigma})^{-1}$. They derive the
following result for the distribution of the standardised quadratic form

$$J_s = \frac{a_T\, \hat{\alpha}' V \hat{\alpha} - N}{\sqrt{2N(1 + e_T)}} \overset{d}{\to} N(0, 1)$$

with

$$e_T = \frac{1}{N}\sum_{i \neq j} \hat{\rho}_{ij}^2\, I(\hat{\rho}_{ij}^2 > c_T)$$

$$c_T = \frac{1}{T}\,\Phi^{-1}(1 - c/N)$$

$$c \in (0, 0.5).$$
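
A hedged Python sketch of the standardised statistic, taking the pricing errors, the residual variances, the scaling constant and the correlation adjustment as given inputs. The function name py_statistic and the numbers are invented; setting e_T = 0 gives the basic version with sqrt(2N) in the denominator, and the p-value is the one-sided normal tail of |J|.

```python
from math import sqrt
from statistics import NormalDist


def py_statistic(alpha, s2, a_T, e_T=0.0):
    """Standardised quadratic form (a_T alpha' V alpha - N) / sqrt(2N(1 + e_T)).

    V is diagonal with entries 1 / s2[i], i.e. correlations across assets are ignored.
    """
    N = len(alpha)
    quad = sum(a * a / v for a, v in zip(alpha, s2))
    J = (a_T * quad - N) / sqrt(2 * N * (1 + e_T))
    p_value = 1 - NormalDist().cdf(abs(J))   # upper tail of the standard normal
    return J, p_value


# Invented two-asset example: alpha' V alpha = 1 + 1 = 2.
alpha = [1.0, 2.0]
s2 = [1.0, 4.0]
print(py_statistic(alpha, s2, a_T=5.0))            # basic version, e_T = 0
print(py_statistic(alpha, s2, a_T=5.0, e_T=1.0))   # with a correlation adjustment
```

Because only the diagonal of the covariance matrix is used, nothing has to be inverted beyond N scalar variances, which is what makes the statistic usable when N is large relative to T.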



PY (Basic Version)

. // basic version of Pesaran Yamagata test with sqrt(2*N) as the scaling factor
. mata:
mata (type end to exit)
: st_view(s2=.,.,"s2")
: st_view(alpha=.,.,"alpha")
: N = strtoreal(st_local("N"))
: aT = st_numscalar("aT")
: Jnum = aT :* alpha' * invsym(diag(s2)) * alpha - N
: Jden = sqrt(2 * N)
: J = Jnum :/ Jden
: pval = 1 - normal(abs(J))
: Jnum, Jden, J, pval
1 2 3 4

1 -6.500208749 7.071067812 -.9192683372 .1789776174

: end

. // Notes:
. // 1. the test strongly rejects for whole sample
. // 2. this value is calculated using T = N so it works when GRS would fail
. // 3. now we have a problem with low power



PY (Basic Version)

. // more advanced implementation of the test


. // uses sqrt(2*N*(1+eT)) as the scaling factor in the denominator
. mata:
mata (type end to exit)
: //Janum = aT :* alpha' * invsym(diag(s2)) * alpha - N
: Jaden = sqrt(2 * N * (1 + eT))
: Ja = Jnum :/ Jaden
: pval1 = 1 - normal(abs(Ja))
:
: Jnum, Jaden, Ja, pval1
1 2 3 4

1 -6.500208749 8.712846629 -.7460487974 .227818969

: end

. // Notes:
. // 1. the numerator of the test is identical to the previous one
. // 2. eT=0.5 so denominator is only slightly affected
. // 3. value of this adjustment is questionable (need some MC evidence)



Power Enhancements

The PY test, or any other genuine quadratic statistic, is powerful only
when a non-negligible fraction of assets are mispriced. Indeed, the factor
$N$ above reflects the noise accumulation in estimating the $N$ parameters in the
vector $\alpha$.
A new working paper by Fan, Liao and Yao (2013) proposes an interesting
method which uses power enhancements (PEM) to improve the power
of the asset pricing test statistic as follows:
1 Compute $J_1$, a test statistic that has the correct asymptotic size (e.g.,
GRS, PY) but which may suffer from low power.
2 Compute a PEM component test $J_0$ that has two properties:
   1 $J_0 \overset{p}{\to} 0$ under $H_0$.
   2 $J_0$ does not converge to 0 but even diverges when the true parameters
   fall into a subset of the alternative hypothesis.
3 Compute the PEM test $J = J_0 + J_1$.

Suggestion: Screened Wald Test
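Of course the trick here is to find a statistic $J_0$ which has these nice
properties. Fan, Liao and Yao (2013) propose a screened Wald test

$$J_0 = N a_T\, \hat{\alpha}_s' \hat{\Sigma}_s^{-1} \hat{\alpha}_s$$

in which $\hat{\alpha}_s$ is the subvector of the original vector of estimated $\hat{\alpha}$ whose elements
exceed some threshold value $\delta_T$, and $\hat{\Sigma}_s$ is the corresponding submatrix of
the matrix $\hat{\Sigma}$ used in the original quadratic form.

Screening procedure

$$\delta_T = \log(\log T)\sqrt{\frac{\log N}{T}}$$

Choose $\hat{\alpha}_i$ if

$$\frac{|\hat{\alpha}_i|}{\hat{\sigma}_i} > \delta_T$$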
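
The screening step can be sketched in Python as below. The function names screening_threshold and screened_indices are hypothetical, the inputs are invented, and the sketch assumes each pricing error is standardised by its own standard error before being compared with the threshold log(log T) x sqrt(log N / T).

```python
from math import log, sqrt


def screening_threshold(N, T):
    """Threshold log(log T) * sqrt(log N / T)."""
    return log(log(T)) * sqrt(log(N) / T)


def screened_indices(alpha, sigma, T):
    """Indices of assets whose standardised pricing error exceeds the threshold."""
    delta = screening_threshold(len(alpha), T)
    return [i for i, (a, s) in enumerate(zip(alpha, sigma)) if abs(a) / s > delta]


# Invented example: only the second asset has a clearly non-zero pricing error.
alpha = [0.0, 5.0, -0.01]
sigma = [1.0, 1.0, 1.0]
print(screening_threshold(N=3, T=100))
print(screened_indices(alpha, sigma, T=100))
```

Only the screened assets enter the power enhancement component, so under the null hypothesis the selected set is typically empty and the component contributes nothing to the test.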
