
Data set:

Y= industry performance score (1-10)

X= industry competitiveness score (0-100)

We assume a positive relationship between industry competitiveness and industry performance.

1º. Set Stata to handle panel data → xtset n t

This tells Stata that we have panel data: 3 cases (units, n), each observed in three
years (t), so there is one observation per case per year.

2º. Check the individual-specific heterogeneity (fixed-effects model) manually.

- We sort units and compute the unit-specific mean values for each variable
bysort n: egen avg_y = mean(y)
bysort n: egen avg_x = mean(x)
bysort n: egen avg_t13 = mean(t13)
bysort n: egen avg_t14 = mean(t14)
bysort n: egen avg_t15 = mean(t15)

- We compute the within differences for each variable.


gen d2_y = y - avg_y
gen d2_x = x - avg_x
gen d2_t13 = t13 - avg_t13
gen d2_t14 = t14 - avg_t14
gen d2_t15 = t15 - avg_t15
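
As a rough check, the demeaned variables can be fed to plain OLS to reproduce the fixed-effects slopes by hand. This is only a sketch under assumptions: it presumes a balanced panel, that xtset n t has already been run, and that t13 is the year dummy left out as the base category (that choice is not stated in these notes).

```stata
* Sketch only: within (fixed-effects) estimator by hand, using the
* demeaned variables created above.
* d2_t13 is left out because, in a balanced panel, the three demeaned
* year dummies sum to zero and would be collinear.
regress d2_y d2_x d2_t14 d2_t15, noconstant

* For comparison (requires xtset n t): the slopes should be the same,
* but regress does not count the estimated unit means in its degrees
* of freedom, so its standard errors come out too small.
xtreg y x t14 t15, fe
```

The coefficients from both commands should agree; only the standard errors (and therefore the significance levels) differ.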

Then we can plot the data to check the individual-specific heterogeneity:

- Across units: cross-individual effects → three cases with three observations each

twoway scatter y n, msymbol(circle_hollow) || connected avg_y n, msymbol(diamond) ||, xlabel(1 "1" 2 "2" 3 "3")

- Across time: cross-time effects → three individual values (one per case) in each
year

bysort t: egen avg_yt = mean(y)


twoway scatter y t, msymbol(circle_hollow) || connected avg_yt t, msymbol(diamond) ||, xlabel(2013(1)2015)

3º. Estimation of the parameters using fixed effects or random effects

Imagine that we estimate the model on these data with OLS.

The problem is that OLS overstates the significance of the coefficients, because it treats
all observations as different subjects. That is not true here: we do not have 9 independent
observations, we have 3 cases (groups), each measured over time (three years). OLS also
ignores that each case has unobserved characteristics, so the errors of the same case are
correlated across its observations. This is the reason why we should use panel-data
methods.

Fixed effects

xtreg y x t13 t14 t15, fe


Here Stata recognises that we have panel data with 9 observations and 3 groups. To
analyse these results we should pay attention to:
- R-sq within = 0.9977
- corr(u_i, Xb) = 0.3044 → with FE we split the error into an unobserved unit-specific
component (u_i) and an idiosyncratic component, in order to reduce the
overestimation problem of OLS and to control the endogeneity resulting from
omitted variables (FE allows the unobserved unit effects to be correlated with the
regressors)
- Coef.: as we can see, they are exactly equal to the OLS output, but now the
significance is not overestimated.
- Prob > F = 0.0002 → model fit
- rho = 0.7678 (Prob > F = 0.0793) → this informs us about the quality of our model.
The F test has H0: all individual effects u_i are zero. If it is not significant, the
fixed effects add little and our model does not fit well. rho tells us that 77% of the
variance is explained by cross-unit differences.
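
For reference, rho is the share of the total error variance that comes from the unit effects. A sketch of the definition, where sigma_u and sigma_e are the values Stata prints just above rho in the xtreg output:

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
```

With the reported rho = 0.7678, about 77% of the variance is attributed to u_i.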

Interpretation of the coefficients: how much Y changes, within a unit, when X increases by one unit.

Random effects

xtreg y x t13 t14, re

To analyse these results we should pay attention to:


- R-sq overall = 0.9898
- corr(u_i, X) = 0 (assumed) → with RE we assume that the differences across units
are uncorrelated with the regressors. That is, we do not need to allow the
unobserved variables to be correlated with our observed variables (regressors),
because we treat them as totally exogenous (we cannot control them).
- Coef.: as we can see, they are totally different from FE, because here we work with
only one regression that uses an overall average (variation both within and between
units).
- Prob > F = 0 → model fit
- rho = 0.7745 → as in FE, the share of the variance attributed to differences across
units (u_i)

Interpretation (take care): the average effect on Y of a one-unit change in X, when X
changes both between units and across time.

4º. Specification test: Hausman

H0: The random-effects model yields consistent coefficients (both models are suitable)


H1: The random-effects model is inconsistent, whereas the fixed-effects model remains
consistent (only fixed effects is suitable)

FE is always a safe choice (it is consistent under both hypotheses); we should check whether RE is valid too.

This test compares the coefficients of the FE and RE models (FE minus RE) to check whether
the differences are significant or not:
- Do NOT reject H0 → we can use both models because they are consistent and comparable
(they give us the same information even if the coefficient values are not the same)
- Reject H0 → we can only use fixed effects.
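
As a sketch of what is being tested, the Hausman statistic can be written as follows, where k (the number of coefficients compared) and the hat notation are introduced here for illustration and are not part of the original notes:

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{V}_{FE} - \widehat{V}_{RE} \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\sim\; \chi^2_k \quad \text{under } H_0
```

Stata reports this as chi2 together with Prob > chi2; a small p-value means the FE and RE coefficients differ systematically, so H0 is rejected.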

To run it we follow these steps:
Step 1: FE model → xtreg y x t13 t14, fe
Step 2: store FE results → est store fe
Step 3: RE model → xtreg y x t13 t14, re
Step 4: Hausman test → hausman fe (compares the stored FE results with the last estimates, i.e. RE)
