
Addis Ababa University

College of Business and Economics


Department of Economics

Econometrics II, Econ 654


Assignment on Panel Data Econometrics

Prepared By:
Zemichael Seltan --------- ID: GSE/3108/15

Submitted to: Fantu Guta (PhD)

Jan 2024

Q#1
The dataset PDASS.dta was imported into Stata and declared as panel data:

xtset ident year

a) Estimation of the Fixed Effects Model


First, the variable “weeks” was transformed into logs, and the fixed effects regression was
run in Stata:
xtreg logwage logweeks industry north city married female union educ black, fe
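The log transformation mentioned above could be done along the following lines. This is only a sketch: it assumes that the raw variable is named weeks, as in the text, and that PDASS.dta is already loaded. The check on missing values is an added suggestion, not something reported in this assignment.

```stata
* Sketch only: assumes PDASS.dta is loaded and the raw variable is called weeks
generate logweeks = ln(weeks)

* ln() returns missing for zero or negative values, so count any lost observations
count if missing(logweeks)
```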
The regression results are shown in the table below. The main findings are:
• The R2 is very low (0.0083).
• As expected, ui and xb are correlated (corr = -0.4890).
• The model as a whole is significant (Prob > F = 0.0000).
• The time-invariant variables “female”, “educ” and “black” are not estimated: they do not vary within an individual, so the fixed effects (within) transformation removes them.
• The coefficients of “logweeks”, “industry” and “north” are insignificant at the 5% significance level.
• The coefficients of “city” and “married” have unexpected signs: living in a city and being married would not be expected to lower the wage relative to those who do not live in a city or are not married, respectively.
• The estimated model is:

Logwage = 6.589 + 0.052logweeks + 0.031industry + 0.002north – 0.125city –
0.077married + 0.054union

Estimation of the Random Effects Model


The random effects regression was run in Stata:
xtreg logwage logweeks industry north city married female union educ black, re
The regression results are shown in the table below:
• The model is significant (Prob > chi2 = 0.0000).
• The correlation between ui and X is zero by assumption: the random effects estimator imposes this rather than estimating it.
• All the given variables, including the time-invariant ones, are estimated, and all are significant except “married”.
• The estimated coefficients have plausible signs, except that of “married”.
• The estimated model is:

Logwage = 5.593 + 0.068logweeks + 0.050industry – 0.077north + 0.075city –
0.012married – 0.440female + 0.057union + 0.064educ – 0.145black.

Which model is preferred?


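To decide which model is preferable, we use the Hausman test. The null and alternative hypotheses of
the test are:
• H0: The individual effects are uncorrelated with the regressors, so random effects is consistent and efficient (random effects is preferred).
• H1: The individual effects are correlated with the regressors, so only fixed effects is consistent (fixed effects is preferred).
Running hausman fixed random in Stata, after storing the two sets of estimates as fixed and random, gives the results shown in the table
below. The null hypothesis is rejected, hence the preferred model is fixed effects.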
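The full sequence of commands behind the test can be sketched as follows. The stored-estimate names fixed and random are taken from the hausman command quoted in the text; the estimates store steps are the usual way to create them and are assumed here rather than shown in the original output.

```stata
* Fixed effects estimates, stored under the name "fixed"
xtreg logwage logweeks industry north city married female union educ black, fe
estimates store fixed

* Random effects estimates, stored under the name "random"
xtreg logwage logweeks industry north city married female union educ black, re
estimates store random

* Always-consistent estimator (FE) first, efficient-under-H0 estimator (RE) second
hausman fixed random
```

The order of the two names matters: hausman expects the estimator that is consistent under both hypotheses first.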

b) The variable “exper” was transformed into logs and the fixed effects regression
was run in Stata:
xtreg logwage logexper industry north city married female union educ black, fe
The differences from the previous fixed effects model are:
• The R2 is much higher than before (0.492). That is, the explanatory variables explain almost
50% of the variation in the log wage.
• In the previous fixed effects model, a 1% increase in weeks is associated with a 0.052% increase
in the wage, keeping other variables constant. In this model, a 1% increase in experience is
associated with a 0.89% increase in the wage.
• The estimated model is:
Logwage = 4.258 + 0.891logexper + 0.021industry – 0.001north – 0.082city – 0.053married
+ 0.026union.
For the random effects model, the following regression was run:
xtreg logwage logexper industry north city married female union educ black, re
The differences from the previous random effects model are:
• The R2 is lower than in the previous model.
• The coefficient of “logexper” is 0.603: a 1% increase in experience is associated with a
0.60% increase in the hourly wage, keeping other variables constant. The coefficient of
“logweeks” was 0.068: a 1% increase in weeks is associated with a 0.068% increase in the wage,
keeping other variables constant. The wage is therefore more responsive to experience than to
weeks worked.
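Since both the wage and the regressor are in logs, each coefficient is an elasticity. A short worked calculation makes the scale explicit. The 10% increase in experience is an illustrative value chosen here, not a figure from the assignment.

```latex
\[
\beta = \frac{\partial \ln(\text{wage})}{\partial \ln(x)}
\quad\Longrightarrow\quad
\%\Delta\,\text{wage} \approx \beta \times \%\Delta x
\]
\[
\beta_{\text{logexper}} = 0.603,\quad \%\Delta\,\text{exper} = 10\%:
\qquad \%\Delta\,\text{wage} \approx 0.603 \times 10\% = 6.03\%
\]
\[
\text{Exact: } 100\left(1.10^{\,0.603} - 1\right) \approx 5.9\%
\]
```

The approximation and the exact figure are close for small changes and drift apart as the percentage change in x grows.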
To address possible endogeneity in the panel data, the fixed effects regression was run in Stata with
“logexper”, “union” and “educ” as endogenous variables and “exper”, “logweeks” and “weeks” as
instrumental variables. The results of this regression are shown in the table below, and the
estimated model is:
Logwage = 3.189 + 1.283logexper – 0.118union + 0.022industry + 0.002north – 0.060city –
0.043married.
• The R2 is 0.39, which is lower than in the previous fixed effects model that did not account for endogeneity.
• The correlation between ui and Xb is not zero, and the model is significant.
• As before, the three time-invariant variables (educ, female and black) are not estimated.
• The coefficient of “logexper” is significant and the largest of all. A 1% increase in
experience is associated with a 1.28% change in the hourly wage, keeping other variables constant,
which is a plausible direction of effect.
• The coefficients of “union”, “industry”, “north” and “married” are insignificant.
• Since most of the coefficients are insignificant, I personally prefer the
previous fixed effects model to this one.
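The command for this instrumental-variables regression is not shown in the text. Following the xtivreg syntax used in part c), it might have looked like the sketch below. This is a hypothetical reconstruction from the variable names given above, not the command that was actually run.

```stata
* Hypothetical reconstruction: endogenous regressors and instruments as named in the text
* Syntax: xtivreg depvar exogenous_vars (endogenous_vars = instruments), fe
xtivreg logwage industry north city married female black ///
    (logexper union educ = exper logweeks weeks), fe
```

With three endogenous regressors and three excluded instruments, the equation as written would be exactly identified.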

Similarly, the random effects regression with the same endogenous and instrumental
variables gives the results shown in the table below.
The estimated model is:
Logwage = 21.725 + 0.136logexper + 3.595union – 1.274educ – 0.726industry – 0.282north
+ 0.463city – 0.178married + 0.176female – 2.512black.
In the resulting model, all the coefficients, including the constant, are insignificant.
Moreover, the negative coefficients of “educ”, “industry” and “married” are not
plausible. In my opinion, this model is not better than the previous random effects model that did not account for endogeneity.

c) Now, with “logexper” and “union” as endogenous variables and “exper” and
“logweeks” as instrumental variables, the fixed effects regression results are as shown in the
table below.
xtivreg logwage industry north city educ married female black (logexper union = exper
weeks), fe
The estimated model is:
Logwage = 3.33 + 1.2981logexper – 0.687union + 0.047industry + 0.016north - 0.047city –
0.041married.
Compared with the previous fixed effects model, this model’s R2 is lower; the rest of the
results are broadly similar.

And the random effects regression results are shown below:
xtivreg logwage industry north city educ married female black (logexper union = exper
weeks), re
The estimated model is:
Logwage = 6.434 + 0.531logexper – 2.353union + 0.129industry - 0.410north + 0.161city –
0.015married – 0.758female – 0.026educ + 0.030black.
Compared with the previous random effects model, this model has a very low R2, and the
coefficient of “logexper” is significant. The coefficient of “union” (–2.353) implies that union
members earn roughly 90% less per hour than non-members, which does not make sense. Even so, this
model is preferred to the previous random effects model.
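Because “union” is a dummy variable in a log-wage equation, its coefficient translates into a percentage effect through the exponential rather than by multiplying by 100. A short worked calculation, using the coefficient reported in the estimated model above:

```latex
\[
\%\Delta\,\text{wage}
= 100\left(e^{\beta_{\text{union}}} - 1\right)
= 100\left(e^{-2.353} - 1\right)
\approx 100\,(0.095 - 1)
\approx -90.5\%
\]
```

A wage cannot fall by more than 100%, so multiplying a large negative dummy coefficient by 100 overstates the implied drop.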

Q#2
For the panel data model:

a) αi and Zi capture the unobserved effects that are specific to each individual and
constant over time. They can be combined into a single individual effect:
$$\delta_i = \alpha_i + Z_i\gamma$$
Then the panel data model can be rewritten as:
$$y_{it} = x_{it}'\beta + \delta_i + \varepsilon_{it} \qquad \text{(a)}$$
Since n is large, it is not practical to include an intercept dummy variable for each
individual. Instead, average each variable over time for each individual:
$$\bar{y}_i = \bar{x}_i'\beta + \delta_i + \bar{\varepsilon}_i \qquad \text{(b)}$$

Subtracting equation (b) from (a):

$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\delta_i - \delta_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$
$$\tilde{y}_{it} = \tilde{x}_{it}'\beta + \tilde{\varepsilon}_{it} \qquad \text{(c)}$$

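Since δi is eliminated, β can be estimated from equation (c) by OLS. Writing the deviations from individual means as $\tilde{y}_{it}$ and $\tilde{x}_{it}$, the resulting fixed effects (within) estimator is:
$$\hat{\beta}_{FE} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{x}_{it}\tilde{x}_{it}'\right)^{-1} \sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{x}_{it}\tilde{y}_{it}$$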

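The assumptions of the fixed effects model are:

• The idiosyncratic error has conditional mean zero given the regressors in all periods and the individual effect: $E(\varepsilon_{it} \mid X_{i1}, X_{i2}, \dots, X_{iT}, \delta_i) = 0$.
• The variables for one entity are distributed identically to, but independently of, the
variables for another entity. That is, $(X_{i1}, X_{i2}, \dots, X_{iT}, \varepsilon_{i1}, \varepsilon_{i2}, \dots, \varepsilon_{iT})$, i = 1, ..., n, are i.i.d. draws from
their joint distribution.
• Large outliers are unlikely; that is, $(X_{it}, \varepsilon_{it})$ have nonzero finite fourth moments.
• There is no perfect multicollinearity.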
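The within transformation described in part a) can also be carried out by hand in Stata as a check on xtreg, fe. The sketch below assumes PDASS.dta is loaded and xtset ident year has been run. It uses only logwage, logweeks and union from Q#1 to stay short, and the mean_* and dm_* variable names are illustrative.

```stata
* Sketch only: demean each variable by individual, then run OLS on the deviations
foreach v of varlist logwage logweeks union {
    bysort ident: egen mean_`v' = mean(`v')
    generate dm_`v' = `v' - mean_`v'
}

* OLS on the demeaned data, without a constant
regress dm_logwage dm_logweeks dm_union, noconstant

* Built-in fixed effects estimator for comparison
xtreg logwage logweeks union, fe
```

The slope coefficients from the two regressions should coincide. The standard errors from regress will differ because it does not adjust the degrees of freedom for the estimated individual means.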
b) For the random effects model, instead of treating δi as fixed, we assume it is a random
variable with mean δ0 and a random individual-specific error term ui. That is,
$$\delta_i = \delta_0 + u_i$$
Then the panel data model becomes:
$$y_{it} = x_{it}'\beta + \delta_0 + u_i + \varepsilon_{it}$$
$$y_{it} = \delta_0 + x_{it}'\beta + \nu_{it}, \qquad \nu_{it} = u_i + \varepsilon_{it} \qquad \text{(d)}$$
where
• δ0 is the average of all the individuals’ intercepts.
• εit is the idiosyncratic i.i.d. error term.
• ui is the individual-specific error term, which measures the random deviation of each
individual from the common intercept, δ0.

Taking the individual mean of equation (d) and subtracting it from (d) gives:

$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\delta_0 - \delta_0) + (\nu_{it} - \bar{\nu}_i)$$

$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\nu_{it} - \bar{\nu}_i) \qquad \text{(e)}$$

The RE GLS transformation is then constructed by multiplying the individual means by the
GLS parameter λ before subtracting them. That is:
$$y_{it} - \lambda\bar{y}_i = \delta_0(1 - \lambda) + (x_{it} - \lambda\bar{x}_i)'\beta + (\nu_{it} - \lambda\bar{\nu}_i)$$
Then β is estimated by OLS on the transformed equation,
where
• $\lambda = 1 - \sqrt{\sigma^2_{\varepsilon} / (\sigma^2_{\varepsilon} + T\sigma^2_u)}$
• $\sigma^2_{\varepsilon}$ = variance of the idiosyncratic error term.
• $\sigma^2_u$ = variance of the individual-specific error term.
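As a quick check on how λ behaves, the formula can be evaluated at its two extremes and at one illustrative point. The numerical values σ²ε = 1, σ²u = 2 and T = 4 are made up for illustration and are not estimates from the PDASS data.

```latex
\[
\sigma_u^2 = 0 \;\Rightarrow\; \lambda = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2}} = 0
\quad \text{(no transformation: pooled OLS)}
\]
\[
T\sigma_u^2 \to \infty \;\Rightarrow\; \lambda \to 1
\quad \text{(full demeaning: the fixed effects estimator)}
\]
\[
\sigma_\varepsilon^2 = 1,\ \sigma_u^2 = 2,\ T = 4:
\qquad
\lambda = 1 - \sqrt{\frac{1}{1 + 4 \cdot 2}} = 1 - \frac{1}{3} = \frac{2}{3}
\]
```

The random effects estimator therefore sits between pooled OLS and fixed effects, moving toward fixed effects as the individual-specific variance or the number of periods grows.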
The assumptions of the random effects model are:
• Each idiosyncratic error is normally distributed: $\varepsilon_{it} \sim N(0, \sigma^2_{\varepsilon})$.
• The εit are independent random variables.
• The ui are independent random variables with distribution $N(0, \sigma^2_u)$, so that each δi = δ0 + ui has mean δ0.
• The εit and the ui are independent of each other.
• The individual unobserved heterogeneity is uncorrelated with the independent variables.
That is, corr(δi, xit) = 0.
