
Homework: the General Linear Model

We are interested in knowing whether persons having a loyalty card spend
more in the supermarket or not. Take the data in the Eviews file “glshomework.wf1”.
We have (artificial) data for 2 supermarkets, and for each of them we have, for
20 clients, the amount spent (AMOUNT), the size of the household the person
belongs to (HHS), and a binary variable (CARD) indicating whether the
person owns a loyalty card or not. Denote by $y_{ij}$ the amount spent by customer
i in supermarket j, and by $x_{ij}$ the personal characteristics of the client (in
this case, only consisting of HHS). (These are not panel data, since the customers
in the 2 supermarkets are different.) The proposed model is:
\[ y_{ij} = \beta^t x_{ij} + \alpha_j + \delta\,\mathrm{CARD} + \varepsilon_{ij} \]
where $\alpha_j$, for j = 1, 2, are fixed effects for each supermarket. Of main interest
is whether δ is significant or not.

1. Estimate the parameters of the above model by OLS. Briefly interpret the
parameter estimates. (Hint: since the β and δ parameters are supposed to
be the same for the 2 supermarkets, it will be necessary to pool the data.
You will also need to create the appropriate dummy variables STORE1 and
STORE2 yourself to take the fixed effects into account.)
Solution
Estimating the model with OLS yields the following estimates:

Variable       Coefficient   Std. Error   t-Statistic   Prob.
HHS (β)           31.01160     1.969869      15.74298   0.0000
CARD (δ)          9.433054     4.905733      1.922863   0.0624
STORE1 (α1)       11.23020     8.110628      1.384628   0.1747
STORE2 (α2)      -2.138706     7.939190     -0.269386   0.7892

As can be seen, the only variable that has a clearly significant effect on
the amount spent is the size of the customer’s household, where each
additional member yields an average increase of 31.01 currency units in
the amount spent (given that all other variables remain fixed).
However, the possession of a loyalty card seems to have an effect as well,
which, although not significant at the 5% level, should not be dismissed
out of hand (p = 0.0624). It appears that customers with a loyalty card tend
to spend, on average, 9.43 currency units more than customers without a
loyalty card (assuming the other variables are the same). □
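The pooled regression with store dummies can be sketched outside Eviews as follows. This is only an illustration in Python with NumPy: the data are synthetic stand-ins for “glshomework.wf1” (every number in the data-generating lines is invented), and the helper name `ols` is hypothetical, so the printed estimates will not match the table above.

```python
import numpy as np

# Synthetic stand-in for glshomework.wf1 (all numbers below are invented).
rng = np.random.default_rng(0)
n = 20                                    # clients per supermarket
store = np.repeat([1, 2], n)              # supermarket index j
hhs = rng.integers(1, 7, size=2 * n)      # household size (HHS)
card = rng.integers(0, 2, size=2 * n)     # 1 if the client owns a loyalty card
sigma = np.where(store == 1, 10.0, 18.0)  # assumed groupwise error s.d.
amount = (30 * hhs + 10 * card + np.where(store == 1, 10, -2)
          + sigma * rng.normal(size=2 * n))

# Pooled design matrix: HHS, CARD, STORE1, STORE2. There is no separate
# intercept, because STORE1 + STORE2 = 1 would make it collinear.
store1 = (store == 1).astype(float)
store2 = (store == 2).astype(float)
X = np.column_stack([hhs, card, store1, store2])


def ols(X, y):
    """Return OLS coefficients, standard errors and residuals."""
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    s2 = resid @ resid / (len(y) - X.shape[1])  # unbiased error variance
    se = np.sqrt(s2 * np.diag(XtX_inv))
    return coef, se, resid


coef, se, resid = ols(X, amount)
for name, b, s in zip(["HHS", "CARD", "STORE1", "STORE2"], coef, se):
    print(f"{name:7s} {b:10.4f} {s:10.4f} {b / s:10.4f}")
```

Each printed row holds the coefficient, its standard error and the t-statistic, in the same order as the Eviews output.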
2. We are afraid that there is groupwise heteroscedasticity in the error terms,
i.e. $\mathrm{Var}(\varepsilon_{ij}) = \sigma_j^2$ for j = 1, 2.
(a) Estimate the variances $\sigma_1^2$ and $\sigma_2^2$ using the OLS-residuals.
(b) Estimate now the parameters by GLS. Write down an expression for
Σ, the covariance matrix of the error terms, and show that GLS boils
down to Weighted Least Squares (WLS) here. Create the series of
weights to be used, and carry out the WLS estimation (in Eviews,
take estimation method LS with Weighted LS).
(c) What is the advantage of a WLS instead of OLS?
Solution
(a) From the OLS-residuals it is found that

\[ \hat\sigma_1^2 = 93.052, \qquad \hat\sigma_2^2 = 327.002 \]

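(b) From these estimates of the groupwise residual variances, an expression
for $\hat\Sigma$ is found,

\[ \hat\Sigma = \begin{pmatrix} \hat\sigma_1^2 I_{20} & 0 \\ 0 & \hat\sigma_2^2 I_{20} \end{pmatrix} \]

where $I_n$ is the n × n identity matrix. From this, the GLS-estimator
can be computed:

\[ \hat\theta_{GLS} = (X^t \hat\Sigma^{-1} X)^{-1} (X^t \hat\Sigma^{-1} Y) \]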

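where

\[ X = \begin{pmatrix}
\mathrm{HHS}_{1,1} & \mathrm{CARD}_{1,1} & \mathrm{STORE1}_{1,1} & \mathrm{STORE2}_{1,1} \\
\vdots & \vdots & \vdots & \vdots \\
\mathrm{HHS}_{20,1} & \mathrm{CARD}_{20,1} & \mathrm{STORE1}_{20,1} & \mathrm{STORE2}_{20,1} \\
\mathrm{HHS}_{1,2} & \mathrm{CARD}_{1,2} & \mathrm{STORE1}_{1,2} & \mathrm{STORE2}_{1,2} \\
\vdots & \vdots & \vdots & \vdots \\
\mathrm{HHS}_{20,2} & \mathrm{CARD}_{20,2} & \mathrm{STORE1}_{20,2} & \mathrm{STORE2}_{20,2}
\end{pmatrix} \]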

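and $Y = (y_{1,1}, \ldots, y_{20,1}, y_{1,2}, \ldots, y_{20,2})^t$, $\theta = (\beta, \delta, \alpha_1, \alpha_2)^t$. Working
out the expression for the estimator, it is found that, due to the form
of $\hat\Sigma^{-1}$,

\begin{align*}
\hat\theta_{GLS} &= (X^t \hat\Sigma^{-1} X)^{-1} (X^t \hat\Sigma^{-1} Y) \\
&= \left( X^t \begin{pmatrix} \hat\sigma_1^2 I_{20} & 0 \\ 0 & \hat\sigma_2^2 I_{20} \end{pmatrix}^{-1} X \right)^{-1}
   \times \left( X^t \begin{pmatrix} \hat\sigma_1^2 I_{20} & 0 \\ 0 & \hat\sigma_2^2 I_{20} \end{pmatrix}^{-1} Y \right) \\
&= \left( X^t \begin{pmatrix} \frac{1}{\hat\sigma_1^2} I_{20} & 0 \\ 0 & \frac{1}{\hat\sigma_2^2} I_{20} \end{pmatrix} X \right)^{-1}
   \times \left( X^t \begin{pmatrix} \frac{1}{\hat\sigma_1^2} I_{20} & 0 \\ 0 & \frac{1}{\hat\sigma_2^2} I_{20} \end{pmatrix} Y \right) \\
&= \left( \sum_{i=1}^{20} X_{i,1,\cdot}^t \frac{1}{\hat\sigma_1^2} X_{i,1,\cdot} + \sum_{i=1}^{20} X_{i,2,\cdot}^t \frac{1}{\hat\sigma_2^2} X_{i,2,\cdot} \right)^{-1}
   \times \left( \sum_{i=1}^{20} X_{i,1,\cdot}^t \frac{1}{\hat\sigma_1^2} y_{i,1} + \sum_{i=1}^{20} X_{i,2,\cdot}^t \frac{1}{\hat\sigma_2^2} y_{i,2} \right) \\
&= \left( \sum_{j=1,2} \sum_{i=1}^{20} w_{i,j} X_{i,j,\cdot}^t X_{i,j,\cdot} \right)^{-1}
   \left( \sum_{j=1,2} \sum_{i=1}^{20} w_{i,j} X_{i,j,\cdot}^t y_{i,j} \right)
\end{align*}

where $w_{i,1} = 1/\hat\sigma_1^2$ and $w_{i,2} = 1/\hat\sigma_2^2$ for i = 1, . . . , 20. As such, the
GLS problem has been reduced to a WLS problem. Using Eviews to
estimate the parameters yields the following: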
Variable       Coefficient   Std. Error   t-Statistic   Prob.
HHS (β)           31.08443     1.533161      20.27473   0.0000
CARD (δ)          17.62158     3.730573      4.723558   0.0000
STORE1 (α1)       7.305034     5.943001      1.229183   0.2270
STORE2 (α2)      -6.056593     6.676884     -0.907099   0.3704
The size of the household is still a very significant variable, but,
as can be seen, the possession of a loyalty card has also become
a very significant influencing factor on the amount of money spent
by a customer, with a customer having such a card spending, on
average, 17.62 currency units more than a customer without a loyalty
card (assuming the other variables are the same).
(c) OLS assigned the same weight to the observations of supermarket 1
as to the observations of supermarket 2, whereas the observations of
supermarket 2 are less informative than those of supermarket 1 because
of their higher error variance. This caused the effect of owning a
loyalty card to be partially masked. WLS down-weights the noisier
observations and is therefore more efficient than OLS under groupwise
heteroscedasticity.

□
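The two steps of this solution, estimating the groupwise variances from the OLS residuals and then showing that GLS with a block-diagonal covariance matrix coincides with WLS, can be checked numerically. The sketch below uses Python with NumPy on synthetic data (all data-generating numbers are invented). Dividing the squared residuals by the group size n_j is an assumption; the text does not say which divisor was used.

```python
import numpy as np

# Synthetic stand-in for the homework data (all numbers are invented).
rng = np.random.default_rng(1)
n = 20
store = np.repeat([1, 2], n)
hhs = rng.integers(1, 7, size=2 * n)
card = rng.integers(0, 2, size=2 * n)
sigma = np.where(store == 1, 10.0, 18.0)  # assumed groupwise error s.d.
y = (30 * hhs + 10 * card + np.where(store == 1, 10, -2)
     + sigma * rng.normal(size=2 * n))
X = np.column_stack([hhs, card, store == 1, store == 2]).astype(float)

# Step (a): groupwise variances from the OLS residuals.
theta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ theta_ols
# Assumption: divide by the group size n_j (other divisors are possible).
s2 = np.array([np.mean(resid[store == j] ** 2) for j in (1, 2)])

# Step (b), explicit GLS with Sigma-hat = diag(s2_1 * I_20, s2_2 * I_20).
Sigma_inv = np.diag(1.0 / s2[store - 1])
theta_gls = np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ y)

# Step (b), WLS: plain OLS after dividing every row by sigma-hat_j,
# which puts weight w_ij = 1 / s2_j on each squared residual.
scale = 1.0 / np.sqrt(s2[store - 1])
theta_wls = np.linalg.lstsq(X * scale[:, None], y * scale, rcond=None)[0]

print("estimated variances:", s2)
print("GLS:", theta_gls)
print("WLS:", theta_wls)
```

The last two printed vectors agree up to rounding, which is the numerical counterpart of the derivation above. When reproducing this in a statistics package, it is worth checking whether the weight series is expected as an inverse standard deviation or as an inverse variance.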
3. We are afraid that there might be interaction between the variable CARD
and the supermarket. In particular, the effect of the loyalty card might
differ among different supermarkets. The model now becomes

\[ y_{ij} = \beta^t x_{ij} + \alpha_j + \delta_j\,\mathrm{CARD} + \varepsilon_{ij} \]

(a) Estimate the above model by OLS. Do you think there might be
interaction? (Hint: creating the variables STORE1 × CARD and
STORE2 × CARD might be useful)
(b) Test whether the interaction is significant or not.

Solution
(a) Estimating the model by OLS yields the following output:
Variable       Coefficient   Std. Error   t-Statistic   Prob.
HHS               29.64042     1.736519      17.06887   0.0000
STORE1*CARD       23.89327     5.767522      4.142727   0.0002
STORE2*CARD      -6.827699     6.110978     -1.117284   0.2715
STORE1            9.247987     7.004234      1.320342   0.1953
STORE2            9.566397     7.538886      1.268940   0.2128
The household size is still a highly significant variable, and owning
a loyalty card still has a significant effect on the amount purchased,
but only for supermarket 1! For supermarket 2, there is only a
non-significant, negative effect. This suggests that there is an
interaction between the supermarket and owning a loyalty card in their
effect on the amount of money spent.
(b) Testing whether the interaction is significant or not amounts to
performing a Wald test for:
\[ H_0 : \delta_1 = \delta_2 \]
The test returns a Chi-square value of 13.55827, and a p-value of
0.000231. As such, there is a strongly significant interaction between
loyalty card and supermarket for the effect on amount spent.
□
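A minimal sketch of the interaction model and the Wald test for a single restriction, in Python with NumPy. The data are again synthetic (all data-generating numbers are invented) and `wald_equal` is a hypothetical helper name. For one restriction the statistic is the squared difference of the two estimates divided by the estimated variance of that difference, and its chi-square(1) p-value can be obtained from the complementary error function.

```python
import math
import numpy as np


def wald_equal(coef, cov, i, j):
    """Wald statistic and chi-square(1) p-value for H0: coef[i] = coef[j]."""
    diff = coef[i] - coef[j]
    var_diff = cov[i, i] + cov[j, j] - 2 * cov[i, j]
    stat = diff ** 2 / var_diff
    # For 1 d.f.: P(chi2 > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Synthetic stand-in for the homework data (all numbers are invented).
rng = np.random.default_rng(2)
n = 20
store = np.repeat([1, 2], n)
hhs = rng.integers(1, 7, size=2 * n)
card = rng.integers(0, 2, size=2 * n)
delta = np.where(store == 1, 24.0, -7.0)  # assumed store-specific card effects
y = 30 * hhs + delta * card + 9 + 12 * rng.normal(size=2 * n)

# Columns: HHS, STORE1*CARD, STORE2*CARD, STORE1, STORE2.
s1, s2 = store == 1, store == 2
X = np.column_stack([hhs, s1 * card, s2 * card, s1, s2]).astype(float)
XtX_inv = np.linalg.inv(X.T @ X)
coef = XtX_inv @ X.T @ y
resid = y - X @ coef
cov = resid @ resid / (len(y) - X.shape[1]) * XtX_inv  # OLS covariance matrix

stat, p = wald_equal(coef, cov, 1, 2)
print(f"Wald statistic {stat:.4f}, p-value {p:.6f}")

# Cross-check of the p-value for the statistic reported in the text;
# it should be close to the reported 0.000231.
print(math.erfc(math.sqrt(13.55827 / 2)))
```

The final line only uses the reported Chi-square value, so it is independent of the synthetic data.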

4. Economists would say that there is a serious endogeneity problem here.
There probably exists a feedback relation from AMOUNT to CARD.
Could you explain why this might be the case? Explain in words why
it might indeed be that the error terms are correlated with the variable
CARD.
Solution
Most likely, the loyalty card will mostly be promoted to customers who
spend a lot but don’t have a card yet. Since the benefits of the card are
clearest for those big spenders, they will be more inclined to get one
for themselves.
Conversely, for people already owning a card, the benefits of it are higher
if they spend more, so they might be inclined to naturally spend more
than a person without a card, leading to a correlation between owning a
loyalty card (CARD) and the error term in the model. □
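The consequence of such a feedback can be illustrated with a small simulation in Python with NumPy. Everything here is an assumption made for illustration: the data are simulated, the card has no causal effect at all (`true_delta = 0`), card ownership is made more likely for clients with a high unobserved propensity to spend, and a single intercept replaces the two store dummies to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000                       # large sample, so the bias is not sampling noise
true_delta = 0.0               # by construction the card has NO causal effect
hhs = rng.integers(1, 7, size=n)
eps = 15 * rng.normal(size=n)  # unobserved propensity to spend (error term)

# Feedback: clients with a high unobserved propensity to spend are more
# likely to own a card, so CARD is correlated with the error term.
card = (eps + 15 * rng.normal(size=n) > 0).astype(float)
y = 30 * hhs + true_delta * card + 5 + eps

# OLS of AMOUNT on HHS, CARD and an intercept.
X = np.column_stack([hhs, card, np.ones(n)])
coef = np.linalg.lstsq(X, y, rcond=None)[0]

print("correlation between CARD and the error:", np.corrcoef(card, eps)[0, 1])
print("OLS estimate of delta:", coef[1])  # comes out positive although true_delta = 0
```

In this setup OLS attributes to the card a spending difference that is entirely due to who chooses to hold one, which is the endogeneity problem described in words above.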

Remark: because the number of observations was the same for both groups,
this problem could also have been solved like a panel data problem. However,
this is NOT true in general, as different groups may contain a different number
of observations.
