Dummy Variables
Frequently in regression analysis we encounter factors that may well influence the
dependent variable but which we are unable to quantify in any meaningful way. Let us
suppose we were estimating a demand equation for alcoholic drink from quarterly data. We
might specify an equation of the form
Q = β1 + β2E + β3P    (1)
where Q is quantity demanded, E is the total real expenditure of consumers and P is the
relative price of alcohol. Since we are using quarterly data, we suddenly realize that, for
obvious reasons, expenditure on alcohol tends to be largest during the festive fourth quarter
of each year. But how do we allow for a non‐quantifiable factor such as this?
One way of allowing for qualitative factors is to codify them by assigning numerical
values to different circumstances. In the above case we could define a dummy variable D,
where D takes the value one during the fourth quarter of each year but the value zero during
all the other quarters. Instead of estimating (1), we could then estimate
Q = β1 + α1D + β2E + β3P    (2)
The variable D is treated just like any other variable in the equation – it just happens to take
the unusual values
0,0,0,1,0,0,0,1,0,0,0,1,0....
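As a minimal sketch of how such a variable could be built, the Python snippet below generates D for a run of quarterly observations. The function name fourth_quarter_dummy and its first_quarter argument are illustrative assumptions, not part of the original text; the sketch also assumes the observations are in time order with no gaps.

```python
import numpy as np

def fourth_quarter_dummy(n_obs, first_quarter=1):
    """Return D: 1 for fourth-quarter observations, 0 otherwise.

    Hypothetical helper: assumes observations are consecutive quarters
    and that the first observation falls in quarter `first_quarter` (1..4).
    """
    # Quarter number (1..4) of each observation, in time order.
    quarters = (np.arange(n_obs) + first_quarter - 1) % 4 + 1
    return (quarters == 4).astype(int)

# Three years of quarterly data starting in a first quarter.
D = fourth_quarter_dummy(12)
print(D.tolist())  # [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]
```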
To interpret the parameter α1 in (2), note that for the fourth quarter, when D = 1, the
equation implies
Q = (β1 + α1) + β2E + β3P    (3)
However, during all the other quarters, when D = 0, (2) implies
Q = β1 + β2E + β3P    (3a)
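Figure 1. A parallel shift in the population regression line. [Q is plotted against E: the
fourth-quarter line runs parallel to the line for the other quarters, shifted vertically by α1.]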
Thus α1 measures the change in the intercept parameter that occurs in the fourth quarter. A
positive value (as expected) measures, for given values of E and P, the increase in the mean
quantity of alcohol demanded that occurs during the fourth quarter. The situation is illustrated in
Figure 1, where we have abstracted from the price variable. Equation (2) implies a parallel
shift in the population regression equation during the fourth quarter.
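The parallel shift can be checked numerically. The sketch below evaluates Equation (2) with D = 1 and with D = 0 at two levels of E; the parameter values and the function name mean_demand are purely hypothetical, chosen only for illustration. Whatever the level of E, the gap between the two lines equals α1.

```python
def mean_demand(E, P, D, beta1, beta2, beta3, alpha1):
    """Equation (2): mean demand for given E, P and dummy value D."""
    return beta1 + alpha1 * D + beta2 * E + beta3 * P

# Hypothetical parameter values, not estimates from any data.
params = dict(beta1=10.0, beta2=0.5, beta3=-2.0, alpha1=3.0)

gaps = []
for E in (100.0, 200.0):
    fourth = mean_demand(E, 1.5, 1, **params)   # fourth quarter, D = 1
    other = mean_demand(E, 1.5, 0, **params)    # other quarters, D = 0
    gaps.append(fourth - other)
    print(E, fourth - other)  # the gap is alpha1 (3.0) at every E
```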
Note that to estimate α1, we need only estimate Equation (2), using data for all quarters. We
can test the hypothesis that α1 = 0 (i.e. that demand in the fourth quarter does not differ
from demand in the other quarters) by examining the confidence interval, t statistic or
p-value of its estimate in the usual way.
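A rough illustration of this procedure on simulated data is given below. Everything in it is an assumption made for the example: the "true" parameter values, the sample of 80 quarters, the ranges of E and P, and the use of ordinary least squares via NumPy. The critical value 1.99 is an approximate two-sided 5% value of the t distribution with 76 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 80  # 20 years of simulated quarterly observations

# Regressors: fourth-quarter dummy, real expenditure, relative price.
D = (np.arange(n) % 4 == 3).astype(float)
E = rng.uniform(100, 200, n)
P = rng.uniform(0.8, 1.2, n)

# Simulated demand from Equation (2) plus a random disturbance.
# The parameter values (10, 3, 0.5, -2) are hypothetical.
Q = 10 + 3 * D + 0.5 * E - 2 * P + rng.normal(0, 1, n)

# OLS on data for all quarters; columns are intercept, D, E, P.
X = np.column_stack([np.ones(n), D, E, P])
coef, *_ = np.linalg.lstsq(X, Q, rcond=None)

# Conventional OLS standard errors.
resid = Q - X @ coef
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

alpha1_hat, se_alpha1 = coef[1], se[1]
t_stat = alpha1_hat / se_alpha1          # tests alpha1 = 0
t_crit = 1.99                            # approximate, 76 d.f., 5% two-sided
ci = (alpha1_hat - t_crit * se_alpha1, alpha1_hat + t_crit * se_alpha1)

print("alpha1 estimate:", alpha1_hat)
print("t statistic:", t_stat)
print("approx. 95% confidence interval:", ci)
```

If the absolute t statistic exceeds the critical value (equivalently, if the confidence interval excludes zero), the hypothesis α1 = 0 is rejected at the 5% level.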
Source:
Thomas, R.L., Modern Econometrics – an introduction, Addison Wesley Longman, Harlow,
1997, pp. 260‐261