Session 6
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 = 97788$
The estimated mean salaries of two groups are as follows:
. table gender, c(mean salary n salary)
gender   mean(salary)   N(salary)
0        76188.9        38
1        97788.2        62
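The two group means map directly onto a regression of salary on the gender dummy: the intercept equals the mean of the group coded 0, and the dummy coefficient equals the difference between the two means. A minimal sketch of that arithmetic, using only the means reported above (the variable names are illustrative, not from the original):

```python
# Group means taken from the table above (gender = 0 and gender = 1).
mean_group0 = 76188.9
mean_group1 = 97788.2

# In Salary = b0 + b1*gender + u, OLS reproduces the two group means:
# b0 is the mean of the benchmark group (gender = 0),
# b1 is the differential intercept (mean of group 1 minus mean of group 0).
b0 = mean_group0
b1 = mean_group1 - mean_group0

print(f"intercept b0 = {b0:.1f}")
print(f"differential intercept b1 = {b1:.1f}")
print(f"b0 + b1 = {b0 + b1:.1f}")
```

The last line recovers the mean of the group coded 1, which is why the fitted value for that group is the sum of the two coefficients.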
09/20/2022 Department of Business Economics 4
Things to Remember on Dummy Variables
If a qualitative variable has m categories, introduce only (m − 1) dummy variables. In other
words, for each qualitative regressor in a model that includes an intercept, the number of
dummy variables introduced must be one less than the number of categories of that variable.
If you do not follow this rule, you will fall into what is called the dummy variable trap,
that is, the situation of perfect collinearity or perfect multicollinearity.
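The trap can be seen numerically: with an intercept and all m dummies, the dummy columns add up to the intercept column, so X'X is singular and OLS has no unique solution. The sketch below is only an illustration; the sample data and the helper `gram_determinant` are hypothetical, and exact fractions are used so that the zero determinant is not blurred by rounding.

```python
from fractions import Fraction

def gram_determinant(columns):
    """Determinant of X'X for a design matrix given as a list of columns."""
    k = len(columns)
    # X'X: inner products between every pair of columns (exact arithmetic).
    m = [[Fraction(sum(a * b for a, b in zip(ci, cj))) for cj in columns]
         for ci in columns]
    det = Fraction(1)
    for col in range(k):
        # Find a row with a non-zero pivot in this column.
        pivot = next((r for r in range(col, k) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular: perfect collinearity
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det

# Hypothetical sample: a qualitative variable with m = 3 categories.
category = [1, 2, 3, 1, 2, 3, 3]
intercept = [1] * len(category)
d1 = [int(c == 1) for c in category]
d2 = [int(c == 2) for c in category]
d3 = [int(c == 3) for c in category]

# Intercept plus all m dummies: d1 + d2 + d3 equals the intercept column.
print(gram_determinant([intercept, d1, d2, d3]))
# Intercept plus m - 1 dummies: category 1 acts as the benchmark.
print(gram_determinant([intercept, d2, d3]))
```

The first determinant is zero (the trap), while the second is non-zero, so the coefficients are identified once one category is left out.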
The category for which no dummy variable is assigned is known as the base, benchmark,
control, comparison, reference, or omitted category. And all comparisons are made in
relation to the benchmark category.
The intercept value represents the mean value of the benchmark category. The coefficients
attached to the dummy variables are known as the differential intercept coefficients
because they tell by how much the value of the category that receives the value of 1
differs from the intercept coefficient of the benchmark category. If a qualitative variable
has more than one category, the choice of the benchmark category is strictly up to the
researcher.
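To see that the choice of benchmark changes only how the results are expressed, the two-group salary example can be rewritten with the group coded 1 as the benchmark. This is a worked sketch using the group means reported earlier in these slides; the recoded dummy $D^{*}$ and the symbols $\gamma_0, \gamma_1$ are introduced here for illustration only.

```latex
% Original coding: gender = 0 is the benchmark
\hat{\beta}_0 = 76188.9, \qquad \hat{\beta}_1 = 97788.2 - 76188.9 = 21599.3

% Recoded dummy D^* = 1 - gender: gender = 1 becomes the benchmark
\widehat{Salary}_i = \hat{\gamma}_0 + \hat{\gamma}_1 D^{*}_i, \qquad
\hat{\gamma}_0 = 97788.2, \qquad \hat{\gamma}_1 = 76188.9 - 97788.2 = -21599.3
```

The fitted group means are identical under both codings; only the intercept and the sign of the differential intercept coefficient change.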
ANOVA Models with Two Qualitative Variables
(gender and region)
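$Salary_i = \beta_0 + \beta_1\, gender_i + \beta_2 R_{2i} + \beta_3 R_{3i} + \beta_4 R_{4i} + \beta_5 R_{5i} + u_i$
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 + \hat{\beta}_5$
For Colombo, Female
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_5$
For Colombo, Male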
. reg salary gender i.Region
Region   Coef.      Std. Err.   t      P>|t|   [95% Conf. Interval]
2        5148.403   8424.404    0.61   0.543   -11578.45   21875.25
3        1618.949   7293.588    0.22   0.825   -12862.64   16100.54
4        10747.27   8016.466    1.34   0.183   -5169.611   26664.15
5        29455.45   10345.99    2.85   0.005    8913.244   49997.66
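Only Regions 2 to 5 appear in the output, so Region 1 serves as the benchmark and each coefficient is a differential relative to it. Assuming the columns follow the usual regression-output order (coefficient, standard error, t, p-value, confidence interval), each t statistic should equal the coefficient divided by its standard error. The sketch below checks this with the numbers copied from the output; the dictionary layout is just a convenience.

```python
# (coefficient, standard error, reported t) for Regions 2-5, copied from the output above.
rows = {
    2: (5148.403, 8424.404, 0.61),
    3: (1618.949, 7293.588, 0.22),
    4: (10747.27, 8016.466, 1.34),
    5: (29455.45, 10345.99, 2.85),
}

# t statistic = coefficient / standard error, rounded as in the output.
t_stats = {region: round(coef / se, 2) for region, (coef, se, _) in rows.items()}

for region, (coef, se, reported_t) in rows.items():
    print(f"Region {region}: computed t = {t_stats[region]}, reported t = {reported_t}")
```

Only the Region 5 differential has a p-value below 0.05 (0.005); the differentials for Regions 2 to 4 are not statistically distinguishable from zero at that level.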
1. Both the intercept and the slope coefficients are the same in the two
regressions. This is the case of coincident regressions.
2. Only the intercepts in the two regressions are different but the slopes are the
same. This is the case of parallel regressions.
3. The intercepts in the two regressions are the same, but the slopes are
different. This is the case of concurrent regressions.
4. Both the intercepts and slopes in the two regressions are different. This is
the case of dissimilar regressions.
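The four cases above depend on just two yes/no questions: do the intercepts differ, and do the slopes differ? A small sketch of that decision table follows; the function name `classify_regressions` is hypothetical and the inputs are assumed to come from whatever test the analyst uses to decide whether a difference is present.

```python
def classify_regressions(intercepts_differ: bool, slopes_differ: bool) -> str:
    """Map the two comparisons onto the four cases listed above."""
    if not intercepts_differ and not slopes_differ:
        return "coincident"
    if intercepts_differ and not slopes_differ:
        return "parallel"
    if not intercepts_differ and slopes_differ:
        return "concurrent"
    return "dissimilar"

# Enumerate all four combinations.
for intercepts_differ in (False, True):
    for slopes_differ in (False, True):
        label = classify_regressions(intercepts_differ, slopes_differ)
        print(intercepts_differ, slopes_differ, label)
```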
Suppose you are interested in the following function:
$Y_i = \alpha_1 + \alpha_2 D_{1i} + \beta_1 X_{1i} + \beta_2 (D_{1i} X_{1i}) + u_i$
Y = Consumption
X = Gross domestic product
D = Dummy variable
If D = 1, the observation belongs to the period before 1977
If D = 0, the observation belongs to the period after 1977
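Taking conditional expectations for each period shows how the dummy shifts both the intercept and the slope. The Greek letters below are an assumed reading of the model (an intercept term, a dummy term, a slope term, and a dummy-times-X term); the model is restated in the first line so the derivation stands on its own.

```latex
% Assumed form of the model
Y_i = \alpha_1 + \alpha_2 D_{1i} + \beta_1 X_{1i} + \beta_2 (D_{1i} X_{1i}) + u_i

% Period after 1977 (D = 0)
E(Y_i \mid D_{1i} = 0, X_{1i}) = \alpha_1 + \beta_1 X_{1i}

% Period before 1977 (D = 1)
E(Y_i \mid D_{1i} = 1, X_{1i}) = (\alpha_1 + \alpha_2) + (\beta_1 + \beta_2) X_{1i}
```

Here $\alpha_2$ is the differential intercept and $\beta_2$ the differential slope. If both are zero the two period regressions coincide; if only $\alpha_2 \neq 0$ they are parallel; if only $\beta_2 \neq 0$ they are concurrent; if both are non-zero they are dissimilar.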
$Y_i = \alpha_1 + \alpha_2 D_{1i} + \alpha_3 D_{2i} + \beta_1 X_i + u_i$
Implicit in this model is the assumption that the differential effect of the gender
dummy D1 is constant across the two categories of race and the differential effect of
the race dummy D2 is also constant across the two sexes.
That is to say, if the mean salary is higher for males than for females, this is so
whether they are Sinhalese or not. Likewise, if, say, Sinhalese workers have lower mean
wages, this is so whether they are female or male.
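The constancy assumption can be read off the four conditional means of the additive model. The sketch below assumes the model has an intercept, one coefficient per dummy, and a single slope on X; the Greek letters are an assumed notation and the model is restated in the first line.

```latex
% Assumed form of the additive model
Y_i = \alpha_1 + \alpha_2 D_{1i} + \alpha_3 D_{2i} + \beta_1 X_i + u_i

E(Y_i \mid D_{1i}=0, D_{2i}=0, X_i) = \alpha_1 + \beta_1 X_i
E(Y_i \mid D_{1i}=1, D_{2i}=0, X_i) = \alpha_1 + \alpha_2 + \beta_1 X_i
E(Y_i \mid D_{1i}=0, D_{2i}=1, X_i) = \alpha_1 + \alpha_3 + \beta_1 X_i
E(Y_i \mid D_{1i}=1, D_{2i}=1, X_i) = \alpha_1 + \alpha_2 + \alpha_3 + \beta_1 X_i
```

Switching $D_1$ from 0 to 1 changes the mean by $\alpha_2$ in both rows where $D_2$ is held fixed, and switching $D_2$ changes it by $\alpha_3$ regardless of $D_1$. That is exactly the sense in which each differential effect is constant across the categories of the other dummy.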
Interaction Effects Using Dummy Variables
$Y_i = \alpha_1 + \alpha_2 D_{1i} + \alpha_3 D_{2i} + \alpha_4 (D_{1i} D_{2i}) + \beta_1 X_i + u_i$
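With the product term included, the differential effect of one dummy is no longer constant: it depends on the value of the other dummy. The sketch below evaluates the conditional mean of a model with an intercept, two dummies, their product, and one quantitative regressor. All coefficient values and the function name `expected_y` are hypothetical, chosen only to make the arithmetic visible.

```python
def expected_y(d1, d2, x, a1, a2, a3, a4, b1):
    """E(Y | D1, D2, X) for Y = a1 + a2*D1 + a3*D2 + a4*D1*D2 + b1*X + u."""
    return a1 + a2 * d1 + a3 * d2 + a4 * d1 * d2 + b1 * x

# Hypothetical coefficient values, not estimates from the slides.
coef = dict(a1=10.0, a2=4.0, a3=-2.0, a4=3.0, b1=1.5)
x = 12  # hypothetical value of the quantitative regressor

gaps = {}
for d2 in (0, 1):
    # Effect of switching D1 from 0 to 1 while holding D2 and X fixed.
    gaps[d2] = expected_y(1, d2, x, **coef) - expected_y(0, d2, x, **coef)
    print(f"D2 = {d2}: differential effect of D1 = {gaps[d2]}")
```

The gap equals a2 when D2 = 0 and a2 + a4 when D2 = 1, so a4 measures how much the effect of the first dummy changes across the categories of the second.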
Dummy Variables with Interaction
. reg salary education gender sinhalese gender#sinhalese
note: 1.gender#0b.sinhalese omitted because of collinearity
note: 1.gender#1.sinhalese omitted because of collinearity
gender#sinhalese   Coef.      Std. Err.   t      P>|t|   [95% Conf. Interval]
0 1                4206.103   11130.19    0.38   0.706   -17890.12   26302.32
1 0                0          (omitted)
1 1                0          (omitted)
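The two omitted terms follow from the way the cell indicators relate to the main-effect dummies already in the command. The derivation below is a sketch under the assumption that gender#sinhalese expands into one indicator per (gender, sinhalese) cell with the (0, 0) cell as the base; the symbols $c_{gs}$, $a$, $b$ and $c$ are notation introduced here.

```latex
% Cell indicators: c_{gs} = 1 if gender = g and sinhalese = s, else 0
gender_i    = c_{10,i} + c_{11,i}
sinhalese_i = c_{01,i} + c_{11,i}

% Given gender, sinhalese and c_{01}, the other two cells are redundant:
c_{11,i} = sinhalese_i - c_{01,i}, \qquad
c_{10,i} = gender_i - sinhalese_i + c_{01,i}

% Retained term rewritten with the product of the two dummies:
c_{01,i} = sinhalese_i \,(1 - gender_i)
\;\Rightarrow\;
a \cdot gender_i + b \cdot sinhalese_i + c \cdot c_{01,i}
  = a \cdot gender_i + (b + c)\, sinhalese_i - c\,(gender_i \cdot sinhalese_i)
```

Under this reading, the reported coefficient 4206.103 on the (0, 1) cell corresponds to the negative of the coefficient on the product of the two dummies in the interaction model, and the coefficient on sinhalese in this output is the Sinhalese differential for gender = 1 rather than for gender = 0.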