
ECO 8101: ECONOMETRICS

Session 6

Dr. Janaka Fernando


Ph.D. (Nagoya), Mecon (Colombo) Master of IDS (GRIPS), B.Sc. (USJ)
(pjsampath@sjp.ac.lk /0714191996)
Senior Lecturer
Department of Business Economics
Faculty of Management Studies and Commerce
University of Sri Jayewardenepura
Lecture Outline
• Dummy Variable Regression

09/20/2022 Department of Business Economics 2


ANOVA Model
Suppose we are interested in the following model:
$\text{salary}_i = \beta_0 + \beta_1\,\text{gender}_i + u_i$
where gender = 0 for male and gender = 1 for female.
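. reg salary gender

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =     14.99
       Model |  1.0991e+10         1  1.0991e+10   Prob > F        =    0.0002
    Residual |  7.1858e+10        98   733243513   R-squared       =    0.1327
-------------+----------------------------------   Adj R-squared   =    0.1238
       Total |  8.2849e+10        99   836861449   Root MSE        =     27078

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   21599.28   5578.744     3.87   0.000     10528.44    32670.12
       _cons |   76188.95   4392.708    17.34   0.000     67471.76    84906.13
------------------------------------------------------------------------------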
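The summary statistics in the header of this output can be recomputed from the sums of squares and degrees of freedom. The following is a minimal sketch using only the numbers printed above (so small rounding differences are expected); the variable names are illustrative.

```python
# Values copied from the Stata output above (rounded as printed).
ss_model, df_model = 1.0991e10, 1
ss_resid, df_resid = 7.1858e10, 98

ms_model = ss_model / df_model           # mean square of the model
ms_resid = ss_resid / df_resid           # mean square of the residual
f_stat = ms_model / ms_resid             # F(1, 98)
r_squared = ss_model / (ss_model + ss_resid)
adj_r_squared = 1 - (1 - r_squared) * (df_model + df_resid) / df_resid
root_mse = ms_resid ** 0.5

# t ratio of the gender coefficient: coefficient / standard error.
t_gender = 21599.28 / 5578.744

print(round(f_stat, 2), round(r_squared, 4), round(adj_r_squared, 4))
print(round(root_mse), round(t_gender, 2), round(t_gender ** 2, 2))
```

With a single regressor, the squared t ratio of the slope equals the F statistic, which is why both lead to the same conclusion about gender here.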



The estimated model for males (gender = 0) is as follows:
$\widehat{\text{Salary}} = \hat{\beta}_0 = 76189$
The estimated model for females (gender = 1) is as follows:
$\widehat{\text{Salary}} = \hat{\beta}_0 + \hat{\beta}_1 = 76189 + 21599 = 97788$
The estimated mean salaries of the two groups are as follows:
. table gender , c(mean salary n salary )

   Gender | mean(salary)   N(salary)
   -------+-------------------------
        0 |      76188.9          38
        1 |      97788.2          62
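The match between the regression coefficients and the group means is not a coincidence: with a single 0/1 regressor, OLS reproduces the two group means. The sketch below illustrates this with a small hypothetical data set (the five salary values are invented for illustration and are not the lecture data), using the textbook formulas for the simple-regression slope and intercept.

```python
def ols_simple(x, y):
    """Intercept and slope of a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: gender = 0 (male), 1 (female); salary in thousands.
gender = [0, 0, 0, 1, 1]
salary = [70.0, 80.0, 90.0, 95.0, 105.0]

b0, b1 = ols_simple(gender, salary)
mean_male = sum(s for g, s in zip(gender, salary) if g == 0) / gender.count(0)
mean_female = sum(s for g, s in zip(gender, salary) if g == 1) / gender.count(1)

print(b0, mean_male)          # intercept equals the mean of the base group
print(b0 + b1, mean_female)   # intercept + slope equals the other group's mean
```

In the lecture output the same relation holds: 76188.95 + 21599.28 = 97788.23, which agrees with the tabulated female mean of 97788.2.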
Things to Remember on Dummy Variables
• If a qualitative variable has m categories, introduce only (m − 1) dummy variables. In other words, for each qualitative regressor, the number of dummy variables introduced must be one less than the number of categories of that variable. If you do not follow this rule, you will fall into what is called the dummy variable trap, that is, the situation of perfect collinearity or perfect multicollinearity.

• The category for which no dummy variable is assigned is known as the base, benchmark, control, comparison, reference, or omitted category. All comparisons are made in relation to the benchmark category.

• The intercept value represents the mean value of the benchmark category. The coefficients attached to the dummy variables are known as the differential intercept coefficients because they tell by how much the value of the category that receives the value of 1 differs from the intercept coefficient of the benchmark category. If a qualitative variable has more than one category, the choice of the benchmark category is strictly up to the researcher.
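The dummy variable trap can be seen numerically. In the sketch below (hypothetical five-observation data, illustrative helper names), including an intercept together with a dummy for every category makes the columns of the design matrix linearly dependent, so the determinant of X'X is exactly zero and the OLS normal equations have no unique solution. Dropping one dummy restores a nonzero determinant.

```python
def gram(columns):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]

def det(m):
    """Determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * det(minor)
    return total

# Hypothetical sample: two categories (m = 2).
female = [0, 0, 1, 1, 1]
male = [1 - f for f in female]
const = [1] * len(female)

# Trap: intercept + m dummies. const = female + male, so X'X is singular.
det_trap = det(gram([const, female, male]))

# Correct: intercept + (m - 1) dummies.
det_ok = det(gram([const, female]))

print(det_trap, det_ok)
```

The arithmetic uses integers only, so the zero determinant in the trap case is exact rather than a rounding artefact.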
ANOVA Models with Two Qualitative Variables (gender and region)
$\text{Salary}_i = \beta_0 + \beta_1\,\text{gender}_i + \beta_2 R_{2i} + \beta_3 R_{3i} + \beta_4 R_{4i} + \beta_5 R_{5i} + u_i$

gender = 1 if female, 0 otherwise (male)
R2 = 1 if Jaffna, 0 otherwise
R3 = 1 if Galle, 0 otherwise
R4 = 1 if Kandy, 0 otherwise
R5 = 1 if Colombo, 0 otherwise

For Kurunegala, Male: $E(Y) = \beta_0$
For Kurunegala, Female: $E(Y) = \beta_0 + \beta_1$
For Colombo, Female: $E(Y) = \beta_0 + \beta_1 + \beta_5$
For Colombo, Male: $E(Y) = \beta_0 + \beta_5$
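. reg salary gender i.Region

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(5, 94)        =      5.15
       Model |  1.7815e+10         5  3.5631e+09   Prob > F        =    0.0003
    Residual |  6.5034e+10        94   691851393   R-squared       =    0.2150
-------------+----------------------------------   Adj R-squared   =    0.1733
       Total |  8.2849e+10        99   836861449   Root MSE        =     26303

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   20955.34   5540.185     3.78   0.000     9955.176    31955.51
             |
      Region |
          2  |   5148.403   8424.404     0.61   0.543    -11578.45    21875.25
          3  |   1618.949   7293.588     0.22   0.825    -12862.64    16100.54
          4  |   10747.27   8016.466     1.34   0.183    -5169.611    26664.15
          5  |   29455.45   10345.99     2.85   0.005     8913.244    49997.66
             |
       _cons |   70319.36   6438.386    10.92   0.000     57535.79    83102.92
------------------------------------------------------------------------------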
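The estimated mean salary of any gender and region combination follows by adding the relevant coefficients to the intercept. The sketch below does this for the reported estimates, assuming that Region levels 2 to 5 in the Stata output correspond to the dummies R2 to R5 defined on the slide (Jaffna, Galle, Kandy, Colombo) and that level 1, the omitted category, is Kurunegala. The function and dictionary names are illustrative.

```python
# Coefficients copied from the Stata output above.
coef = {
    "_cons": 70319.36,
    "gender": 20955.34,
    "region2": 5148.403,   # Jaffna (assumed mapping, see lead-in)
    "region3": 1618.949,   # Galle
    "region4": 10747.27,   # Kandy
    "region5": 29455.45,   # Colombo
}

def predicted_mean(female, region):
    """Estimated mean salary; region 1 (Kurunegala) is the base category."""
    value = coef["_cons"] + coef["gender"] * female
    if region != 1:
        value += coef[f"region{region}"]
    return value

kurunegala_male = predicted_mean(female=0, region=1)
kurunegala_female = predicted_mean(female=1, region=1)
colombo_male = predicted_mean(female=0, region=5)
colombo_female = predicted_mean(female=1, region=5)

print(round(kurunegala_male, 2), round(kurunegala_female, 2))
print(round(colombo_male, 2), round(colombo_female, 2))
```

Because the model has no gender-by-region interaction, the female and male means differ by the same amount (the gender coefficient) in every region.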



The ANCOVA Models

$\text{Salary}_i = \beta_0 + \beta_1\,\text{Edu}_i + \beta_2\,\text{Gender}_i + u_i$

• where $\text{Edu}_i$ = years of schooling;
$\text{Gender}_i$ = 1 if female
$\text{Gender}_i$ = 0 if male


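. reg salary education gender

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(2, 97)        =      9.16
       Model |  1.3158e+10         2  6.5788e+09   Prob > F        =    0.0002
    Residual |  6.9692e+10        97   718471294   R-squared       =    0.1588
-------------+----------------------------------   Adj R-squared   =    0.1415
       Total |  8.2849e+10        99   836861449   Root MSE        =     26804

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   education |   2907.563   1674.517     1.74   0.086    -415.8908    6231.016
      gender |   22031.22   5527.863     3.99   0.000     11059.94    33002.49
       _cons |   31580.81      26056     1.21   0.228    -20133.14    83294.76
------------------------------------------------------------------------------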

The estimated model for males (gender = 0) is as follows:
$\widehat{\text{Salary}} = 31581 + 2908 \cdot \text{Edu}$
The estimated model for females (gender = 1) is as follows:
$\widehat{\text{Salary}} = 31581 + 2908 \cdot \text{Edu} + 22031 \cdot 1 = 53612 + 2908 \cdot \text{Edu}$
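The two fitted lines share the same education slope and differ only in the intercept. A minimal sketch using the reported coefficients shows that the predicted female-male gap is the same at any level of education; the education values 10 and 16 are arbitrary illustrations.

```python
# Coefficients copied from the Stata output of `reg salary education gender`.
B_CONS, B_EDU, B_GENDER = 31580.81, 2907.563, 22031.22

def predicted_salary(edu, female):
    """Fitted salary for given years of schooling and gender (1 = female)."""
    return B_CONS + B_EDU * edu + B_GENDER * female

gap_at_10 = predicted_salary(10, 1) - predicted_salary(10, 0)
gap_at_16 = predicted_salary(16, 1) - predicted_salary(16, 0)

print(round(gap_at_10, 2), round(gap_at_16, 2))
```

Both gaps equal the gender coefficient, which is what "differential intercept" means in an ANCOVA model without interaction terms.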
Dummy Variable Regression to test Structural Changes

[Figure: Per Capita Apparel Exports to US]
Copied for educational purposes from https://en.wikipedia.org/wiki/Apparel_industry_of_Sri_Lanka#/media/File:Per_Capita_Apparel_Exports_to_US.JPG
• Dummy variable regression can be used to test for structural change in an economy. There are four possibilities regarding structural change:

1. Both the intercept and the slope coefficients are the same in the two regressions. This is the case of coincident regressions.

2. Only the intercepts in the two regressions are different, but the slopes are the same. This is the case of parallel regressions.

3. The intercepts in the two regressions are the same, but the slopes are different. This is the case of concurrent regressions.

4. Both the intercepts and the slopes in the two regressions are different. This is the case of dissimilar regressions.
Suppose you are interested in the following function:
$Y_i = \alpha_1 + \alpha_2 D_{1i} + \beta_1 X_{1i} + \beta_2 (D_{1i} X_{1i}) + u_i$
Y = Consumption
X = Gross domestic product
D = Dummy variable
If D = 1, the observation belongs to the period before 1977
If D = 0, the observation belongs to the period after 1977

Mean consumption function for the period before 1977 (D = 1):

$E(Y) = \alpha_1 + \alpha_2 + \beta_1 X + \beta_2 X = (\alpha_1 + \alpha_2) + (\beta_1 + \beta_2) X$

Mean consumption function for the period after 1977 (D = 0):

$E(Y) = \alpha_1 + \beta_1 X$
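The four possibilities for structural change listed earlier can be expressed as restrictions on the two dummy coefficients. The block below is a sketch that writes the model as $Y_i = \alpha_1 + \alpha_2 D_{1i} + \beta_1 X_{1i} + \beta_2 (D_{1i} X_{1i}) + u_i$, with $\alpha_2$ the differential intercept and $\beta_2$ the differential slope; this notation is assumed here for illustration.

```latex
\begin{array}{lll}
\text{Coincident regressions:} & \alpha_2 = 0, & \beta_2 = 0 \\
\text{Parallel regressions:}   & \alpha_2 \neq 0, & \beta_2 = 0 \\
\text{Concurrent regressions:} & \alpha_2 = 0, & \beta_2 \neq 0 \\
\text{Dissimilar regressions:} & \alpha_2 \neq 0, & \beta_2 \neq 0
\end{array}
```

Testing for structural change therefore amounts to testing whether $\alpha_2$, $\beta_2$, or both differ from zero.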
Dependent Variable: S
Method: Least Squares
Date: 03/01/17   Time: 00:58
Sample: 1 54
Included observations: 54

Variable       Coefficient    Std. Error    t-Statistic    Prob.
C                -4705.938      7011.666      -0.671158   0.5052
D01               4566.985      15886.26       0.287480   0.7749
GDP               0.169402      0.002789       60.72956   0.0000
DGDP             -0.026945      0.859227      -0.031359   0.9751

R-squared             0.988892    Mean dependent var       168229.8
Adjusted R-squared    0.988225    S.D. dependent var       300477.5
S.E. of regression    32605.02    Akaike info criterion    23.69351
Sum squared resid     5.32E+10    Schwarz criterion        23.84084
Log likelihood       -635.7247    Hannan-Quinn criter.     23.75033
F-statistic           1483.742    Durbin-Watson stat       2.538888
Prob(F-statistic)     0.000000
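The output can be read back into the two period-specific regression lines. The sketch below assumes, from the variable names, that D01 is the period dummy (1 for observations before 1977) and DGDP is its product with GDP, matching the slide's model. The `classify` helper is a hypothetical illustration that applies the four-case scheme using the individual p-values at the 5% level; a joint F-test on both dummy terms would be the more rigorous check.

```python
# Coefficients and p-values copied from the EViews output above.
coef = {"C": -4705.938, "D01": 4566.985, "GDP": 0.169402, "DGDP": -0.026945}
p_value = {"D01": 0.7749, "DGDP": 0.9751}

# Period before 1977 (D = 1): both dummy terms switch on.
intercept_before = coef["C"] + coef["D01"]
slope_before = coef["GDP"] + coef["DGDP"]

# Period after 1977 (D = 0): the base regression.
intercept_after = coef["C"]
slope_after = coef["GDP"]

def classify(p_intercept, p_slope, level=0.05):
    """Name the structural-change case implied by two individual t-tests."""
    intercept_differs = p_intercept < level
    slope_differs = p_slope < level
    if intercept_differs and slope_differs:
        return "dissimilar"
    if intercept_differs:
        return "parallel"
    if slope_differs:
        return "concurrent"
    return "coincident"

case = classify(p_value["D01"], p_value["DGDP"])
print(round(intercept_before, 3), round(slope_before, 6), case)
```

Neither the differential intercept nor the differential slope is statistically significant here, so on this reading the estimates give no evidence of a structural change at 1977.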
Interaction Effects Using Dummy Variables

$Y_i = \alpha_1 + \alpha_2 D_{1i} + \alpha_3 D_{2i} + \beta_1 X_i + u_i$

$Y_i$ = Salary; $D_1$ = 1 if female, 0 otherwise; $D_2$ = 1 if Sinhalese, 0 otherwise; $X_i$ = Education

• Implicit in this model is the assumption that the differential effect of the gender dummy D1 is constant across the two categories of race and the differential effect of the race dummy D2 is also constant across the two sexes.

• That is to say, if the mean salary is higher for males than for females, this is so whether they are Sinhalese or not. Likewise, if, say, Sinhalese have lower mean salaries, this is so whether they are females or males.

• Is this assumption realistic? No, because there may be interaction effects between the two dummies.
Dummy Variables without Interaction
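. reg salary education gender sinhalese

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(3, 96)        =      6.24
       Model |  1.3526e+10         3  4.5087e+09   Prob > F        =    0.0006
    Residual |  6.9323e+10        96   722117291   R-squared       =    0.1633
-------------+----------------------------------   Adj R-squared   =    0.1371
       Total |  8.2849e+10        99   836861449   Root MSE        =     26872

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   education |   2857.965   1680.196     1.70   0.092    -477.1985    6193.128
      gender |   21961.86   5542.722     3.96   0.000     10959.64    32964.07
   sinhalese |  -3843.562   5380.779    -0.71   0.477    -14524.32      6837.2
       _cons |   34263.53   26390.63     1.30   0.197    -18121.46    86648.53
------------------------------------------------------------------------------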
Interaction Effects Using Dummy Variables
$Y_i = \alpha_1 + \alpha_2 D_{1i} + \alpha_3 D_{2i} + \alpha_4 (D_{1i} D_{2i}) + \beta_1 X_i + u_i$

$\alpha_2$ = differential effect of being a female
$\alpha_3$ = differential effect of being a Sinhalese
$\alpha_4$ = differential effect of being a female Sinhalese
Dummy Variables with Interaction
. reg salary education gender sinhalese gender#sinhalese
note: 1.gender#0b.sinhalese omitted because of collinearity
note: 1.gender#1.sinhalese omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(4, 95)        =      4.68
       Model |  1.3630e+10         4  3.4075e+09   Prob > F        =    0.0017
    Residual |  6.9219e+10        95   728623223   R-squared       =    0.1645
-------------+----------------------------------   Adj R-squared   =    0.1293
       Total |  8.2849e+10        99   836861449   Root MSE        =     26993

----------------------------------------------------------------------------------
          salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
       education |   2878.399   1688.614     1.70   0.092    -473.9235    6230.722
          gender |   24042.19   7829.648     3.07   0.003     8498.375    39586.01
       sinhalese |  -5440.208   6860.368    -0.79   0.430    -19059.76    8179.344
                 |
gender#sinhalese |
            0 1  |   4206.103   11130.19     0.38   0.706    -17890.12    26302.32
            1 0  |          0  (omitted)
            1 1  |          0  (omitted)
                 |
           _cons |   32645.29   26852.89     1.22   0.227    -20664.43    85955.01
----------------------------------------------------------------------------------
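Care is needed when mapping this output to the interaction model on the slide. The only estimated interaction row is labelled "0 1", which under Stata's factor-variable labelling is read here as the cell gender = 0 and sinhalese = 1 (a male Sinhalese), not the female Sinhalese cell of the slide's product term D1 × D2. The sketch below works under that assumed reading: it rebuilds the four group intercepts (holding education fixed) from the reported coefficients and derives the implied coefficient on D1 × D2 as a difference in differences. Names such as `g0_s1` are illustrative.

```python
# Coefficients copied from the Stata output above.
b = {
    "_cons": 32645.29,
    "gender": 24042.19,
    "sinhalese": -5440.208,
    "g0_s1": 4206.103,   # row "0 1": gender = 0, sinhalese = 1 (assumed reading)
}

def cell_intercept(female, sinhalese):
    """Group intercept (education held fixed) under the reported parameterisation."""
    value = b["_cons"] + b["gender"] * female + b["sinhalese"] * sinhalese
    if female == 0 and sinhalese == 1:
        value += b["g0_s1"]
    return value

# Effect of being Sinhalese, separately for females and males.
sinhalese_effect_female = cell_intercept(1, 1) - cell_intercept(1, 0)
sinhalese_effect_male = cell_intercept(0, 1) - cell_intercept(0, 0)

# Implied coefficient on D1 * D2 in the slide's model (difference in differences).
implied_interaction = sinhalese_effect_female - sinhalese_effect_male

print(round(sinhalese_effect_female, 3), round(sinhalese_effect_male, 3))
print(round(implied_interaction, 3))
```

Under this reading the implied female-Sinhalese interaction is the reported 4206.103 with the sign reversed, and it has the same standard error and p-value (0.706), so the conclusion that the interaction is not statistically significant is unchanged. In Stata, specifying the model with `gender##sinhalese` would report the product-term coefficient directly.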


H0: the salary of a female BSc holder is the same as the salary of a female non-BSc holder.
Example: without an interactive dummy

$Y_i\,(\text{gender} = 1;\ \text{bsc} = 1) = \alpha_1 + \alpha_2 D_{1i}(\text{gender}) + \alpha_3 D_{2i}(\text{bsc}) + \beta_1 X_i(\text{exp}) + u_i$

Example: with an interactive dummy

$Y_i\,(\text{gender} = 1;\ \text{bsc} = 1) = \alpha_1 + \alpha_2 D_{1i}(\text{gender}) + \alpha_3 D_{2i}(\text{bsc}) + \alpha_4 D_{1i} D_{2i} + \beta_1 X_i(\text{exp}) + u_i$
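The restriction that the null hypothesis places on the coefficients depends on which model is used. The derivation below is a sketch that writes the dummy coefficients as $\alpha_2$ (gender), $\alpha_3$ (bsc) and $\alpha_4$ (interaction), with $\beta_1$ on experience $X$; this notation is assumed for illustration. In each case the mean salary of a female non-BSc holder is subtracted from that of a female BSc holder with the same experience.

```latex
\begin{aligned}
\text{Without interaction:}\quad
& E(Y \mid D_1 = 1, D_2 = 1, X) - E(Y \mid D_1 = 1, D_2 = 0, X) \\
& = (\alpha_1 + \alpha_2 + \alpha_3 + \beta_1 X) - (\alpha_1 + \alpha_2 + \beta_1 X) = \alpha_3 \\[4pt]
\text{With interaction:}\quad
& E(Y \mid D_1 = 1, D_2 = 1, X) - E(Y \mid D_1 = 1, D_2 = 0, X) \\
& = (\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 + \beta_1 X) - (\alpha_1 + \alpha_2 + \beta_1 X) = \alpha_3 + \alpha_4
\end{aligned}
```

So H0 corresponds to $\alpha_3 = 0$ in the model without the interactive dummy and to $\alpha_3 + \alpha_4 = 0$ in the model with it.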



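$\text{Salary}_i = \beta_0 + \beta_1\,\text{Edu}_i + \beta_2\,\text{Exp}_i + \beta_3\,\text{Gender}_i + u_i$
. reg salary education experience gender

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(3, 96)        =     72.41
       Model |  5.7458e+10         3  1.9153e+10   Prob > F        =    0.0000
    Residual |  2.5391e+10        96   264490477   R-squared       =    0.6935
-------------+----------------------------------   Adj R-squared   =    0.6839
       Total |  8.2849e+10        99   836861449   Root MSE        =     16263

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   education |   2120.369    1017.81     2.08   0.040     100.0309    4140.706
  experience |   4102.383   316.9832    12.94   0.000     3473.176     4731.59
      gender |   1792.397   3700.614     0.48   0.629    -5553.264    9138.059
       _cons |  -5894.441   16072.13    -0.37   0.715    -37797.37    26008.49
------------------------------------------------------------------------------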
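Comparing this output with the earlier regression of salary on education and gender alone shows how much the gender coefficient changes once experience is controlled for. The sketch below recomputes the t ratios (coefficient divided by standard error) from the printed values and expresses the new gender coefficient as a share of the earlier one (22031.22); the variable names are illustrative.

```python
# (coefficient, standard error) pairs copied from the Stata output above.
rows = {
    "education": (2120.369, 1017.81),
    "experience": (4102.383, 316.9832),
    "gender": (1792.397, 3700.614),
    "_cons": (-5894.441, 16072.13),
}

# t ratio = coefficient / standard error.
t_ratios = {name: coef / se for name, (coef, se) in rows.items()}

# Gender coefficient from `reg salary education gender` (earlier output).
gender_without_experience = 22031.22
gender_with_experience = rows["gender"][0]
share_remaining = gender_with_experience / gender_without_experience

for name, t in t_ratios.items():
    print(name, round(t, 2))
print(round(share_remaining, 3))
```

The gender coefficient falls to roughly 8% of its earlier size and is no longer statistically significant, while experience is highly significant in this model.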



Seasonal dummy variables
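Seasonal dummies follow the same (m − 1) rule as any other qualitative variable. As a minimal sketch, assuming quarterly data with the first quarter as the benchmark category (both are assumptions made here for illustration), four seasons require three dummy variables; the function name is hypothetical.

```python
def quarter_dummies(quarter):
    """Return (D2, D3, D4) for a quarter number 1-4; quarter 1 is the base."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    return tuple(1 if quarter == q else 0 for q in (2, 3, 4))

# One row of dummy values per quarter.
rows = [quarter_dummies(q) for q in (1, 2, 3, 4)]
for q, row in zip((1, 2, 3, 4), rows):
    print(q, row)
```

The base quarter is identified by all three dummies being zero, so its mean is captured by the intercept and the other quarters are measured relative to it.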



Thank You
