Econometric Model With Qualitative Variables
How do we turn qualitative variables into quantitative ones? Why do we need to do this? An econometric model needs quantitative variables in order to estimate its parameters.
What are the differences among these variables: Dummy? Indicator? Binary? Dichotomous? Categorical?
Other Usages:
How do we model an unstable regression?
- Jumping regression
- Shifting regression
Technically speaking, do we have problems with our model if:
- One or more independent variables are dummies
- The dependent variable is a dummy
Illustration:
We would like to analyze whether graduate and undergraduate students differ in weekly entertainment spending.
Y: weekly entertainment spending per student
PS: student type, with PS = 1 for a graduate student and PS = 0 for an undergraduate student
Model: $Y = \alpha + \beta\,PS + u$
From the model, average spending is:
Graduate student: $E(Y \mid PS = 1) = \alpha + \beta$
Undergraduate student: $E(Y \mid PS = 0) = \alpha$
For example, using data from a survey, the estimated model is:
$\hat{Y} = 9.4 + 16\,PS$
t: (53.22) (6.245), $R^2 = 96.54\%$
The model indicates that $\alpha \neq 0$ and $\beta \neq 0$ (both statistically significant).
Interpretation: average spending for graduate students is 9.4 + 16 = 25.4; average spending for undergraduate students is 9.4. (There is a difference in spending between the two groups.)
The next question is whether graduate students spend more on entertainment because they are more able to, or because they are more consumptive than undergraduate students.
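The link between the dummy coefficients and the group means can be checked numerically. The sketch below is only an illustration: the spending figures are made up (they are not the survey data behind the estimates above), and OLS is computed with NumPy's least-squares routine.

```python
import numpy as np

# Hypothetical weekly entertainment spending (NOT the survey data from the slides).
spending = np.array([9.0, 10.0, 9.5, 9.1, 25.0, 26.0, 25.2])
ps = np.array([0, 0, 0, 0, 1, 1, 1])  # 1 = graduate, 0 = undergraduate

# Design matrix: intercept column plus the dummy PS.
X = np.column_stack([np.ones_like(spending), ps])
coef, *_ = np.linalg.lstsq(X, spending, rcond=None)
alpha_hat, beta_hat = coef

mean_undergrad = spending[ps == 0].mean()
mean_grad = spending[ps == 1].mean()

# The intercept reproduces the undergraduate mean,
# and intercept + slope reproduces the graduate mean.
print(alpha_hat, mean_undergrad)
print(alpha_hat + beta_hat, mean_grad)
```

With a single dummy regressor, OLS simply recovers the two group means, which is why the coefficient on PS reads as the difference in average spending.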
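A model that relates X (teaching experience) and G (gender dummy) to Y (annual salary):
$Y = \alpha_1 + \alpha_2 G + \beta X + u$
From the model (with G = 1 for a male professor and G = 0 for a female professor):
Average salary of a female professor = $\alpha_1 + \beta X$
Average salary of a male professor = $\alpha_1 + \alpha_2 + \beta X$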
Geometrically:
[Figure: annual salary (Y) against teaching experience (X), with one line for male professors and one for female professors; labels $\alpha_1$ and $\alpha_2$.]
Suppose that the data give:
$\hat{Y} = 19.21 + 0.373\,G + 1.453\,X$
t: (11.33) (1.141) (37.997), $R^2 = 89.75\%$
Is there discrimination?
If we define the dummy variable differently, will the results differ substantively? Model with the new definition:
$Y = \alpha_1 + \alpha_2 S + \beta X + u$
Note that, under the new definition (S = 1 for a female professor and S = 0 for a male professor):
Average salary of a female professor = $\alpha_1 + \alpha_2 + \beta X$
Average salary of a male professor = $\alpha_1 + \beta X$
Remark
When defining a dummy variable, it does not matter which category is represented by one and which by zero, as long as the estimated model is interpreted consistently.
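The remark can be illustrated with a small simulation. This is a sketch on made-up data (the coefficients 19.0, 0.4, and 1.5 and the sample are assumptions, not the slides' data): the same salary model is fitted once with a dummy that equals 1 for male professors and once with its complement, and the fitted values come out the same.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
x = rng.uniform(1, 20, n)        # hypothetical years of teaching experience
g = np.arange(n) % 2             # dummy: 1 = male, 0 = female (alternating)
y = 19.0 + 0.4 * g + 1.5 * x + rng.normal(0, 1, n)  # made-up salaries

def ols(dummy):
    """Fit y on an intercept, the given dummy, and x; return coefficients and fitted values."""
    X = np.column_stack([np.ones(n), dummy, x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef

coef_g, fit_g = ols(g)        # dummy = 1 for male
coef_s, fit_s = ols(1 - g)    # recoded dummy = 1 for female

print(coef_g)  # intercept, male differential, slope
print(coef_s)  # intercept shifts, differential flips sign, slope unchanged
```

Only the labelling changes: the recoded intercept equals the old intercept plus the old dummy coefficient, and the dummy coefficient changes sign.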
$Y = \alpha_1 + \alpha_2 D_2 + \alpha_3 D_3 + \beta X + u$
What will happen when we estimate this model with OLS?
Relationship between the regressors: $D_2 = 1 - D_3$, or equivalently $D_3 = 1 - D_2$.
Consequence: perfect collinearity.
Rule: if there are m categories, we only need m - 1 dummy variables.
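The collinearity problem shows up directly in the rank of the design matrix. The following is a minimal sketch with made-up values for the two complementary dummies and X; it only checks ranks and does not estimate anything.

```python
import numpy as np

# Hypothetical data: two complementary dummies and one quantitative regressor.
d2 = np.array([1, 0, 1, 1, 0, 0])
d3 = 1 - d2                       # D3 = 1 - D2 by construction
x = np.array([2.0, 5.0, 3.0, 8.0, 1.0, 7.0])
ones = np.ones(6)

X_trap = np.column_stack([ones, d2, d3, x])  # intercept + both dummies
X_ok = np.column_stack([ones, d2, x])        # intercept + m - 1 = 1 dummy

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)

print(X_trap.shape[1], rank_trap)  # 4 columns, but rank below 4
print(X_ok.shape[1], rank_ok)      # 3 columns, full column rank
```

Because the intercept column equals the sum of the two dummy columns, the four-column matrix does not have full column rank, so the OLS normal equations have no unique solution.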
Can we represent this type of variable with a single variable that takes values such as 1, 2, and 3, one for each category? Or should we define it differently? Try defining:
D2 = 1 if the highest level of education is high school; 0 otherwise
D3 = 1 if the highest level of education is university; 0 otherwise
Education coded with D2 and D3:
D2: 0 0 1 0 0 1
D3: 0 1 0 0 0 0
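The coding rule can be written out in a few lines. In this sketch the list of education levels is hypothetical, chosen only so that it reproduces the D2 and D3 codes listed above; the label "below high school" for the base group is also an assumption.

```python
# Hypothetical education levels, chosen to reproduce the D2/D3 codes above.
education = [
    "below high school",
    "university",
    "high school",
    "below high school",
    "below high school",
    "high school",
]

# One dummy per non-base category; "below high school" is the base group.
d2 = [1 if level == "high school" else 0 for level in education]
d3 = [1 if level == "university" else 0 for level in education]

print(d2)
print(d3)
```

A single variable coded 1, 2, 3 would force the step from the first to the second category to equal the step from the second to the third, whereas separate dummies let each category have its own differential.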
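How do we choose the base group? Insurance spending by education level and income
[Figure: insurance spending (Y) against income (X), with three lines for university graduates (S1), high school graduates (SMU), and those who did not finish high school; labels $\alpha_3$, $\alpha_2$, $\alpha_1$.]
Assumption: $\alpha_3 > \alpha_2$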
Average salary:
Female professor outside FE = 7.43 + 1.226 = Rp 8.656 million.
Male professor outside FE = 7.43 + 0.207 + 1.226 = Rp 8.863 million.
Female professor in FE = 7.43 + 0.164 + 1.226 = Rp 8.820 million.
Male professor in FE = 7.43 + 0.207 + 0.164 + 1.226 = Rp 9.027 million.
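The four sums follow one pattern: start from the base group and switch on the differentials that apply. A small sketch of that arithmetic is given below; the constant names are hypothetical, and the 1.226 term is treated simply as the part common to all four groups, since the slides do not show the underlying model.

```python
# Figures taken from the sums above; the names are illustrative only.
BASE = 7.43          # female professor outside FE
MALE_DIFF = 0.207    # added when the professor is male
FE_DIFF = 0.164      # added when the professor is in FE
COMMON_TERM = 1.226  # term that appears in all four sums

def average_salary(male: int, in_fe: int) -> float:
    """Average salary in Rp million for the group given by the two dummies."""
    return BASE + MALE_DIFF * male + FE_DIFF * in_fe + COMMON_TERM

for male in (0, 1):
    for in_fe in (0, 1):
        print(male, in_fe, round(average_salary(male, in_fe), 3))
```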
$W_m = \beta_1 + \beta_2 W_u + \beta_3\,Ras + \beta_4\,Kota + \beta_5\,SMU + \beta_6\,Wilayah + \beta_7\,Umur + u$
(Ras: race, 1 = indigenous; Kota: 1 = urban; SMU: 1 = high school graduate; Wilayah: region, 1 = KBI; Umur: age.)
Suppose that, based on a sample, the estimated model is:
$\hat{W}_m = 37.07 + 1.403\,W_u - 90.06\,Ras + 75.51\,Kota + 47.33\,SMU + 113.64\,Wilayah + 2.26\,Umur$
What does it mean if the F-test and the t-tests show that all variables are significant at the 5% significance level?
Average wage of a non-indigenous worker in rural KTI who did not graduate from high school:
$W_m = 37.07 + 1.403\,W_u + 2.26\,Umur$
Average wage of an indigenous worker in urban KBI who graduated from high school:
$W_m = (37.07 - 90.06 + 75.51 + 113.64 + 47.33) + 1.403\,W_u + 2.26\,Umur$
$W_m = 183.49 + 1.403\,W_u + 2.26\,Umur$
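With several dummies in one model, each combination of categories gets its own intercept. The sketch below recomputes the two intercepts from the estimated coefficients; the dictionary and function names are illustrative, and the 0/1 coding of each dummy is inferred from the two cases worked out above.

```python
# Estimated intercept and dummy coefficients from the fitted wage model.
COEF = {
    "const": 37.07,
    "Ras": -90.06,      # 1 = indigenous, 0 = non-indigenous
    "Kota": 75.51,      # 1 = urban, 0 = rural
    "SMU": 47.33,       # 1 = high school graduate, 0 = not
    "Wilayah": 113.64,  # 1 = KBI, 0 = KTI
}

def group_intercept(ras: int, kota: int, smu: int, wilayah: int) -> float:
    """Intercept of the wage equation for one combination of dummy values."""
    return (COEF["const"] + COEF["Ras"] * ras + COEF["Kota"] * kota
            + COEF["SMU"] * smu + COEF["Wilayah"] * wilayah)

base = group_intercept(0, 0, 0, 0)      # non-indigenous, rural, no SMU, KTI
all_on = group_intercept(1, 1, 1, 1)    # indigenous, urban, SMU, KBI
print(round(base, 2), round(all_on, 2))
```

The slope terms on Wu and Umur are the same for every group in this specification, so only the intercept moves.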
Comparing 2 regressions
Saving (Y) = $\beta_1 + \beta_2$ Income (X) + u
This model assumes that the relationship between saving and income is the same across the sample and over time. In reality, however, the model may differ before and after a certain event. For instance, saving behavior may differ before and after an economic crisis. How can we accommodate such a change in saving behavior? The following model can be used to accommodate it.
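Period I, before the crisis: $Y_i = \lambda_1 + \lambda_2 X_i + u_i$, $i = 1, 2, \dots, n$
Period II, after the crisis: $Y_i = \gamma_1 + \gamma_2 X_i + \varepsilon_i$, $i = n+1, n+2, \dots, N$
Possibilities when comparing the two models:
Case 1: $\lambda_1 = \gamma_1$ and $\lambda_2 = \gamma_2$
Case 2: $\lambda_1 \neq \gamma_1$ and $\lambda_2 = \gamma_2$
Case 3: $\lambda_1 = \gamma_1$ and $\lambda_2 \neq \gamma_2$
Case 4: $\lambda_1 \neq \gamma_1$ and $\lambda_2 \neq \gamma_2$
In Case 1 both models are the same and there is no shift. In Case 4 both models are different and there is a shift.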
Comparing two regressions with a dummy variable
To allow for a shift in the regression model:
$Y_i = \alpha_1 + \alpha_2 D_i + \beta_1 X_i + \beta_2 D_i X_i + u_i$
$D_i = 1$ for observations in period I; $D_i = 0$ for observations in period II
Thus, average saving (Y) in each period is:
Period I: $E(Y_i) = (\alpha_1 + \alpha_2) + (\beta_1 + \beta_2) X_i$
Period II: $E(Y_i) = \alpha_1 + \beta_1 X_i$
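One way to see what the dummy and its interaction with X do is to compare the pooled fit with two separate period-by-period fits. The sketch below uses simulated saving and income data (all numbers are assumptions) and NumPy least squares; it is an illustration, not an analysis of real data.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 30, 30
x = rng.uniform(10, 100, n1 + n2)        # hypothetical income
d = np.r_[np.ones(n1), np.zeros(n2)]     # D = 1 in period I, 0 in period II
# Made-up saving behavior that differs between the two periods.
y = np.where(d == 1, 5 + 0.10 * x, 2 + 0.15 * x) + rng.normal(0, 1, n1 + n2)

# Pooled model with differential intercept (D) and differential slope (D * X).
X = np.column_stack([np.ones_like(x), d, x, d * x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a1, a2, b1, b2 = coef

def fit_line(xs, ys):
    """OLS intercept and slope for a single period."""
    A = np.column_stack([np.ones_like(xs), xs])
    c, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return c

period1 = fit_line(x[d == 1], y[d == 1])
period2 = fit_line(x[d == 0], y[d == 0])

print(a1 + a2, b1 + b2, period1)  # period I intercept and slope
print(a1, b1, period2)            # period II intercept and slope
```

The pooled model reproduces both separate regressions, and the coefficients on D and on D times X give the differences in intercept and slope directly, which is what makes them convenient to test.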
Average sales commission when sales exceed the target:
Commission = $\alpha_1 + (\beta_1 + \beta_2) X - \beta_2 X^*$, for $X > X^*$
The model can therefore be combined into:
$Y = \alpha_1 + \beta_1 X + \beta_2 (X - X^*) D$, where D = 1 when $X > X^*$ and D = 0 otherwise
Geometrically:
[Figure: commission against sales, with the target X* marked on the sales axis.]
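The combined commission model is a piecewise linear function with a kink at the target. A minimal sketch follows, with hypothetical parameter values (an intercept of 100, slopes of 0.05 and 0.03, and a target of 1000 are assumptions made only for illustration).

```python
def commission(x, alpha1, beta1, beta2, x_star):
    """Piecewise linear commission: the slope rises by beta2 once sales exceed x_star."""
    d = 1 if x > x_star else 0   # dummy for sales above the target
    return alpha1 + beta1 * x + beta2 * (x - x_star) * d

# Hypothetical parameter values, for illustration only.
params = dict(alpha1=100.0, beta1=0.05, beta2=0.03, x_star=1000.0)

below = commission(500.0, **params)        # uses only alpha1 + beta1 * X
at_target = commission(1000.0, **params)   # the two segments meet here
above = commission(2000.0, **params)       # slope is beta1 + beta2 past the target

# Above the target the value matches alpha1 + (beta1 + beta2) * X - beta2 * X*.
expanded = 100.0 + (0.05 + 0.03) * 2000.0 - 0.03 * 1000.0
print(below, at_target, above, expanded)
```

Because the extra term is multiplied by the distance past the target, the two segments join at the target rather than jumping.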