
International Journal of Advances in Science and Technology,

Vol. 3, No.3, 2011


Fuzzy Modified Parametric Sample Selection
Models of Married Women : Case Study of
The MPFS-1994

Yaya Sudarya Triana¹ and Muhamad Safiih²

¹ Department of Mathematics, Faculty of Science and Technology, University Malaysia Terengganu, Malaysia
yst_2000@yahoo.com

² Department of Mathematics, Faculty of Science and Technology, University Malaysia Terengganu, Malaysia
safiihmd@umt.edu.my

Abstract

The sample selection model introduced by Heckman (1979) originally focused on the wages of married
women who participate in the labour market. The model has since been widely used in various fields,
although it sometimes performs poorly. Several studies have applied the sample selection model to the
wages of married women who work (participants), but until recently none has examined the wages of
married women who do not work (non-participants), even though such women have in fact been found to
have wages. The data used in this study come from the Malaysian Population and Family Survey 1994
(MPFS-1994), conducted by the National Population and Family Development Board of Malaysia under
the Ministry of Women, Family and Community Development of Malaysia. The survey was administered
by questionnaire to a random sample consisting specifically of married women. The data set provides
information on wages, educational attainment, household composition and other socioeconomic
characteristics. The original sample, based on Mroz (1987), contains 4444 records of married women.
This paper estimates the model using a fuzzy modelling approach, called the Fuzzy Parametric Sample
Selection Model (FPSSM). The fuzzy function is used to handle the uncertainty in a parametric sample
selection model (PSSM), and the fuzzy estimates are then used in the equations of the sample selection
model. Finally, the estimates are summarised by their mean, standard deviation (SD) and root mean
square error (RMSE).

Keywords: Econometric, Fuzzy Number, Heckman Two-Step Estimator, Married Women, Sample
Selection Model.


1. Introduction

Models of female labour force participation and wage equations in labour economics were
introduced by Gronau (1974) and Heckman (1976, 1979). This study analyses the Malaysian
Population and Family Survey 1994 (MPFS-1994); the original sample data are based on Mroz
(1987). Data were collected by questionnaire from 4444 women heads of households. The survey
was conducted by the National Population and Family Development Board of Malaysia under the
Ministry of Women, Family and Community Development of Malaysia.
In this study, Heckman's two-step estimator (1979) is compared with a modified Heckman
two-step estimator for different samples. The models are also estimated with a fuzzy modelling
approach to the sample selection model, called the Fuzzy Parametric Sample Selection Model
(FPSSM). The empirical results of the Monte Carlo experiment are reported in the form of the
mean, standard deviation (SD) and root mean square error (RMSE).
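
The three summary statistics named above can be sketched as follows. This is a minimal illustration using the conventional definitions, which the paper does not spell out: the function name `summarize_estimates`, the replicated values in `draws` and the true value 0.5 are all hypothetical.

```python
import math
from statistics import mean, stdev


def summarize_estimates(estimates, true_value):
    """Return (mean, SD, RMSE) of replicated estimates of one parameter.

    Assumption: RMSE is measured against a known true value, as is usual
    in a Monte Carlo experiment; SD is the sample standard deviation.
    """
    m = mean(estimates)
    sd = stdev(estimates)  # n - 1 denominator
    rmse = math.sqrt(mean((e - true_value) ** 2 for e in estimates))
    return m, sd, rmse


# Hypothetical replications of a coefficient whose true value is 0.5.
draws = [0.48, 0.52, 0.47, 0.55, 0.50]
m, sd, rmse = summarize_estimates(draws, true_value=0.5)
print(f"mean={m:.4f}  SD={sd:.4f}  RMSE={rmse:.4f}")
```

The RMSE combines bias and dispersion, so it can exceed the SD when the estimator is biased.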


September Issue Page 13 of 105 ISSN 2229 5216



2. The Model

The discussion of sample selection in the economic literature was started by Roy (1951), who
discussed how the traditional econometric approach to the selection model adopts a more
conservative approach and allows for selection on unobservables. Several methods have been
proposed to cope with this problem. The approach was later extended by Gronau (1974), Heckman
(1974) and Lewis (1974), who discussed sample selection bias in the context of women's decision
whether or not to participate in the labour force (Schafgans, 2000).
The most widely used procedure is the two-stage one suggested by Heckman (1976, 1979): in the first
stage a process describing the employment outcome is estimated, and information from it is used in the
second stage to obtain consistent estimates of the relevant parameters. Heckman (1976) proposed the
PSSM as follows:


\[
\begin{aligned}
Y_i^* &= X_i'\beta_0 + u_i \\
d_i &= \begin{cases} 1 & \text{if } d_i^* = W_i'\beta_1 + v_i > 0 \\ 0 & \text{otherwise} \end{cases} \\
Y_i &= Y_i^* d_i
\end{aligned}
\tag{1}
\]

where
$Y_i^*$, $d_i^*$ = dependent variables
$X_i$, $W_i$ = independent variables (vectors of exogenous variables)
$\beta_0$, $\beta_1$ = unknown parameter vectors
$u_i$, $v_i$ = error terms

In Equation (1), the error terms $(u_i, v_i)$ are usually correlated, so estimates of $\beta_0$ and
$\beta_1$ based only on the regression of the dependent variable $Y$ on the independent variable $X$
are unsatisfactory or inconsistent. To address this problem, the error terms are assumed to follow a
bivariate normal distribution.
According to Schafgans (1996), Markus (1998) and Martins (2001), Equation (1) has two parts. The
first part is the participation equation (a binary decision equation, also called the selection
equation). The second part is the wage equation (the outcome equation). The vector $W_i$ usually
contains at least one variable that does not appear in $X_i$. The outcome equation describes the
relationship between the dependent variable $Y_i$ and the independent variables $X_i$, whereas the
selection equation describes the relationship between the dependent variable $d_i$ and the
independent variables $W_i$.
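
Under the bivariate normality assumption, the standard textbook form of Heckman's correction uses the inverse Mills ratio of the selection index. The sketch below is illustrative only and is not taken from the paper: the names `inverse_mills_ratio`, `selection_correction`, `rho` (the correlation between the error terms) and `sigma_u` (the SD of the outcome error) are assumptions, and the selection error is assumed to have unit variance.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def inverse_mills_ratio(index):
    """lambda(index) = pdf(index) / cdf(index) for the standard normal."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)


def selection_correction(index, rho, sigma_u):
    """E[u_i | d_i = 1] = rho * sigma_u * lambda(index).

    `index` stands for the selection index W_i' beta_1. Assumes (u, v) is
    bivariate normal with Var(v) = 1; rho and sigma_u are hypothetical inputs.
    """
    return rho * sigma_u * inverse_mills_ratio(index)


# With uncorrelated errors (rho = 0) the correction term vanishes.
print(selection_correction(0.5, rho=0.0, sigma_u=1.0))
print(inverse_mills_ratio(0.0))
```

In the two-step procedure this term is added as an extra regressor in the outcome equation, which is why a non-zero correlation between the error terms matters.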


3. Fuzzy Modeling

This research discusses a modified Parametric Sample Selection Model (PSSM) for non-participants
that applies a fuzzy concept. Fuzzy modelling is based on the concept of fuzzy sets. The FPSSM is
developed in three phases: fuzzification of the parameters, the fuzzy environment, and
defuzzification. In the fuzzification stage, the crisp values of the input variables are converted
into fuzzy values through the membership functions of fuzzy sets. The method used in this study is
the alpha-cut, applied to a triangular fuzzy number for every observation. The alpha-cut values run
from 0.2 to 0.8 in increments of 0.2 and are applied to the triangular membership function. Using
the alpha-cut, the fuzzy values of (x, w, y) are obtained as follows:


\[
\tilde{x}_i = (x_{il}, x_{im}, x_{iu}), \quad
\tilde{y}_i^* = (y_{il}, y_{im}, y_{iu}) \quad \text{and} \quad
\tilde{w}_i = (w_{il}, w_{im}, w_{iu})
\tag{2}
\]



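The membership functions are as follows:

\[
\mu_{\tilde{x}_i}(x) =
\begin{cases}
\dfrac{x - x_{il}}{x_{im} - x_{il}} & \text{if } x \in [x_{il}, x_{im}] \\[2ex]
1 & \text{if } x = x_{im} \\[1ex]
\dfrac{x_{iu} - x}{x_{iu} - x_{im}} & \text{if } x \in [x_{im}, x_{iu}] \\[2ex]
0 & \text{otherwise}
\end{cases}
\tag{3}
\]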
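
The triangular membership function of Equation (3) can be sketched in a few lines. The function name `triangular_membership` and the example values (a fuzzy age with lower, middle and upper values 30, 33 and 36) are hypothetical and chosen only for illustration.

```python
def triangular_membership(x, lower, mode, upper):
    """Membership degree of x in the triangular fuzzy number (lower, mode, upper)."""
    if not lower <= mode <= upper:
        raise ValueError("expected lower <= mode <= upper")
    if x == mode:
        return 1.0
    if lower <= x < mode:
        return (x - lower) / (mode - lower)   # rising edge
    if mode < x <= upper:
        return (upper - x) / (upper - mode)   # falling edge
    return 0.0                                # outside the support


# Hypothetical fuzzy age centred on 33 years.
for age in (29.0, 30.0, 31.5, 33.0, 34.5, 36.0):
    print(age, triangular_membership(age, 30.0, 33.0, 36.0))
```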

The same formula applies to $\mu_{\tilde{y}_i^*}(y)$ and $\mu_{\tilde{w}_i}(w)$, which can be written
in the same way as $\mu_{\tilde{x}_i}(x)$.
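
For a triangular fuzzy number, the alpha-cut is the interval of values whose membership degree is at least alpha. The sketch below uses the standard closed form for that interval; how the study builds the triangular numbers from the survey data is not described in the text, so the name `alpha_cut` and the example triple (30, 33, 36) are assumptions.

```python
def alpha_cut(lower, mode, upper, alpha):
    """Return the interval [left, right] where membership >= alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not lower <= mode <= upper:
        raise ValueError("expected lower <= mode <= upper")
    left = lower + alpha * (mode - lower)
    right = upper - alpha * (upper - mode)
    return left, right


# The four alpha levels used in the study: 0.2 to 0.8 in steps of 0.2.
for alpha in (0.2, 0.4, 0.6, 0.8):
    print(alpha, alpha_cut(30.0, 33.0, 36.0, alpha))
```

The interval narrows towards the middle value as alpha increases, so higher alpha levels correspond to less spread in the fuzzified data.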

To construct the PSSM, the first step is defuzzification, that is, converting the triangular fuzzy
membership values into crisp values. The centroid (centre of gravity) method is used to calculate the
crisp output as the centre of the area under the membership curve. Let $X_{ic}$, $Y_{ic}$ and $W_{ic}$
be the defuzzified values of $\tilde{x}_i$, $\tilde{y}_i^*$ and $\tilde{w}_i$, respectively. The
centroid formula is shown for $X_{ic}$; $Y_{ic}$ and $W_{ic}$ are obtained analogously:


\[
X_{ic} = \frac{\int x\,\mu_{\tilde{x}_i}(x)\,dx}{\int \mu_{\tilde{x}_i}(x)\,dx}
       = \frac{1}{3}\,(X_{il} + X_{im} + X_{iu})
\tag{4}
\]
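
As a check on Equation (4), the closed-form centroid can be compared with a direct numerical evaluation of the ratio of integrals. This is an illustrative sketch: the function names and the example triple (30, 32, 37) are hypothetical, and the numerical version assumes a strictly increasing triple.

```python
def centroid_triangular(lower, mode, upper):
    """Closed-form centroid of a triangular fuzzy number, as in Equation (4)."""
    return (lower + mode + upper) / 3.0


def centroid_numeric(lower, mode, upper, steps=20000):
    """Midpoint-rule approximation of (integral of x*mu) / (integral of mu)."""
    if not lower < mode < upper:
        raise ValueError("expected lower < mode < upper")
    width = (upper - lower) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width
        if x <= mode:
            mu = (x - lower) / (mode - lower)
        else:
            mu = (upper - x) / (upper - mode)
        numerator += x * mu * width
        denominator += mu * width
    return numerator / denominator


print(centroid_triangular(30.0, 32.0, 37.0))
print(centroid_numeric(30.0, 32.0, 37.0))
```

The two values agree closely, which confirms that the simple average of the three vertices is the centre of gravity of the triangle.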



4. Data Description and Variables Used
4.1 Data Description
The data used in this study come from the Malaysian Population and Family Survey 1994 (MPFS-1994),
conducted by the National Population and Family Development Board of Malaysia under the Ministry of
Women, Family and Community Development of Malaysia. The survey was administered by questionnaire
to a random sample consisting specifically of married women. The data set provides information on
wages, educational attainment, household composition and other socioeconomic characteristics.
The original sample, based on Mroz (1987), contains 4444 records of married women. Only married
women with complete information are considered. Records with incomplete information are grouped into
an invalid-data file and records with complete information into a valid-data file, and only the valid
data are processed further. After this selection, 1850 records of married women remain as valid data.
The valid data are then divided into participant and non-participant data, following Martins (2001).
The criteria for selecting the sample of participant and non-participant married women (MPFS-1994) are:
- Husband present in 1994
- Not in school or retired
- Married and aged below 60
- Husband reported positive earnings for 1994.
After selection, 1159 women (62.65%) are classified as participants and 691 women (37.35%) are
classified as non-participants.
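
The reported shares follow directly from the counts in the text, as this short arithmetic check shows (variable names are illustrative):

```python
valid_records = 1850   # married women with complete information
participants = 1159    # classified as participants
non_participants = valid_records - participants

share_participants = 100.0 * participants / valid_records
share_non_participants = 100.0 * non_participants / valid_records

print(f"participants:     {participants} ({share_participants:.2f}%)")
print(f"non-participants: {non_participants} ({share_non_participants:.2f}%)")
```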

4.2 Fuzzy Variable Used
From the MPFS-1994 data set, which contains the non-participant data, some exogenous variables are
treated with fuzzy concepts. Following Martins (2001), the variables are defined as follows:

$\widetilde{AGE}$ is the married woman's age (in years) divided by 10
$\widetilde{AGE2}$ is age squared divided by 100
EDU is the educational level, measured in years of schooling
CHILD is the number of children younger than 18 living in the family
$\widetilde{HW}$ is the log of the husband's monthly wage (measured in Ringgit Malaysia)
$\widetilde{PEXP}$ is potential work experience, defined as age minus 6 minus years of schooling
$\widetilde{PEXP2}$ is potential work experience squared
$\widetilde{PEXPCHD}$ is potential work experience times the number of children
$\widetilde{PEXPCHD2}$ is potential work experience squared, times the number of children
$\widetilde{HWW}$ or $\tilde{Y}$ is the log of the woman's hourly wage rate (measured in Ringgit Malaysia)
d is the indicator of labour market non-participation
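
The definitions above can be collected into a small helper that builds the crisp regressors for one woman before fuzzification. This is a sketch under stated assumptions: the function name `build_regressors` and the example inputs are hypothetical, "log" is taken to mean the natural logarithm, and potential experience is left unscaled as in the list above.

```python
import math


def build_regressors(age, edu, children, husband_monthly_wage):
    """Crisp regressors for one record, following the variable definitions."""
    pexp = age - 6 - edu  # potential work experience in years
    return {
        "AGE": age / 10.0,
        "AGE2": age ** 2 / 100.0,
        "EDU": float(edu),
        "CHILD": float(children),
        "HW": math.log(husband_monthly_wage),  # assumed natural log
        "PEXP": float(pexp),
        "PEXP2": float(pexp ** 2),
        "PEXPCHD": float(pexp * children),
        "PEXPCHD2": float(pexp ** 2 * children),
    }


# Hypothetical record: 34 years old, 10 years of schooling, 2 children,
# husband earning RM 1500 per month.
record = build_regressors(age=34, edu=10, children=2, husband_monthly_wage=1500.0)
for name, value in record.items():
    print(f"{name:9s} {value:.4f}")
```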


4.3 Endogenous Variables

The category of non-participants in the labour market includes individuals who are either
self-employed (family business or farming) or exclusively engaged in non-market home production
(Schafgans, 1996). Malays formed the largest group among both married women participants and
non-participants in the labour market, with 616 (22.1%) and 1735 (62.1%) respectively, followed
by Chinese with 353 (12.6%) and 717 (25.7%), Indians with 107 (3.8%) and 242 (8.7%), and other
races with 24 (0.9%) and 98 (3.6%).
The first dependent variable, non-participation, is a dichotomous indicator that equals 1 if an
individual is a non-participant and 0 otherwise. The second dependent variable is the log of
hourly wages (HW) in the wage equation. In Malaysia, remuneration other than basic wages, for
instance allowances and bonuses, is an important part of total earnings (Mazumdar, 1991).
Schafgans (1996) therefore notes that bonuses and payments in kind (for instance food and
housing) are included in the computation of hourly wages.


4.4 Fuzzy Exogenous Variables

Exogenous variables form the second group of variables and may enter both the non-participation
and the outcome equations. For example, the variable EDU appears in both the non-participation
and the outcome equations, whereas AGE appears only in the non-participation equation and
potential experience appears only in the outcome equation. Details of the exogenous variables
are as follows:

$\widetilde{AGE}$
The average age of married women non-participants is 33.72 years, while that of married women
participants is 33.24 years. This result accords with Schafgans (1996), in which women participants
are on average younger than women non-participants. It is consistent with the increasing importance
of the wage sector in Malaysia, particularly among younger, educated individuals.

EDU
EDU is the educational level. This variable measures the number of school years required to obtain the
highest grade completed, and it is treated as a continuous variable. No measure is available of the
actual number of years each individual needed to achieve the level completed (Schafgans, 1996). The
average EDU of married women non-participants is 7.35 years, while that of married women participants
is 10.96 years.




$\widetilde{PEXP}$
The average PEXP of married women non-participants is 23.85 years, while that of married women
participants is 15.43 years.


5. Fuzzy Participation Equation and Fuzzy Wage Equation

According to Greene (1997), the model consists of two equations. The first is the participation
equation, i.e. the probability that a married woman does not participate in the labour market.
Its independent variables are $\widetilde{AGE}$ (age in years divided by 10), $\widetilde{AGE2}$ (age
squared divided by 100), EDU (years of education), CHILD (the number of children under 18 living in
the family), and $\widetilde{HW}$ (log of the husband's monthly wage). The dependent variable in the
participation equation is a dummy variable that takes the value 1 if the woman is a non-participant
and zero otherwise. Wages are determined by the standard human capital approach, with potential
experience (given by age - EDU - 6) computed from the dataset, following the solution of Buchinsky
(1998). The fuzzy participation equation is:



\[
z_i = \beta_0 + \beta_1 \widetilde{AGE} + \beta_2 \widetilde{AGE2} + \beta_3 EDU + \beta_4 CHILD
      + \beta_5 \widetilde{HW} + \varepsilon
\tag{5}
\]
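
To make Equation (5) concrete, the sketch below evaluates the linear part of the participation index for one hypothetical woman, using the coefficients reported in Table 1 for the alpha-cut of 0.2. The constant term is not reported in the table, so it is omitted here and the result is only a partial index; the function name `partial_index` and the regressor values are assumptions.

```python
# Coefficients from Table 1, alpha-cut = 0.2 (constant not reported, so omitted).
COEF_ALPHA_02 = {
    "AGE": 0.31812,
    "AGE2": -0.05469,
    "EDU": 0.00570,
    "CHILD": -0.01217,
    "HW": 0.01646,
}


def partial_index(regressors, coefficients):
    """Sum of coefficient * regressor over the listed variables (no intercept)."""
    return sum(coefficients[name] * regressors[name] for name in coefficients)


# Hypothetical woman: age 30 (AGE = 3.0, AGE2 = 9.0), 10 years of schooling,
# 2 children, log husband's monthly wage of 7.0.
woman = {"AGE": 3.0, "AGE2": 9.0, "EDU": 10.0, "CHILD": 2.0, "HW": 7.0}
print(round(partial_index(woman, COEF_ALPHA_02), 5))
```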


The second equation is the wage equation. Its explanatory variables are EDU, $\widetilde{PEXP}$,
$\widetilde{PEXP2}$, $\widetilde{PEXPCHD}$ and $\widetilde{PEXPCHD2}$.
The fuzzy wage equation is as follows:


\[
z_j = \beta_0 + \beta_1 \widetilde{PEXP} + \beta_2 \widetilde{PEXP2} + \beta_3 \widetilde{PEXPCHD}
      + \beta_4 \widetilde{PEXPCHD2}
\tag{6}
\]

where

\[
d_j = 1(z_j > 0)
\]

The independent variables are EDU, $\widetilde{PEXP}$ (potential experience divided by 10),
$\widetilde{PEXP2}$ (potential experience squared divided by 100), $\widetilde{PEXPCHD}$
($\widetilde{PEXP}$ interacted with the total number of children) and $\widetilde{PEXPCHD2}$
($\widetilde{PEXP2}$ interacted with the total number of children). The dependent variable used
for the analysis is the log hourly wage (z).

If Participate = 1, the outcome equation is observed, as follows:

\[
\ln(Y) = \beta_0 + \beta_1 \widetilde{PEXP} + \beta_2 \widetilde{PEXP2} + \beta_3 \widetilde{PEXPCHD}
         + \beta_4 \widetilde{PEXPCHD2} + \varepsilon
\tag{7}
\]


The crisp variables in this study are EDU and CHILD. The remaining data are assumed to contain
uncertainty, so fuzzy data are more appropriate for them than crisp data. In the participation
equation, fuzzy data are used for the independent variables (x). In the outcome equation, fuzzy data
are used for the dependent variable, the log hourly wage (z).








6. Result Simulation
6.1 The Fuzzy Participation Equation in the Labour Market

Table 1 shows the fuzzy participation equation of the FPSSM for α-cuts of the triangular fuzzy
number at values of 0.2, 0.4, 0.6 and 0.8. The parameters reported are the coefficient (Coef.), SD
and RMSE. The first column lists the variables $\widetilde{AGE}$, $\widetilde{AGE2}$, EDU, CHILD and
$\widetilde{HW}$; the second column lists the parameters; the third column gives the results of the
participation equation; and the remaining four columns give the results for the α-cuts of 0.2, 0.4,
0.6 and 0.8.

Table 1. Fuzzy Participation Equation

Variable                Parameter  Participation  Fuzzy Participation Equation
                                   Equation       α = 0.2    α = 0.4    α = 0.6    α = 0.8
$\widetilde{AGE}$       Coef.       0.42953        0.31812    0.26852    0.23963    0.21151
                        SD          0.09580        0.07780    0.06780    0.05260    0.03570
                        RMSE        0.00223        0.00181    0.00158    0.00122    0.00083
$\widetilde{AGE2}$      Coef.      -0.06976       -0.05469   -0.04803   -0.04427   -0.04078
                        SD          0.01370        0.01140    0.01010    0.00830    0.00640
                        RMSE        0.00032        0.00026    0.00024    0.00019    0.00015
EDU                     Coef.       0.01480        0.00570    0.00422    0.00285    0.00153
                        SD          0.00260        0.00100    0.00070    0.00050    0.00020
                        RMSE        0.00006        0.00002    0.00002    0.00001    0.00001
CHILD                   Coef.      -0.03407       -0.01217   -0.00878   -0.00563   -0.00261
                        SD          0.00700        0.00270    0.00200    0.00130    0.00060
                        RMSE        0.00016        0.00006    0.00005    0.00003    0.00001
$\widetilde{HW}$        Coef.       0.01605        0.01646    0.01706    0.01703    0.01194
                        SD          0.01060        0.01010    0.00970    0.00870    0.00620
                        RMSE        0.00025        0.00023    0.00022    0.00020    0.00014
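
The text does not state how the RMSE column was computed. As a purely numerical observation, and not a formula documented by the authors, the reported values are close to the SD divided by the square root of the 1850 valid records. The sketch below checks this for the AGE row of Table 1; the name `sd_over_root_n` is illustrative.

```python
import math

N_VALID = 1850  # number of valid records reported in Section 4.1

# (SD, reported RMSE) pairs from the AGE row of Table 1, left to right.
AGE_ROW = [
    (0.09580, 0.00223),
    (0.07780, 0.00181),
    (0.06780, 0.00158),
    (0.05260, 0.00122),
    (0.03570, 0.00083),
]


def sd_over_root_n(sd, n=N_VALID):
    """SD scaled by the square root of the sample size."""
    return sd / math.sqrt(n)


for sd, reported in AGE_ROW:
    print(f"SD={sd:.5f}  SD/sqrt(n)={sd_over_root_n(sd):.5f}  reported RMSE={reported:.5f}")
```

If this relationship is what the authors intended, the column behaves like a standard error of the mean rather than an RMSE against a known true value; the paper itself does not settle this.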


Table 1 shows the results of the fuzzy participation equation of the FPSSM for α-cuts of 0.2, 0.4,
0.6 and 0.8. There is a positive relationship between the level of education and the
non-participation of married women in the participation equation, with coefficients of 0.00570,
0.00422, 0.00285 and 0.00153. This means that the higher the level of education, the higher the
income value of non-participating married women in the labour market: a one-unit increase in EDU
increases income by 0.00570, 0.00422, 0.00285 and 0.00153 for α-cuts of 0.2, 0.4, 0.6 and 0.8,
respectively. It can therefore be said that, in the fuzzy participation equation of the FPSSM,
non-participating married women with higher EDU have a higher chance of obtaining income. Besides
EDU, the variables with a positive relationship are $\widetilde{AGE}$ and $\widetilde{HW}$.



6.2 The Fuzzy Wage Equation in the Labour Market

Table 2 shows the fuzzy wage equation of the FPSSM for α-cuts of the triangular fuzzy number at
values of 0.2, 0.4, 0.6 and 0.8. The parameters reported are the coefficient (Coef.), SD and RMSE.
The first column lists the variables EDU, $\widetilde{PEXP}$, $\widetilde{PEXP2}$,
$\widetilde{PEXPCHD}$ and $\widetilde{PEXPCHD2}$; the second column lists the parameters; the third
column gives the results of the wage equation; and the remaining four columns give the results for
the α-cuts of 0.2, 0.4, 0.6 and 0.8.


Table 2. Fuzzy Wage Equation

Variable                Parameter  Wage           Fuzzy Wage Equation
                                   Equation       α = 0.2    α = 0.4    α = 0.6    α = 0.8
EDU                     Coef.       0.00926        0.00336    0.00251    0.00214    0.00147
                        SD          0.00290        0.00110    0.00080    0.00050    0.00020
                        RMSE        0.00007        0.00003    0.00002    0.00001    0.00001
$\widetilde{PEXP}$      Coef.       0.18050        0.08075    0.05838    0.08287    0.09514
                        SD          0.04330        0.04310    0.04310    0.04310    0.04310
                        RMSE        0.00101        0.00100    0.00100    0.00100    0.00100
$\widetilde{PEXP2}$     Coef.      -5.25350       -3.69920   -3.17940   -3.43110   -3.08890
                        SD          1.10220        1.08050    1.07700    1.07030    1.05720
                        RMSE        0.02564        0.02513    0.02505    0.02490    0.02459
$\widetilde{PEXPCHD}$   Coef.      -0.02008       -0.01186   -0.00938   -0.00983   -0.00359
                        SD          0.01180        0.01160    0.01160    0.01160    0.01160
                        RMSE        0.00027        0.00027    0.00027    0.00027    0.00027
$\widetilde{PEXPCHD2}$  Coef.       0.00189        0.00346    0.00278    0.00306    0.00111
                        SD          0.00420        0.00380    0.00380    0.00380    0.00380
                        RMSE        0.00010        0.00009    0.00009    0.00009    0.00009

Table 2 shows that there is a positive relationship between the variables EDU, $\widetilde{PEXP}$
and $\widetilde{PEXPCHD2}$ and the income of non-participating married women. For example, a
one-unit increase in EDU changes the income of non-participating married women by 0.00336, 0.00251,
0.00214 and 0.00147 for α-cuts of 0.2, 0.4, 0.6 and 0.8, respectively. It can therefore be said that
the income of non-participating married women tends to increase with EDU, PEXP (potential work
experience, defined as age minus 6 minus years of schooling) and PEXPCHD2 (potential work experience
squared, times the number of children).

7. Conclusion and Discussion

The results indicate that EDU is a factor affecting the wages of non-participating married women in
the labour market: wages tend to increase with EDU. The other variables that affect the income of
non-participating married women in the participation equation are $\widetilde{AGE}$ and
$\widetilde{HW}$ (the log of the husband's monthly wage). In the wage equation, the variables that
affect the wage income of non-participating married women are EDU, $\widetilde{PEXP}$ (potential work
experience, defined as age minus 6 minus years of schooling) and $\widetilde{PEXPCHD2}$ (potential
work experience squared, times the number of children).


8. References

[1] Amemiya, T., Tobit Models: A Survey, Journal of Econometrics, 24, p. 3-61, 1984.
[2] Buchinsky, M., The Dynamics of Changes in the Female Wage Distribution in the USA: A
Quantile Regression Approach, Journal of Applied Econometrics, 13, p. 1-30, 1998.
[3] Greene, W., A General Approach to Incorporating Selectivity in a Model, Department of
Economics, Stern School of Business, New York University, 2006.
[4] Greene, W., Sample selection in credit-scoring models, Japan and the World Economy, 10,
p. 299-316, 1998.

[5] Greene, W. H., Econometric Analysis, Fifth Edition, Pearson Education, Inc., 2003.
[6] Gronau, R., Wage comparisons: A selectivity bias, Journal of Political Economy, 82, p. 1119-
1143, 1974.
[7] Heckman, J.J., Shadow prices, market wages and labor supply, Econometrica, 42, p. 679-694,
1974.
[8] Heckman, J.J., The Common Structure of Statistical Models of Truncation, Sample Selection, and
Limited Dependent Variables, and a Simple Estimator for such Models, Annals of Economic
and Social Measurement, 5, p. 475-492, 1976.
[9] Heckman, J.J., Sample selection bias as a specification error, Econometrica, 47, p. 153-161,
1979.
[10] L. Muhamad Safiih, A.A. Basah Kamil, M. T. Abu Osman, Fuzzy Semi-parametric Sample
Selection Model Case Study for Participation of Married Women, WSEAS Transactions on
Mathematics, 7, 2008.
[11] Lee, L.F., Unionism and relative wage rates: A simultaneous equations model with qualitative and
limited dependent variables, International Economic Review, 19, p. 415-433, 1978.
[12] Lewis, H.G., Comments on selectivity biases in wage comparisons, Journal of Political
Economy, 82, p. 1145-1155, 1974.
[13] Maddala, G.S., Limited-dependent and qualitative variables in econometrics, Cambridge
University Press, p. 257-289, 1983.
[14] Maddala, G.S., Econometric methods and applications, Volume II, England, Edward Elgar,
1994.
[15] Mroz, T. A., The Sensitivity of an Empirical Model of Married Women's Hours of Work to
Economic and Statistical Assumptions, Stanford University London. UK, 1984.
[16] Mroz, T. A., The sensitivity of an empirical model of married women's hours of work to
economic and statistical assumptions, Econometrica, 55, p. 765-799, 1987.
[17] Nawata, K., A Note on the Estimation of Models With Sample-Selection Biases, Economics
Letters, 42, p. 15-24, 1993.
[18] Nawata, K., Estimation of Sample Selection Bias Models by the Maximum Likelihood Estimator and
Heckman's Two-Step Estimator, Economics Letters, 45, p. 33-40, 1994.
[19] Nawata, K., Estimation of sample selection bias Models, p. 387-400, 1996.
[20] Nawata, K., Estimation of the Female Labor Supply Models by Heckman's Two-Step Estimator
and the Maximum Likelihood Estimator, Mathematics and Computers in Simulation, 64, p. 385-392,
2004.
[21] Nelson, F. D., Efficiency of the Two-step Estimator for Models with Endogenous Sample
Selection, Journal of Econometrics, 24, p. 181-196, 1984.
[22] Newey, W., Two step series estimation of sample selection models, Department of Economics,
MIT working paper no. E52 - 262D, p. 1-17, 1988.
[23] Paarsch, H. J., A Monte Carlo Comparison of Estimators for Censored Regression Models,
Journal of Econometrics, 24, p. 197-213, 1984.
[24] Vella, F., Estimating models with sample selection bias: A survey, Journal of Human Resources,
33, p. 127-169, 1998.
[25] Wales, T. J., Woodland, A. D., Sample selectivity and the estimation of labor supply functions,
International Economic Review, 21, p. 437-468, 1980.
[26] Zadeh, L.A., On the analysis of large-scale systems. In: Gottinger, H., ed., Systems Approaches
and Environment Problems. Vandenhoeck and Ruprecht, 1974.

Authors' Profile

Yaya Sudarya Triana is currently a Ph.D. student in Econometrics at University Malaysia Terengganu. He
received a Master's degree in Information Technology from the University of Indonesia, Jakarta,
Indonesia, and an undergraduate degree in Statistics from the University of Padjadjaran, Bandung,
Indonesia.


Dr. Muhamad Safiih bin Lola received his Ph.D. degree in Econometrics from University Science Malaysia.
He received a Master's degree in Applied Statistics from University Putra Malaysia and an undergraduate
degree in Economics from Northern University of Malaysia.