A study presented to
Ms. Angela D. Nalica
Professor, Stat 136
Danan, Rustico IV
Ingking, Eugene Creely
Vitug, Marianne
April 2015
I. ABSTRACT
This paper aims to explain the factors affecting the variability of the enrolment rate in
basic education in the Philippines. Using SAS, the enrolment rate was regressed on 15 variables
initially considered as indicators of enrolment. In arriving at the best model, the researchers
detected problems with outliers. Remedial measures such as deletion of observations and
transformation of variables were performed, and the model was checked for compliance with the
regression assumptions. The variables that showed a positive linear relationship with the
enrolment rate are percentage of teachers, total financial resources, percentage of fully
immunized children (9-12 months), and percentage of private motor vehicles, while cohort
survival rate showed a negative linear relationship. The resulting model had an R² value of
0.9544, and all five variables are significant predictors of the enrolment rate in basic education.
II. INTRODUCTION
The Philippines has committed itself to 8 Millennium Development Goals (MDGs) by
2015, one of which is to provide universal access to primary education. The government
acknowledges the potential of basic education as an empowering process that enables the
individual to function, to achieve, to be law-abiding, to participate intelligently in elections,
and to have a better sense of nation and community (Sen, 1999).
However, according to the Philippine Midterm Progress Report on the MDGs released in
2007, the government assessed the probability of attaining a 100% enrolment rate in primary
institutions to be low. Also, according to UNDP, from a near-universal rate of 97% in 1999/2000,
as measured by the net enrolment rate (NER), participation in elementary education dropped to
83.2% in 2006/07, the lowest over the last two decades. Although it increased marginally to
84.8% in 2007, this rate of progress is too slow to achieve universal access to basic
education by 2015. On the other hand, as reported by UNICEF in 2005, the net enrolment rate in
secondary education reached 60.7% from 66.1%.
Although the compulsory minimum education law is already enforced, other factors still
affect the enrolment rate. A study by Tullao and Rivera (2009) suggests that the factors
affecting the enrolment rate in basic education can be divided into supply and demand factors.
Supply factors include the capability of the government to cater to pupils and students, the
number of teachers, school facilities, the number of books, school supplies, and other
educational inputs (Tullao and Rivera, 2009). Demand factors refer to the household's decision
to avail of educational services. This can be measured through household income, cost of
education, and health factors. At the provincial level, these factors can be measured through
total financial resources, the number of private cars, and the number of fully immunized
children. It is also important to note the Cohort Survival Rate per province as one of the
indicators that might affect the enrolment rate.
To address the problems in the enrolment rate, the researchers aim to provide a model that
identifies and explains the significant factors behind its variability. The researchers also
aim to relate this model to the recently implemented K-12 educational system.
III. RELATED LITERATURE
In September 2000, the Philippine government was one of the 189 member states that
committed to achieving 8 Millennium Development Goals (MDGs) by 2015. One of these goals is
to achieve universal access to primary education.
Education plays a significant role in economic development. It is widely acknowledged
that education increases the innovative capacity of an economy and facilitates the diffusion,
adoption, and adaptation of new ideas. More specifically, education increases the amount of
human capital available, thereby increasing productivity and ultimately output. Education is
especially important in a rapidly evolving economic environment where a rapid rate of job
destruction and creation might otherwise lead to a gap between the skills demanded in the labor
market and the skills of jobseekers (Yap, 2012).
The Department of Education receives one of the biggest budget allocations among other
sectors and agencies in the Philippines. This allows the government to create more schools,
increase the number of teachers, books and improve educational facilities.
Based on the Philippine net enrolment ratio in primary education for both sexes, the rate
was consistently high during the 1990s (going up to 97%), then decreased to 92.7% in 2001.
The lowest rate for the last two decades was reported to be 83.2% in 2007. In secondary
education, the enrolment rate was 52.6% in 2000 and eventually increased to 60% in 2005.
Although the budget allocation to basic education remains the biggest and the
compulsory education law is already enforced, family decisions still affect the ability of
children to go to school (King, 1983). The family demand for human capital can be attributed to
different economic and demographic factors (Tullao and Rivera, 2009).
As cited by Caete (n.d.), Maligalig and Albert (2008) presented in their analysis of the
2002 and 2004 APIS the different reasons for non-attendance in school of school-age children as
follows: (a) Cannot cope with school work; (b) High cost of education; (c) Illness and/or
disability; (d) Lack of personal interest; (e) Schools are far or there is no school within the
barangay; (f) Finished schooling; (g) Housekeeping chores; (h) No regular transportation from
house to school; and others. The poor health of pupils was also pointed out as one of the
factors that negatively affect school participation.
Currently, different programs are being implemented to remedy the problems in the
enrolment rate. One of these is the Pantawid Pamilyang Pilipino Program (4Ps). The aim of
4Ps is to provide human development capital and break intergenerational poverty cycles by
giving grants to poor families that comply with certain conditions. These conditions include
sending children to school and keeping them healthy.
Also, the government launched the K+12 program to better ensure the quality of
primary education in the Philippines. This program also aims to prepare every Filipino
citizen to be more globally competitive, in the hope of eradicating, if not lessening, poverty.
Governance: Financial Resources, Indexed Crime
Health: Safe Water Supply, Number of Sanitary Toilet, Fully Immunized Children
Others: Telephone Line Subscribers, Private Motor Vehicles, Number of Energized Barangays
Family Income (INCOME): average family annual income obtained by dividing the
total income of families by the total number of families. It includes primary income and
receipts from other sources received by all family members during the calendar year as
participants in any economic activity or as recipients of transfers, pensions, grants, etc.
Consumer Price Index (CPI): indicator of the change in the average prices of a fixed
basket of goods and services commonly purchased by households relative to a base year.
Poverty Incidence of Families (POV): proportion of families whose annual per capita
income falls below the annual per capita poverty threshold.
Private Bank Deposit (BANKDEP): total private commercial bank deposits in million
pesos.
Education
Cohort Survival Rate (COHORT): percentage of enrolees at the beginning grade or year
in a given school year who reached the final grade or year of the elementary/secondary
level.
Governance
Percentage of Indexed Crime (CRIME): percentage of the total index crime with respect
to the total number of crimes.
Health
Safe Water Supply (WATER): number of households with access to clean water.
Number of Sanitary Toilet (TOILET): number of sanitary toilets in the province. A sanitary
toilet is a covered installation, whether public or private, used for the disposal of waste.
missing cells were deleted prior to importing into the statistical package. The remaining 72
observations (provinces) and 15 independent variables were organized in a spreadsheet and
included in the development of the model. The output for the full model, used to determine which
predictor variables significantly explain the variation in the response variable, the enrolment
rate of public elementary and secondary schools (ENROL), is given below:
Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               15        8941.42420      596.09495      71.37    <.0001
Error               56         467.71078        8.35198
Corrected Total     71        9409.13498

Root MSE            2.88998    R-Square    0.9503
Dependent Mean     27.03254    Adj R-Sq    0.9370
Coeff Var          10.69074
Parameter Estimates

                                  Parameter      Standard
Variable     Label        DF       Estimate         Error    t Value    Pr > |t|
Intercept    Intercept     1       25.91442      13.71744       1.89      0.0641
INCOME       INCOME        1     0.00001823    0.00001427       1.28      0.2069
BANKDEP      BANKDEP       1     0.00000598    0.00003190       0.19      0.8521
CPI          CPI           1        0.19485       0.09621       2.03      0.0476
CAR          CAR           1        0.08053       0.04363       1.85      0.0702
ENERG        ENERG         1     0.00090835       0.00167       0.54      0.5893
TEACH        TEACH         1       24.45408       2.48649       9.83      <.0001
PHONE        PHONE         1        0.01559       0.05540       0.28      0.7794
FINRES       FINRES        1        0.00222    0.00096923       2.29      0.0260
WATER        WATER         1     0.00000786    0.00000664       1.18      0.2412
TOILET       TOILET        1     0.00000804    0.00000635       1.27      0.2103
CRIME        CRIME         1        0.01338       0.04273       0.31      0.7553
COHORT       COHORT        1        0.20181       0.04034       5.00      <.0001
PUBE         PUBE          1        0.00255       0.00209       1.22      0.2272
IMU          IMU           1        2.28500       0.54931       4.16      0.0001
POV          POV           1        0.09932       0.05785       1.72      0.0916
Since the F value is large (71.37, p < .0001), at least one independent variable in the
model helps explain the variability of the percentage of enrolment. Also, 95.03% of that
variability is explained by the model, as seen in the value of R².
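The summary statistics quoted above follow directly from the sums of squares in the analysis of variance output. As a minimal sketch (Python is used here only for illustration, since the study itself was run in SAS, and the function name `anova_summary` is hypothetical), they can be recomputed as follows:

```python
import math

def anova_summary(ss_model, ss_error, df_model, df_error, dep_mean):
    """Recompute the usual regression summary values from an ANOVA table."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "f_value": ms_model / ms_error,
        "r_square": ss_model / ss_total,
        "adj_r_square": 1 - ms_error / (ss_total / df_total),
        "root_mse": math.sqrt(ms_error),
        "coeff_var": 100 * math.sqrt(ms_error) / dep_mean,
    }

# Sums of squares, degrees of freedom, and dependent mean are copied from
# the full-model SAS output reported in this paper.
full = anova_summary(8941.42420, 467.71078, 15, 56, dep_mean=27.03254)
for name, value in full.items():
    print(f"{name}: {value:.4f}")
```

The recomputed values should agree with the SAS output up to rounding, which is a quick way to confirm that a flattened table has been read correctly.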
Based on the p-values of the independent variables and using alpha = .05, the significant
variables are the following: CPI, TEACH, FINRES, COHORT and IMU.
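The screening rule described above can be written out explicitly. This is only an illustrative sketch: the p-values are copied from the full-model output, and "<.0001" is encoded as the upper bound 0.0001, which is an assumption made so the values can be compared numerically.

```python
ALPHA = 0.05

# Pr > |t| for each predictor in the full model, as reported in the SAS
# output. "<.0001" is stored as 0.0001 (an upper bound, not the exact value).
p_values = {
    "INCOME": 0.2069, "BANKDEP": 0.8521, "CPI": 0.0476, "CAR": 0.0702,
    "ENERG": 0.5893, "TEACH": 0.0001, "PHONE": 0.7794, "FINRES": 0.0260,
    "WATER": 0.2412, "TOILET": 0.2103, "CRIME": 0.7553, "COHORT": 0.0001,
    "PUBE": 0.2272, "IMU": 0.0001, "POV": 0.0916,
}

# Keep the predictors whose individual t-test rejects at the chosen alpha.
significant = [name for name, p in p_values.items() if p < ALPHA]
print(significant)
```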
The researchers used Stepwise Variable Selection Procedure to identify the independent
variables that will be included in the reduced model.
Variable    Number     Partial      Model
Entered     Vars In    R-Square     R-Square    C(p)       F Value    Pr > F
TEACH       1          0.8559       0.8559      94.3347     415.79    <.0001
IMU         2          0.0424       0.8983      48.5626      28.77    <.0001
COHORT      3          0.0207       0.9190      27.2116      17.41    <.0001
FINRES      4          0.0148       0.9338      12.5716      14.95    0.0003
CAR         5          0.0075       0.9413       6.1276       8.43    0.0050
Using stepwise selection, the variables that are suggested to be included in the model are
TEACH, IMU, COHORT, FINRES and CAR.
The researchers regressed the percentage of enrolment on the variables suggested by the
variable selection procedure.
Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                5        8856.83842     1771.36768     211.68    <.0001
Error               66         552.29656        8.36813
Corrected Total     71        9409.13498

Root MSE            2.89277    R-Square    0.9413
Dependent Mean     27.03254    Adj R-Sq    0.9369
Coeff Var          10.70107
Parameter Estimates

                                  Parameter      Standard
Variable     Label        DF       Estimate         Error    t Value    Pr > |t|
Intercept    Intercept     1        2.93143       3.73332       0.79      0.4351
TEACH        TEACH         1       24.15975       2.21674      10.90      <.0001
IMU          IMU           1        2.25222       0.45582       4.94      <.0001
COHORT       COHORT        1        0.17085       0.03125       5.47      <.0001
FINRES       FINRES        1        0.00193    0.00059288       3.26      0.0018
CAR          CAR           1        0.11010       0.03793       2.90      0.0050
Based on R², 94.13% of the variability of the percentage of enrolment can be
explained by the reduced model. R² decreased by only about one percentage point from the
full model (0.9503 to 0.9413), implying that the unselected variables did not contribute much
to explaining the variability of the dependent variable.
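One way to back the claim that the ten dropped variables add little is a partial F-test comparing the two models. This test is not part of the reported SAS output; the sketch below is an illustrative addition that uses only the error sums of squares and degrees of freedom already shown, and the function name `partial_f` is hypothetical.

```python
def partial_f(sse_reduced, sse_full, df_error_reduced, df_error_full):
    """Partial F statistic for the predictors dropped from the full model."""
    dropped = df_error_reduced - df_error_full   # number of dropped predictors
    ms_error_full = sse_full / df_error_full
    return ((sse_reduced - sse_full) / dropped) / ms_error_full

# Error SS and error DF of the reduced (5-variable) and full (15-variable) models.
f_stat = partial_f(552.29656, 467.71078, 66, 56)
print(f"Partial F on (10, 56) df: {f_stat:.3f}")
```

A statistic close to 1 is what would be expected if the dropped coefficients were all zero; the exact p-value would come from the F distribution with 10 and 56 degrees of freedom.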
The reduced model has to undergo diagnostic checking in order for the researchers to
make sure that the assumptions are not violated.
MULTICOLLINEARITY
Figure 4. SAS Output for Variance Inflation, Condition Index and Proportion of Variation
Parameter Estimates

              Parameter      Standard                            Variance
Variable       Estimate         Error    t Value    Pr > |t|    Inflation
Intercept       2.93143       3.73332       0.79      0.4351            0
TEACH          24.15975       2.21674      10.90      <.0001      3.87897
IMU             2.25222       0.45582       4.94      <.0001      4.00322
COHORT          0.17085       0.03125       5.47      <.0001      1.05637
FINRES          0.00193    0.00059288       3.26      0.0018      1.27644
CAR             0.11010       0.03793       2.90      0.0050      1.11116

Collinearity Diagnostics

                       Condition   Proportion of Variation
Number   Eigenvalue        Index    Intercept      TEACH        IMU       COHORT     FINRES          CAR
1           5.54181      1.00000   0.00026347    0.00102    0.00135   0.00065372    0.00554   0.00040177
2           0.24075      4.79779   0.00001404    0.02649    0.02469   0.00020031    0.59191  6.397094E-7
3           0.17396      5.64413      0.00695    0.01349    0.08191      0.02342    0.19319      0.00874
4           0.02142     16.08425      0.00321    0.74283    0.76458      0.10107    0.13962      0.02502
5           0.01668     18.22808      0.02427    0.19045    0.08359      0.56590    0.01940      0.28631
6           0.00538     32.09414      0.96530    0.02573    0.04389      0.30876    0.05033      0.67953
Based on the variance inflation factors, none of the variables is suspected of having a
multicollinearity problem, since all VIFs are below 10. However, one value of the condition
index is greater than 30. The researchers then checked the proportions of variation on that
line for possibly collinear variables. Although CAR has a proportion of variation greater than
0.5, no other predictor on the same line has a proportion greater than 0.5 (only the intercept
does). From these observations, the researchers concluded that there is no multicollinearity
among the independent variables in the reduced model.
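The two diagnostics used above are simple functions of quantities in the SAS output. The sketch below is illustrative only: it recomputes each condition index as the square root of the largest eigenvalue divided by that dimension's eigenvalue, and inverts VIF = 1 / (1 - R²) to show how strongly each predictor is explained by the others. The eigenvalues and VIFs are copied from the output; everything else is an assumption of this example.

```python
import math

# Eigenvalues from the collinearity diagnostics of the reduced model.
eigenvalues = [5.54181, 0.24075, 0.17396, 0.02142, 0.01668, 0.00538]

# Condition index of each dimension: sqrt(largest eigenvalue / eigenvalue).
condition_indices = [math.sqrt(max(eigenvalues) / ev) for ev in eigenvalues]

# VIF_j = 1 / (1 - R_j^2), so each reported VIF implies the R^2 obtained
# when that predictor is regressed on the remaining predictors.
vifs = {"TEACH": 3.87897, "IMU": 4.00322, "COHORT": 1.05637,
        "FINRES": 1.27644, "CAR": 1.11116}
implied_r2 = {name: 1 - 1 / vif for name, vif in vifs.items()}

for number, ci in enumerate(condition_indices, start=1):
    flag = "  <- above 30" if ci > 30 else ""
    print(f"dimension {number}: condition index {ci:6.2f}{flag}")
for name, r2 in implied_r2.items():
    print(f"{name}: VIF {vifs[name]:.2f}, implied R^2 {r2:.3f}")
```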
NONLINEARITY
Figure 5. SAS Output for Partial Regression Plot
The residual plot shows that the points are randomly scattered in a horizontal band with a
few outliers. From this, the assumption of linearity is taken to be satisfied.
NONNORMALITY
Figure 6. SAS Output for Statistics of Tests for Normality
Tests for Normality

Test                    Statistic            p Value
Shapiro-Wilk            W       0.99338      Pr < W       0.9716
Kolmogorov-Smirnov      D       0.039828     Pr > D      >0.1500
Cramer-von Mises        W-Sq    0.014527     Pr > W-Sq   >0.2500
Anderson-Darling        A-Sq    0.135491
Since the p-values for the Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling
tests are all greater than .05, there is insufficient evidence to conclude that the error terms
are not normally distributed. The Shapiro-Wilk test agrees: its p-value (Pr < W = 0.9716) is
also greater than .05 and its test statistic W = 0.99338 is very close to 1, so the hypothesis
that the residuals come from a normal distribution is not rejected.
HETEROSKEDASTICITY
Figure 7. SAS Output for Partial Residual Plots
Since none of the residual plots is diamond- or funnel-shaped, heteroskedasticity does not
appear to be present for any independent variable in the reduced model. To confirm this, the
researchers used White's test for homoskedasticity.
Figure 8. SAS Output for White's Test for homoskedasticity
Heteroscedasticity Test

Equation    Test            Statistic    DF    Pr > ChiSq
ENROL       White's Test        31.17    20        0.0530
Since the p-value (0.0530) is greater than .05, there is insufficient evidence to
conclude that the variances of the error terms are not constant.
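The reported p-value can be checked from the test statistic and its degrees of freedom, since White's statistic is referred to a chi-square distribution. The sketch below is an illustration rather than the procedure SAS uses internally: it relies on the closed-form upper tail of the chi-square distribution that holds when the degrees of freedom are even, and the helper name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of df.

    For df = 2m the upper tail has the closed form
    exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term, total = 1.0, 1.0          # i = 0 term of the sum
    for i in range(1, df // 2):     # add terms i = 1 .. m-1
        term *= half / i
        total += term
    return math.exp(-half) * total

# White's test statistic and DF as reported in the output.
p_value = chi2_sf_even_df(31.17, 20)
print(f"p-value: {p_value:.4f}")
print("reject constant variance at .05:", p_value < 0.05)
```

Because the p-value sits just above .05, the conclusion is sensitive to the chosen significance level, which is worth keeping in mind when reading the result.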
OUTLIERS/INFLUENTIAL OBSERVATIONS
Figure 10. SAS Outputs for Detection of Outliers
Dependent Variable: ENROL

Analysis of Variance

Source              DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                5        8856.83842     1771.36768     211.68    <.0001
Error               66         552.29656        8.36813
Corrected Total     71        9409.13498

Root MSE            2.89277    R-Square    0.9413
Dependent Mean     27.03254    Adj R-Sq    0.9369
Coeff Var          10.70107

Parameter Estimates

                                  Parameter      Standard
Variable     Label        DF       Estimate         Error    t Value    Pr > |t|
Intercept    Intercept     1        2.93143       3.73332       0.79      0.4351
TEACH        TEACH         1       24.15975       2.21674      10.90      <.0001
IMU          IMU           1        2.25222       0.45582       4.94      <.0001
COHORT       COHORT        1        0.17085       0.03125       5.47      <.0001
FINRES       FINRES        1        0.00193    0.00059288       3.26      0.0018
CAR          CAR           1        0.11010       0.03793       2.90      0.0050
Output Statistics (only outliers shown)

        Dependent    Predicted       Std Error                 Std Error     Student    Cook's
Obs      Variable        Value    Mean Predict    Residual      Residual    Residual         D
 16       22.9975      17.2267          0.6124      5.7708         2.827       2.041     0.033
 25       23.9297      29.5490          0.8367     -5.6193                               0.063
 32       22.2067      28.7400          1.0012     -6.5333                               0.131
 36       35.8303      30.1003          0.9708      5.7300         2.725       2.103     0.094
 62       79.7616      72.7027          1.5985      7.0589         2.411       2.928     0.628
Output Statistics

                    Hat Diag       Cov
Obs    RStudent            H     Ratio    DFFITS
 16      2.0928       0.0448    0.7759    0.4533
 25      2.0798       0.0837    0.8125    0.6284
 32      2.5013       0.1198    0.7174    0.9227
 36      2.1604       0.1126    0.8147    0.7696
 62      3.1148       0.3053    0.6847    2.0651
Output Statistics

       DFBETAS
Obs    Intercept     TEACH       IMU    COHORT    FINRES       CAR
 16       0.0899    0.0453    0.0072    0.2235    0.1421    0.2035
 25       0.1846    0.3317    0.2997    0.4061    0.1730    0.1984
 32       0.0267    0.7176    0.8223    0.2144    0.1703    0.0415
 36       0.0484    0.0316    0.0953    0.4754    0.1588    0.4897
 62       0.5000    0.2111    0.8460    0.1875    0.5257    0.1095
From this SAS output, the observations with absolute studentized residuals greater
than 2 are 16, 25, 32, 36 and 62. Observation 62 has the highest Cook's D, meaning it has
the greatest influence among the observations. For DFFITS, only observation 62 exceeds
the cutoff. No observation exceeds the cutoff for DFBETAS.
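The influence measures in the output are linked by standard formulas, which makes them easy to cross-check. The sketch below is illustrative: it recomputes Cook's D from the studentized residual and the hat diagonal, and DFFITS from RStudent and the hat diagonal, for the two observations whose complete rows survive in the output. The paper does not state the cutoffs it used; the DFFITS cutoff of 1, a common rule of thumb for small to medium data sets, is an assumption of this example.

```python
import math

N_PARAMS = 6          # intercept plus five predictors in the reduced model
DFFITS_CUTOFF = 1.0   # assumed rule of thumb, not stated in the paper

# obs: (student residual, RStudent, hat diagonal H), copied from the output.
rows = {16: (2.041, 2.0928, 0.0448), 62: (2.928, 3.1148, 0.3053)}

def cooks_d(student_residual, leverage, n_params=N_PARAMS):
    """Cook's D from the internally studentized residual and leverage."""
    return student_residual ** 2 / n_params * leverage / (1 - leverage)

def dffits(rstudent, leverage):
    """DFFITS from the externally studentized residual and leverage."""
    return rstudent * math.sqrt(leverage / (1 - leverage))

results = {}
for obs, (student, rstudent, h) in rows.items():
    results[obs] = (cooks_d(student, h), dffits(rstudent, h))
    flagged = abs(results[obs][1]) > DFFITS_CUTOFF
    print(f"obs {obs}: Cook's D {results[obs][0]:.3f}, "
          f"DFFITS {results[obs][1]:.4f}, above cutoff: {flagged}")
```

Under the assumed cutoff, observation 62 is flagged and observation 16 is not, which is consistent with the reading of the DFFITS column given above.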
The researchers deleted observation 62 since it has the highest Cook's D. After
deletion, the outlier with the highest Cook's D is:
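        Dependent    Predicted       Std Error                 Std Error     Student    Cook's
Obs      Variable        Value    Mean Predict    Residual      Residual    Residual         D
 32       22.2067      28.0501          0.9668     -5.8434         2.541      -2.299     0.128

[Fragments of further rows; observation numbers not recoverable. Std Error Mean Predict and
Residual pairs: 1.0075 and 5.4690; 0.8707 and 6.0869; 0.8319 and 6.2343 (Std Error Residual
2.277, Student Residual 2.738); 1.1739 and 4.7195. Cook's D values: 0.146, 0.144, 0.167, 0.341.]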
After deleting observation 46, no other observation was deleted since no other value of
Cook's D is significant.
Figure 11. SAS Output for Tests of Normality after outliers have been removed.
Tests for Normality

Test                    Statistic            p Value
Shapiro-Wilk            W       0.984211     Pr < W       0.5656
Kolmogorov-Smirnov      D       0.092826     Pr > D      >0.1500
Cramer-von Mises        W-Sq    0.062069     Pr > W-Sq   >0.2500
Anderson-Darling        A-Sq    0.385189     Pr > A-Sq   >0.2500

Heteroscedasticity Test

Equation    Test            Statistic    DF    Pr > ChiSq    Variables
ENROL       White's Test        17.66    20        0.6100    Cross of all vars
With this, the researchers concluded that this new data set still generates error terms that
are normally distributed and have constant variances. Also, the residuals are spread randomly in
a horizontal band.
AUTOCORRELATION
The last problem to be diagnosed is autocorrelation. This is of low priority since the
data are cross-sectional (2009 only). However, the check still helps validate the assumption
that the error terms are uncorrelated.
Figure 12. SAS Output for Test of Autocorrelation
Dependent Variable: ENROL

Durbin-Watson D                2.023
Number of Observations            66
1st Order Autocorrelation      0.033

Lag    Covariance    Correlation    Std Error
  0      4.396547        1.00000            0
  1      0.146008         .03321     0.123091
  2      0.436768         .09934     0.123227
  3      0.225990        0.05140     0.124435
  4      0.304449         .06925     0.124756
  5      0.879969        0.20015     0.125337
  6      0.685755         .15598     0.130090
  7      0.011037        0.00251     0.132893
  8      0.064890        0.01476     0.132894
  9      0.535145         .12172     0.132919
 10      0.318076        0.07235     0.134597
 11      0.133435        0.03035     0.135185
 12      0.084637        0.01925     0.135288
 13      0.640056         .14558     0.135329
 14      0.112981         .02570     0.137682
 15      0.139338         .03169     0.137754
 16      0.041377        0.00941     0.137865

Lag    Correlation
  1        0.13727
  2        0.19128
  3        0.13227
  4        0.10259
  5        0.19103
  6        0.17183
  7        0.01574
  8        0.02744
  9        0.11791
 10        0.08636
 11        0.00655
 12        0.03968
 13        0.16357
Based on the output, autocorrelation is not significant, even at the first order. R² did
not change, and the corrected parameter estimates differ very little from the previous
estimates. All the variables are still significant.
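For reference, the Durbin-Watson statistic reported above is the sum of squared successive differences of the residuals divided by their sum of squares. The sketch below shows the computation on made-up residuals, because the study's residuals are not listed in the paper; the function name and the example values are assumptions for illustration only.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over the sum of squares."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals purely for illustration (not the study's data).
example_residuals = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]
d = durbin_watson(example_residuals)
print(f"d = {d:.3f}")
```

The statistic lies between 0 and 4 and is roughly 2(1 - r), where r is the lag-1 autocorrelation of the residuals, so a value near 2, such as the reported 2.023, points to little first-order autocorrelation.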
VI. CONCLUSION
The estimated regression function will be
$\widehat{ENROL} = 6.67878 + 18.05543\,TEACH + \cdots$
students who finished elementary and secondary level, meaning those students will not enrol in
the next school year. Higher total financial resources can provide a larger budget for education
and better material resources, which can improve the percentage of enrolment. Ownership of a
private motor vehicle indicates that a household has the financial capability to send its
children to school. Owning a private motor vehicle can also resolve proximity barriers caused
by the absence of nearby public schools.
Also, more support from the government is essential in improving the percentage of
enrolment since the total financial resource is a significant factor. It is also recommended for the
government to increase the number of health care services offered to children and give more
support to the teachers by providing benefits such as increasing their salary.
VII. REFERENCES
King, E.M. and Lillard L. Determinants of schooling attainment and enrolment rates in the Philippines.
April 1983. The Rand Publication Series. PDF File.
Tullao, T. and Rivera, J.P. Demographic, and other factors affecting school participation among children
in urban and rural households: the case of Pasay and Eastern Samar. Vol. II, No. 6. PDF File.
Caete, L. Reviewing the Effects of Population Growth on Basic Education Development. PDF File
National Statistics Coordination Board. 2012 First Semester Official Provincial Poverty Statistics of the
Philippines. PDF File.
Capones, M. Report of the Philippine government on Millennium Development Goals. August 2008.
PDF File
UNICEF. Education statistics: Philippines. May 2008 Division of Policy and Practice, Statistics and
Monitoring Section. PDF File
National Economic and Development Authority. Philippine Midterm Progress Report on the Millennium
Development Goals. PDF File
Anonymous. (2010, April 17). Improving Philippine education. http://opinion.inquirer.net. Retrieved
March 30, 2014, from http://opinion.inquirer.net/inquireropinion/talkofthetown/view/20100417264867/ImprovingPhilippineeducation
Yap, Joseph (2012, August 6). OPINION: Improving The Quality Of Education In The Philippines.
http://www.asianscientist.com. Retrieved March 30, 2014, from
http://www.asianscientist.com/academia/philippineseducationasiapacificjosefyappids2012/
VIII. APPENDICES