
Predictive Analysis of the Enrollment of Schools Using Regression Algorithms

I. Introduction
 Support Vector Machine (SVM)

During the COVID-19 outbreak, Learning Modality (LM) was implemented in both public
and private schools in the Philippines. The Department of Education (DepEd) established
Learning Delivery Modalities (LDM) for its clients. As part of this, parents and children were
also given access to synchronous and asynchronous learning modes[1]. Because of the LDM
deployment, learners and teachers have been able to absorb knowledge in a modular format.
Cloud Computing (CC), a concept of information management[2], is used to collect the majority
of learning resources. The learning materials are still being evaluated to determine whether the
modules successfully measure students' understanding of a subject. Furthermore, learners'
progress must be assessed at all grade levels and across domains.

In the presence of COVID-19, the LDM provided a way to continue the learners'
education. Elementary, junior high, and senior high schools delivered weekly modules to the
learners. Other schools prepared for both synchronous and asynchronous learning models,
depending on the system selected at the time of enrolment. The program's execution faces two
major obstacles: some participants have no computer or smartphone, and some experience
internet connectivity issues. The printed modules raise further problems, for example an
excessive or insufficient quantity of printed learning modules, modules damaged by wear and
tear, and so on.

A model will be used to analyze the process of distributing learning resources and the
resources required by the various schools in DepEd Region 4A[3], which consists of 21
Divisions. The study covers the 2016-2017 through 2019-2020 academic years and focuses on
projections for the elementary, junior, and senior high school education departments, with
coverage ranging from Kindergarten to Grade 12. A Support Vector Machine prediction is used
to evaluate each institution's performance and success rate with cloud-based learning
materials[4]. In addition, the study examines the acceptability of cloud-based learning by
analyzing the trends in data obtained from various sources.

Various prediction models, such as Naive Bayes, Gradient Boosted Tree, Random Forest,
and others[5], have been used in various studies; however, these models may not be useful in
predicting primary school enrollment in Region 4A, where estimating the enrollment pattern is
more difficult due to shifting trends and the pandemic. Specifying the relevant factors and
selecting the best-fit forecasting algorithm should make the research more accurate.
Support Vector Machine (SVM)[7] is one of the most widely used predictive algorithms; it has
been used in a number of applications, including medical, statistical, and environmental
forecasting, as well as enrollment analysis[8][9]. SVM has also been found to work well with
multidimensional datasets[10]. In this instance, using this strategy would allow for a larger
scope and more accuracy[11]. Support Vector Machine is a prominent supervised learning
technique that may be used to solve both classification and regression tasks.

The support vector machine (SVM) [12] is a widely adopted classifier that has been found
highly effective for a variety of pattern recognition problems. Based on a labeled training set, it
determines a hyperplane that linearly separates two classes in a higher-dimensional kernel space.
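
As a brief sketch of the idea in [12], the separating hyperplane and the resulting decision rule can be written as follows. The symbols are assumptions introduced here, not the study's notation: $\mathbf{w}$ is the weight vector, $b$ is the bias, and $\phi$ is the feature map induced by the kernel.

```latex
\[
\mathbf{w}^{\top}\phi(\mathbf{x}) + b = 0,
\qquad
\hat{y} = \operatorname{sign}\!\left(\mathbf{w}^{\top}\phi(\mathbf{x}) + b\right)
\]
```

In the regression setting, the value $\mathbf{w}^{\top}\phi(\mathbf{x}) + b$ itself serves as the prediction instead of its sign.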

This study has five sections. Data preparation is discussed in Section 2. Section 3
gives a brief overview of the approach used to build and evaluate the models.
Section 4 presents and discusses the experimental results.
Section 5 concludes with a recommendation.

II. Data Preparation


The data were gathered from different parts of the Philippines: Antipolo, Binan, Cavite,
Dasmarinas, Quezon City, Rizal, and Tanauan. The gathered data will be used to predict
enrollment for the next academic year using RapidMiner.

III. Method

The assessment approach for the Predictive Analysis of the Enrollment of Schools Using
Regression Algorithms will be performed in phases. The first stage is data collection, in which
the dataset is collected and combined with various kinds of data. Next is data cleaning:
removing unnecessary data and preparing the important data needed for the predictive
model. Next is splitting: the data will be split into two parts, a training dataset and a
testing dataset. The next step is to train the model on the training dataset using the
regression algorithms. The final stage is building the model and assessing its performance
on the dataset. Different techniques will be used to assess whether the model is a good fit
for the dataset.
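
The splitting stage described above can be sketched in a few lines of Python. This is a minimal illustration and not the study's implementation: the function name `split_train_test`, the fixed seed, the 70 percent default, and the sample records are all assumptions made for the example.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records of the form (school_id, enrollment); values are made up.
records = [(i, 100 + 5 * i) for i in range(10)]
train, test = split_train_test(records)
print(len(train), len(test))
```

Every record ends up in exactly one of the two parts, so the testing dataset contains only rows the model has not seen during training.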

Methodology

Data cleaning increases the dataset's quality by using correlation to find the relevant
attributes, eliminating duplicate entries, and organizing the data. The end result is a dataset
ready for training and testing.
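
The two cleaning steps named above, removing duplicates and ranking attributes by correlation, might look like the following sketch. The column names, the sample rows, and the 0.5 relevance threshold are hypothetical and chosen only for illustration; the study does not state which attributes or cutoff were used.

```python
import math


def drop_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rows: (previous_year_enrollment, school_code, total_enrollment)
rows = [(100, 7, 110), (200, 3, 205), (200, 3, 205), (300, 9, 320), (400, 1, 410)]
clean = drop_duplicates(rows)
target = [r[2] for r in clean]
names = ["previous_year_enrollment", "school_code"]
scores = {name: pearson([r[i] for r in clean], target) for i, name in enumerate(names)}
# Keep attributes whose absolute correlation with the target reaches the assumed threshold.
relevant = [name for name, r in scores.items() if abs(r) >= 0.5]
print(relevant)
```

In this toy data the previous year's enrollment tracks the target closely and is kept, while the arbitrary school code is dropped.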
IV. Result and Discussion

Linear regression requires that all values be converted to numerical counterparts
before they are evaluated to predict the total enrollees for the academic year.
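
One simple way to convert a categorical attribute to numbers is to assign an integer code to each distinct label. The sketch below assumes a gender column with two labels; the function name `encode_column` and the sample values are hypothetical.

```python
def encode_column(values):
    """Map each distinct label to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


# Hypothetical categorical column.
genders = ["male", "female", "female", "male"]
encoded, mapping = encode_column(genders)
print(encoded)
print(mapping)
```

Integer codes imply an ordering between labels, so for attributes with more than two categories a one-hot encoding is usually a safer choice for linear models.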

The data were randomly split into training and testing datasets of 70 percent and 30
percent, respectively. The table below shows the outcome of the forecast with
the minimum, maximum, and average values. The average values indicate the dataset's most
accurate forecast.

             MINIMUM   MAXIMUM   AVERAGE
TOTAL              0      3100      3100
PREDICTION         4     82.88     43.44

Table 1. THE STATISTIC AND AVERAGE OF THE VALUE AND PREDICTED VALUES.

The ranges of the total and predicted values are related, but there is a
substantial difference between their minimum and maximum values. The actual
figures yield a forecast that differs from the previous three academic years; as a result, the
prediction follows the trend-based assumption for incoming kindergarten enrollment.
Gender is one of the most important factors to consider when making a prediction. The
statistical value of each parameter determines the viability of the prediction; consequently, the
dataset's strength will reveal the link between the predictive values shown above.

MODEL                      ABSOLUTE ERROR   RELATIVE ERROR   TRAINING TIME   SCORING TIME
DEEP LEARNING                      34.062            57.4%          429 ms          16 ms
DECISION TREE                      34.369            56.8%            3 ms           3 ms
GENERALIZED LINEAR MODEL           35.119            57.9%            8 ms           3 ms
RANDOM FOREST                      36.213              59%           20 ms          20 ms
GRADIENT BOOSTED TREES             33.994            57.3%           69 ms          69 ms
SUPPORT VECTOR MACHINE             31.604            54.6%            1 ms           1 ms

Table 2. COMPARATIVE RESULT
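
For readers unfamiliar with the absolute and relative error columns in Table 2, the sketch below shows one common way to compute them. The definitions are an assumption: the tool used in the study may define them differently, for example in how it treats actual values of zero. The sample numbers are made up.

```python
def absolute_error(actual, predicted):
    """Mean of |predicted - actual| over all examples."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def relative_error(actual, predicted):
    """Mean of |predicted - actual| / |actual|, skipping examples where actual is 0."""
    ratios = [abs(p - a) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    return sum(ratios) / len(ratios)


# Hypothetical actual and predicted enrollment counts.
actual = [50, 100, 200]
predicted = [40, 110, 250]
print(absolute_error(actual, predicted))
print(f"{relative_error(actual, predicted):.1%}")
```

Lower values are better for both measures, which is the basis for comparing the six models in the table.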

V. Conclusion
After running the gathered data in the data mining tool to predict the total number of
students for the next school year, we found that SVM (Support Vector Machine) is the best model
for predicting data such as this. SVM had the lowest absolute error (31.604) and relative error
(54.6%) and outperformed the other 5 models.

References:

[1] M. M. Shahabadi and M. Uplane, “Synchronous and Asynchronous e-learning Styles and
Academic Performance of e-learners,” Procedia - Social and Behavioral Sciences, vol. 176,
pp. 129–138, 2015, doi: 10.1016/j.sbspro.2015.01.453.

[2] M. Chamilco, A. Pacheco, C. Peñaranda, E. Felix, and M. Ruiz, “Materials and methods
on digital enrollment system for educational institutions,” Materials Today: Proceedings, no.
xxxx, pp. 2–6, 2021, doi: 10.1016/j.matpr.2021.04.213.

[3] E. Jimenez and Y. Sawada, “Public for private: The relationship between public and
private school enrollment in the Philippines,” Economics of Education Review, vol. 20, no.
4, pp. 389–399, 2001, doi: 10.1016/S0272-7757(00)00061-3.

[4] P. Singh and Y. P. Huang, “A new hybrid time series forecasting model based on the
neutrosophic set and quantum optimization algorithm,” Computers in Industry, vol. 111, pp.
121–139, 2019, doi: 10.1016/j.compind.2019.06.004.

[5] M. D. Hernandez, A. C. Fajardo, and R. P. Medina, “A Hybrid Convolutional Neural
Network-Gradient Boosted Classifier for Vehicle Classification,” IJRTE Journal, no. 2, pp.
213–216, 2019, doi: 10.35940/ijrte.B1016.078219.

[6] R. Bozick, D. M. Anderson, and L. Daugherty, “Patterns and predictors of postsecondary
re-enrollment in the acquisition of stackable credentials,” Social Science Research, vol. 98,
no. April, p. 102573, 2021, doi: 10.1016/j.ssresearch.2021.102573.

[7] A. Slim, D. Hush, T. Ojah, and T. Babbitt, “Predicting Student Enrollment
Based on Student and College Characteristics.”

[8] V. Vamitha, “A different approach on fuzzy time series forecasting model,” Materials
Today: Proceedings, vol. 37, no. Part 2, pp. 125–128, 2020, doi:
10.1016/j.matpr.2020.04.579.

[9] M. dela Cruz, “of State Universities and Colleges in Central Luzon Philippines :,” 2019.

[10] A. Bender et al., “Dataset for multidimensional assessment to incentivise
decentralised energy investments in Sub-Saharan Africa,” Data in Brief, vol. 37, p. 107265,
2021, doi: 10.1016/j.dib.2021.107265.

[11] M. D. Hernandez, A. C. Fajardo, R. P. Medina, J. T. Hernandez, and R. M.
Dellosa, “Implementation of data augmentation in convolutional neural network and gradient
boosted classifier for vehicle classification,” International Journal of Scientific and
Technology Research, vol. 8, no. 12, pp. 185–189, 2019, [Online]. Available:
http://www.ijstr.org/final-print/dec2019/Implementation-Of-Data-Augmentation-In-Convolutional-Neural-Network-And-Gradient-Boosted-Classifier-For-Vehicle-Classification.pdf

[12] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3,
pp. 273–297, 1995.
