# A Proposed Predictive Model for Student Population of Taguig City University

by: Meljun P. Cortes

## Abstract
In this work, the researcher derived a linear regression analysis of a dataset provided by the university registrar
and monitoring services of Taguig City University. The dataset consists of the student population and the number of
classrooms. Allocating and assigning classrooms is one of the major problems a university faces as its student
population grows. In this paper, the researcher used predictive analytics to analyze this problem. Predictive
analytics is the branch of advanced analytics used to make predictions about unknown future events. It draws on
techniques from data mining, statistics, modeling, machine learning, and artificial intelligence to analyze current
data and make predictions about the future. Patterns found in historical and transactional data can be used to
identify future risks and opportunities. Regression analysis is a predictive modeling technique that estimates the
relationship between a dependent (target) variable and an independent (predictor) variable. The researcher used
linear regression, which applies when there is a linear relationship between the independent and dependent
variables. The main goal of this study is to devise a model that predicts the number of classrooms from the student
population using the selected algorithm. The researcher adopted the Malthusian Theory of Population Growth to relate
the increase in student population to the resources needed, namely the number of classrooms of Taguig City
University.
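
The regression step described so far can be illustrated with a minimal sketch. It is not the researcher's script: the function names `fit_simple_linear_regression` and `predict` are hypothetical, and the student and classroom figures are made-up placeholders, not values from the registrar dataset. The sketch fits a line of the form classrooms = b0 + b1 × students by ordinary least squares, which is the same kind of model that a one-predictor linear regression produces.

```python
# Minimal sketch of simple linear regression by ordinary least squares.
# ASSUMPTION: function names and all numbers are illustrative placeholders,
# not the researcher's code or the Taguig City University dataset.

def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x_new):
    """Estimate the dependent variable for a new value of the predictor."""
    return intercept + slope * x_new

# Hypothetical yearly student population (predictor) and classroom counts (target).
students = [1000, 2000, 3000, 4000, 5000]
classrooms = [25, 45, 65, 85, 105]

b0, b1 = fit_simple_linear_regression(students, classrooms)
print("intercept:", b0, "slope:", b1)
print("estimated classrooms for 6000 students:", predict(b0, b1, 6000))
```

With real enrollment records, the two lists would be replaced by the observed yearly values, and the fitted slope would read as the estimated number of additional classrooms per additional student.
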
A quantitative research method was used, combining data analysis with a documentary analysis approach. The
researcher adopted the simple linear regression model, and the dataset was entered into the regression equation. A
regression model is the equation that describes how a dependent variable is related to an independent variable and an
error term. The researcher adopted the confusion matrix and its statistics to test the accuracy of the proposed
predictive model. To validate the model, the researcher computed the sensitivity, specificity, prevalence, and
P-value. RStudio from the Anaconda distribution was used as the tool to determine all necessary statistical
measures. The researcher also computed the root mean square error to determine the prediction error. The proposed
predictive model was written as an R or Python script; the code builds the linear regression model using the lm
function. Accuracy, or the overall success rate, is a metric defining the rate at which a model classifies records
correctly. A good model should have a high accuracy score. The proposed predictive model has an accuracy of 71.13%,
which the researcher regards as a high accuracy score. The P-value is 0.016053, which is lower than 0.05 (the 5%
significance level); the researcher takes this to mean that the proposed predictive model performs well in
estimating the number of classrooms. As future work, the researcher plans to develop a web-based application that
integrates the proposed predictive model using the PHP and Python programming languages.
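
The validation measures named in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the researcher's procedure: the abstract does not say how the regression output was grouped into classes for the confusion matrix, so the sketch simply takes the four cell counts as given. The function names `confusion_metrics` and `rmse` and all numbers are hypothetical and are not the study's results.

```python
import math

# Minimal sketch of the validation measures mentioned in the abstract.
# ASSUMPTION: names and counts are illustrative; they are not the study's results.

def confusion_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, and prevalence from the
    four cells of a binary confusion matrix."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix must contain at least one record")
    positives = tp + fn  # records that are actually positive
    negatives = tn + fp  # records that are actually negative
    return {
        # Share of all records classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual positives that were detected.
        "sensitivity": tp / positives if positives else float("nan"),
        # Share of actual negatives that were correctly rejected.
        "specificity": tn / negatives if negatives else float("nan"),
        # Share of records that are actually positive.
        "prevalence": positives / total,
    }

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical confusion-matrix counts and classroom estimates.
print(confusion_metrics(tp=40, fp=10, tn=30, fn=20))
print(rmse(actual=[25, 45, 65], predicted=[27, 44, 66]))
```

Accuracy here is the overall success rate described in the abstract, and a lower root mean square error indicates predictions that lie closer to the observed classroom counts.
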