
Republic of the Philippines

Western Mindanao State University

College of Computing Studies

DEPARTMENT OF COMPUTER SCIENCE

Zamboanga City

Retention Prediction of Computer Students in Institute of Computer Studies using Logistic Regression

A Thesis presented to the faculty of


Department of Computer Science
College of Computing Studies

In partial fulfillment of the requirements for the degree of


Bachelor of Science in Computer Science

Jelly Mae Q. Colao


Vhina S. Peniones
John Paul A. Biescas

Lucy Felix Sadiwa

Thesis Teacher

April 2022

Approval Sheet

The Thesis attached hereto, entitled “Retention Prediction of Computer Students in
Institute of Computer Studies using Logistic Regression”, prepared and submitted by Jelly
Mae Q. Colao, Vhina S. Peniones and John Paul A. Biescas, in partial fulfillment of the
requirements for the degree of Bachelor of Science in Computer Science, is hereby
recommended for Oral Examination.

Lucy Felix Sadiwa


Adviser

APPROVED by the Oral Examination Committee on _______________ with a rating of
PASSED.

<Panel 1>
Chairperson

<Panel 2> <Panel 3>


Member Member

ACCEPTED in partial fulfillment of the requirements for the degree of Bachelor of Science in
Computer Science

ODON A. MARAVILLAS JR, MSCS


Head, Department of Computer Science

RODERICK P. GO, Ph.D.


Dean, College of Computing Studies
ACKNOWLEDGEMENT

First and foremost, the Researchers are very grateful to our God Almighty for
without His graces and blessings this study would not have been possible.

We are blessed and thankful to our Adviser Ma’am Lucy Sadiwa and to all CCS
faculty members for their inspiring guidance and friendly advice during the project work.

We would also like to take this opportunity to thank our Family, Friends and
Classmates for the endless support that they gave and showed us throughout this
project and its implementation. Without them this would not have been possible.

The endless sleepless nights were worth it. Thank you to everyone out there!
Table of Contents

Chapter 1
I. Background of the study
II. Statement of the problem
III. Objectives
IV. Significance of the study
V. Scope and limitations
VI. Conceptual Framework
VII. Operational Definition
Chapter 2
2.1 Related Literature
2.2 Related Studies
2.2.1 Foreign Studies
2.2.2 Local Studies
Chapter 3
3.1 Research Locale
3.2 Population and Sampling
3.3 Research Instrument
3.4 Validity of the Instrument
3.5 Data Gathering Procedure
3.6 Research Design
3.7 Exploratory Data Analysis
3.8 Data Splitting
3.8.1 Train Data
3.9 Build and train the Model
3.9.1 Algorithm
3.9.2 Hyperparameter Tuning
3.9.3 Cross-Validation
4.0 Evaluate Performance
5.0 Deployment
Chapter 4
4.1
4.2
References
Appendix
Appendix A

List of Figures

Figure 1 Conceptual Framework
Figure 2 Flowchart
Figure 3 ERD
Figure 4 Activity Diagram
Figure 5 Example of Histograms (KHUSHIS, Exploratory Analysis Using Univariate, Bivariate, and
Multivariate Analysis Techniques, 2021)
Figure 6 Example of Bar Chart (KHUSHIS, Exploratory Analysis Using Univariate, Bivariate, and
Multivariate Analysis Techniques, 2021)
Figure 7 Example of Correlation Map Scatterplot (Glen)
Chapter 1
Introduction

I. Background of the study

Class retention is a problem for most schools in the Philippines. Students who enroll in
college generally intend to finish, but in the middle or at the end of a semester or year
level many drop out, shift to other courses, transfer to another school, or discontinue
their college degree. At APUS, an online platform where students can enroll and study,
the top two reasons students gave for dropping out of a course were lack of time to
complete the program and the difficulty of the program. Others perceived the curriculum
as too easy, and some students decided to transfer to a more recognized college program
(Boston, Ice, & Gibson, 2015).

Students' experiences are tangible proof of the legitimacy and purpose of an
institution's goal, which is why private higher education institutions (HEIs) understand the
importance of student retention (Scholder and Maguire, 2009). As a result, gathering student
opinion on specific aspects of the institution, such as value, resources, academics, faculty,
advising and support services, social life, extracurricular activities, educational goals, and
future preparedness, is vital. Higher education institutions increasingly stress persistence
rates for a variety of reasons, including financial needs, reputation enhancement, and
perceived admission advantages. However, the national rate of student persistence and
graduation has remained relatively constant over the last decade (NCES, 2005). Some
universities are beginning to delegate retention responsibilities to the chief enrollment
officer, while others delegate them to student affairs personnel (Scholder and Maguire,
2009). (Zerna, Cruz, & Nuqui, 2014)

Western Mindanao State University is one of the Philippines' competitive academic
institutions. However, like other academic institutions, it has difficulty with student
retention in classes. The researchers are therefore conducting this study to develop a tool
that gathers data and analyzes how students manage their studies while remaining
committed to their chosen course. Such a tool is expected to help improve the student
retention rate.

II. Statement of the problem

Student retention in college has been a concern for universities because of the
increasing number of drop-outs. Nowadays, instability, insecurity, and self-doubt have
become common in the lives of college students as their education and career plans shift
because of the coronavirus pandemic. Moreover, other circumstances that contribute to the
increasing number of college drop-outs have not yet been highlighted, possibly because
students lack consultation from their universities. The universities' lack of concern for
these circumstances is a further factor that contributes to the increasing number of
college drop-outs. For these reasons, the number of college graduates is unlikely to
increase during the pandemic.

III. Objectives

The primary objective of this study is to predict the retention of the students of the
Institute of Computer Studies.

Specifically, the study aims to:

• Develop an effective model that identifies and predicts students’ retention in school.
• Build a model that achieves 75% accuracy.
• Help identify:
o The variables or factors that most likely affect students’ retention in class.
o The background and status of the students that enable them to remain in class.
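
The objectives above (a retention model checked against a 75% accuracy target) can be illustrated with a minimal sketch. This is not the researchers' implementation: the data are synthetic, and the feature names (`grades`, `workload`), the label rule, the learning rate, and the train/test split are all assumptions made only for illustration. The sketch fits a logistic regression by plain gradient descent using only the Python standard library.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label a student as retained (1) when the predicted probability is >= 0.5."""
    return [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5) for xi in X]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_synthetic_students(n, seed=0):
    """Hypothetical data: two standardized features and a noisy retained/not label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        grades, workload = rng.uniform(-1, 1), rng.uniform(-1, 1)
        p_retained = sigmoid(4.0 * grades - 3.0 * workload)  # made-up rule
        X.append([grades, workload])
        y.append(1 if rng.random() < p_retained else 0)
    return X, y


X, y = make_synthetic_students(400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]
w, b = train(X_train, y_train)
test_accuracy = accuracy(y_test, predict(X_test, w, b))
print(f"Test accuracy: {test_accuracy:.2f} (target from the objectives: 0.75)")
```

Accuracy is measured on held-out students rather than on the training rows, so the comparison with the 75% target reflects how the model might behave on students it has not seen. With real enrollment data the achievable accuracy may differ considerably from this toy example.
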
IV. Significance of the study

The results of this research will be significant to:

CHED. To understand and gather information about the different techniques and
ways students cope with their studies, and to give educational institutions thorough
knowledge of what needs to be improved or implemented to ensure students’ class
retention.

Western Mindanao State University. To help the university determine how to enable students
to cope effectively with learning, and to analyze constantly changing circumstances
and provide solutions that preserve students’ class retention.

Institute of Computer Studies Department. To help the institute predict the retention of
its students, so that the data acquired may be used to increase their retention rate.

Instructors and Professors of Institute of Computer Studies. To help the instructors
understand their students better and improve their ways of teaching, so that
students can cope with constantly changing circumstances.

Students at Institute of Computer Studies. Students can express their opinions and provide
the details needed to predict how they were able to cope with their
studies. In this way, they can maintain their class retention.

Researchers. To help and guide future research associated with the study of
retention prediction for the Institute of Computer Studies at Western Mindanao State
University.

V. Scope and limitations


The study is focused on the students at the Institute of Computer Studies at Western
Mindanao State University.

The raw data used in this study will be the data that students provided during
enrollment, stored in the data warehouse of the Institute of Computer Studies
Department at Western Mindanao State University.

The study will be conducted at the Western Mindanao State University (WMSU),
Zamboanga City, Philippines.

The study will be limited to a specific department in Western Mindanao State University
which is the Institute of Computer Studies.

The students in other levels such as elementary and high school are not included in the
study.

The researchers will use Logistic Regression to predict retention.
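
For reference, the standard form of a logistic regression model is sketched below. This is the general textbook formulation, not the researchers' fitted model: the symbols $x_1, \dots, x_k$ (student features), $\beta_0, \dots, \beta_k$ (coefficients), and the 0.5 decision threshold are assumptions introduced here for illustration.

```latex
\[
P(y = 1 \mid x) = \frac{1}{1 + e^{-z}},
\qquad
z = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\]
\[
\hat{y} =
\begin{cases}
1 \ (\text{retained}) & \text{if } P(y = 1 \mid x) \ge 0.5 \\
0 \ (\text{not retained}) & \text{otherwise}
\end{cases}
\]
```

The model outputs a probability between 0 and 1, which is why it suits a two-outcome question such as whether a student is retained.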

VI. Conceptual Framework

Figure 1 Conceptual Framework


VII. Operational Definition

Onground Classes – the same as face-to-face classes, in which students must
attend their classes in person.

Demographic Questionnaire – a questionnaire on the basic information of the
student that helps the survey accomplish its goals.

WMSU – Western Mindanao State University, a university located in Zamboanga
City, Philippines.

Logistic Regression – a Machine Learning algorithm used to classify an outcome into one of
two categories.

S.Y. – School Year.

Retention – the state of a student who does not quit but pursues, continues, and finishes
the course.

Prediction – an estimate of the result of an action, made from the basis provided.
Chapter 2

2.1 Related Literature

Staying the Course: A Study in Online Student Satisfaction and Retention

Despite the massive number of online learners in higher education, the retention of
students in their classes is a great concern for most universities. The Priorities
Survey for Online Learners (PSOL) was designed to determine the factors that help
students stay in their classes. The PSOL was sent to students enrolled in an online
class at Midwestern State University. The primary participants were undergraduate
students, a few graduate students, and students who did not finish the course. Data
collection ran from December 2005 to January 2006, and each student was given four
weeks to submit the online survey through the Noel Levitz company. The Statistical
Package for the Social Sciences (SPSS) was used to analyze the data.

The survey was sent to the students twice during the fall of 2005; the second
invitation went to students who had not filled out the first one. The institutional
survey questions covered the following: faculty (teachers, staff, etc.)
responsiveness to students’ needs, the quality of online instruction provided by the
faculty, timely faculty feedback to students, the frequency of student-professor
interaction, the availability of financial aid, and the importance of student-student
interaction. At the beginning of the survey, students were asked whether they were
graduate or undergraduate students. According to the survey, 42% of the students
did not complete the course; they still filled out the survey form, which
allowed them to share their opinions on the factors they experienced in an
online class.

In the survey results, the item that students rated highest was
faculty responsiveness to student needs. This means that instructors’ sensitivity to
students’ needs greatly helps increase student retention in an online class.

Second is the quality of the instruction given by professors online. Specific
and understandable instructions help students produce better work that
is focused on the topic, and they serve as guides that make activities more productive
even when conducted online.

This is followed by instructors’ timely feedback on students’ performance.
Feedback helps students find the missing or weak parts of their
output more easily. This helps them learn, understand the activity better,
and revise their output well, both now and in future work.

Next is the institution's response when information is requested. Students in
an online class have a hard time going to the physical faculty room of their
department to ask for specific information or clarification. The department therefore
needs to promote a healthy virtual environment in which students can raise any queries
regarding the field.

Next is student-instructor interaction. Even in virtual learning, students and
instructors need continuous interaction. Through it, the student feels that
he is not alone and that a guide is always ready to help him.

This is followed by the financial capacity of the student. Even though online classes are
cheaper than face-to-face classes, students still need financial support
to stay in their online classes.

Last is student-student interaction. Though this factor is rated lowest in an online
class, it can still help students stay in online classes. It helps students
feel that they are not alone: others are taking the same classes, and a student can
communicate with them to build stronger bonds and ask
peers for help (discussions, advice, etc.).

Non-completers were also included in the survey. The most frequent reasons that
students gave for not completing were time commitments (60.16%), personal problems
(15.15%), instructor-related problems (12.12%), and other reasons (12.12%).

Overall, completers reported that they were satisfied with their
experience in their online classes. However, the combined results of the completers' and
non-completers' surveys show that improvements must be made for future online learners
(Herbert, 2006).

College Student Transition to Synchronous Virtual Classes during the COVID-19 Pandemic in the
Northeastern United States

In spring 2020 (February to mid-March), COVID-19 began to spread in the United States,
resulting in the closure of school campuses throughout the country. Many transitions
had to be considered to help students continue their learning,
and so virtual learning was adopted. Before starting a course, instructors
first need to build students' confidence that they can and will succeed in an
online environment, encourage them to develop self-learning skills, assist them, and
provide institutional resources that are useful in the course. Instructors also need
to plan carefully how to deliver instruction throughout the
course, considering that most students are new to online education. A
Learning Management System (LMS) can help instructors improve the way
they provide lessons and discussion materials and give students
resources, assigned projects, and assignments.

During the COVID-19 pandemic, students faced extreme experiences that increased stress and
other negative emotions, so every individual, especially
students, needs great understanding. This study focuses on students' perceptions of the
change from face-to-face to virtual classes and on their feelings or emotional
reactions during the online classes. Instructors are encouraged to use an LMS for
delivering lectures, posting assignments, and sharing other educational materials.

The researchers posed four research questions to the
respondents: (1) How would students evaluate the use of the LMS by
their instructors? (2) How satisfied are the students with the course delivery, content,
and structure provided by the instructors? (3) What did the students feel after the
transition from face-to-face classes to completely virtual classes? (4) What are students'
opinions and comments on what should have been done, what to stop, and what to continue
during online classes?

The study was conducted with 148 participants, all
undergraduate students from a liberal arts college in Maine, USA. Their ages ranged
from 18 to 58 years, and the sample was drawn through voluntary sampling.

The survey included a demographic questionnaire that asked about age,
gender, GPA, academic major, and so on, to find out whether such information is related to
students’ retention in an online class. The students were also asked how many
classes they were taking during the semester and to rate how their professors used
the LMS in each subject.

The questions focused on whether the LMS truly helped them complete their
courses (recovering from the transition from face-to-face to purely virtual classes, and
the availability of documents that assisted them in completing tasks). For course
delivery, content, and structure, the students were asked how many subjects they were
taking during the semester and rated their professors’ ability to deliver them effectively.
The study used a Likert-type scale (1 = strongly disagree up to 5 =
strongly agree) for the level of agreement. To rate their emotions,
the students were presented with 12 emotions and could check
every box that represented their feelings during
the conversion of the classes. The students were also given four open-ended questions in
which they could give personal opinions: 1. During the experience of
online classes in spring 2020, what should the professors keep doing? 2. Start doing? 3.
Stop doing? 4. What can you suggest that the professors need to improve in terms of the
delivery of the lessons? To administer the survey, email invitations were sent
to the chosen participants at the end of the semester, and the survey stayed open for
one week.

The gathered data were analyzed with descriptive statistics and qualitative content
analysis using IBM SPSS Statistics for Windows, Version 25.

For the first variable, the use of the LMS, the results show that
instructors used the virtual system (LMS) effectively to provide lessons
and other necessary course documents, and that students could access it effectively
for learning. For course delivery, content, and structure,
the results show that not all professors were consistent enough in facilitating students
in a virtual class.

The next variable is the level of agreement, reported here as mean ratings on the
5-point scale. The statements with the highest agreement were that students preferred
grades to be available on the LMS and that course content and assignments be communicated
and announced promptly (4.86). These were followed by the course syllabus and
schedules being available on the LMS (4.85), the course rubrics and assignment details
being available on the LMS (4.79), email being used to communicate with
students regarding the course (4.62), lectures being recorded so that they can be
reviewed anytime (4.21), class sessions being held using Google
Hangouts, Virtual Classroom, etc. (3.98), classes being held during the
registered schedule (3.72), attendance being
recognized (3.70), and, lastly, active student participation during online classes
(3.47).
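
Agreement measured on a 1-to-5 Likert scale is typically summarized per statement as a mean rating between 1 and 5. The sketch below shows that calculation; the `responses` list is hypothetical and is not data from the study.

```python
from statistics import mean

# Hypothetical answers to one survey statement,
# coded 1 = strongly disagree ... 5 = strongly agree.
responses = [5, 5, 4, 5, 3, 5, 4, 5]

# Reject values outside the scale before summarizing.
if not all(1 <= r <= 5 for r in responses):
    raise ValueError("Likert responses must be between 1 and 5")

# Mean rating for the statement, on the same 1-5 scale.
mean_rating = round(mean(responses), 2)

# Share of respondents who agreed (rated 4 or 5), as a percentage.
share_agree = round(100 * sum(r >= 4 for r in responses) / len(responses), 1)

print(f"Mean rating: {mean_rating}")
print(f"Agree or strongly agree: {share_agree}%")
```

The mean rating and the percentage who agree answer different questions, so a report should state which of the two a figure represents.
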
As for feelings, most students felt uncertainty (59.5%), anxiety (50.7%), and
nervousness (41.2%), while few felt happy (4.7%) or excited (3.4%). For the four
open-ended questions, the responses to number 1 were communication with students,
understanding, weekly updates (in any available and applicable application, and if
possible on all platforms recognized by the class), and a better-structured
syllabus. For number 2, professors should keep updating students on class
materials (PowerPoint slides, lectures, etc.) and posting presentations and grades on
Brightspace. For number 3, professors should use Google Hangouts or Zoom for classes that
require attendance and presence; keeping attendance and giving updates on daily changes and
assignments were also favored, as were continued use of video elements such as
pre-recorded lessons and the use of OneNote, which is already installed on every
Windows computer, for sharing lessons. For question number 4, professors’
understanding, adaptability, flexibility, and relationships with students
must be maintained. Additional answers included being more
accommodating and considerate with deadlines, holding discussions before giving
assignments, and being considerate about tasks, because students are
enrolled in more than one subject and other subjects also give
tasks. Students also asked to be allowed to share their opinions on the topics or
lessons (Murphy, Eduljee, & Croteau, 2020).

Retention of University Teachers and Doctoral Students in UNIPS Pedagogical Online Courses

Massive Open Online Courses (MOOCs) and open educational materials allow students
who are not financially privileged, as well as working people, to participate in classes,
and their cost is much lower than that of face-to-face classes. Small Private Online
Courses (SPOCs) are another type of online class, intended for a small private group of
students. Open online platforms offer online courses, certifications, and degrees that
cater to thousands or millions of students.

Learners have different intentions in joining online classes: to complete the
class, to gain access to the learning materials, or to
find inspiration for studying. Despite the benefits that
students gain from online classes, financial and scheduling difficulties still get
in the way.

MOOCs are therefore designed with these specific goals: asynchronous and flexible
schedules, and study materials and lectures matched to students' skills and
knowledge. To reduce the number of drop-outs, three practices are being developed:
buddying, that is, social support and continuous communication with classmates and
professors; feedback, that is, the ability to exchange answers and queries, which
promotes healthy online discussion; and briefing, which takes place at the beginning of
the course and gives students their first impression of it.

After completing the first year of a MOOC, the majority of students did not enroll
for the following years of classes and did not continue. Those
students who did stay in their classes shared the following traits and practices:
continuous interaction and communication with a facilitator, past
experience in education, determination from beginning to end
to complete the MOOC, ways of self-learning, a focus on the
application of knowledge and skills rather than purely on thinking, memorizing, and
analyzing, and interaction and communication with peers.

University Pedagogical Support (UNIPS) is an open learning environment
designed under the agreement of 8 universities (Turku, Aalto, Hanken School of
Economics, Jyvaskyla, Lappeenranta University of Technology, Oulu, Tampere, and
Eastern Finland). It was designed to help aspiring teachers and students attend
pedagogical training even if they are outside Finland. Using UNIPS, the following
research questions were formulated for the study: When are
participants likely to drop out? Do age, gender, faculty, and position at the university
correlate with students’ retention in classes? Are there students who stop and
later return and complete the modules? How common is such
behavior? Does giving pre-tasks at the beginning of the class help students stay
in their classes? Quantitative data and statistical analysis were used to answer these
questions. The participants were university staff members,
employees, and teachers (90) and doctoral students (314). There were 231 female
and 173 male participants, all between 21 and 65 years of age.

According to the results, 154 of the 404 participants (38.1%) did not finish the module.
Half of the dropouts happened right after enrollment, because some participants who had
enrolled did not register in Moodle, meaning that fewer students actually dropped
out during the class. For the second question, on demographic
information, no statistically significant difference was found. In short, any individual,
with much experience or little, will complete the class as long as he or she perseveres.
On whether participants' faculty had an impact on
dropout rates, the passing and completion of modules were also considered in the
study, but the result was not statistically significant. The factors
discussed above, including age, position at the
university, and faculty (passing or completion of modules), therefore did not correlate
with retention. Finally, 22.1% of the students
who had dropped out returned, and over half of them completed (Laato, Lipponen, Vilppu,
Murtonen, & Lehtinen, 2020).
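
The participant counts and the dropout rate reported in this summary can be rechecked with simple arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# Counts quoted in the summary of Laato et al. (2020)
participants = 404
non_completers = 154
staff, doctoral = 90, 314
female, male = 231, 173

# Both breakdowns should add up to the full sample.
assert staff + doctoral == participants
assert female + male == participants

# Share of participants who did not finish the module, in percent.
dropout_rate = round(100 * non_completers / participants, 1)
completers = participants - non_completers

print(f"Dropout rate: {dropout_rate}%")
print(f"Completers: {completers}")
```

The computed rate matches the 38.1% reported in the summary, and both breakdowns of the sample sum to 404.
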

Comprehensive Assessment of Student Retention in Online Learning Environments Article

The study focuses on fully online, degree-seeking undergraduate learners who
completed at least one course at the American Public University System (APUS) in 2007,
and on the variables and factors that keep students in their online classes.
Beyond this main goal, many institutions are concerned about the rise in
student attrition, so the study was also conducted to gather
ideas on how institutions can improve the way they handle online classes. Data were
collected on December 31, 2009, from 20,659 respondents. The respondents
(instructors and students) reside in many different places, which allows the institution
to cut the costs of developing and maintaining the facilities that were traditionally
used before the transition of classes.

The research looks at how academically prepared APUS is for online delivery. The learner
population, traditionally students aged 18 to 24, changed dramatically during the 21st
century and now includes non-traditional students such as part-timers and adult learners,
so the researchers are interested in how the institution coped with the change. Another
group is military students: military courses have usually been taught face-to-face rather
than online, so the research tries to find out how institutions can provide better
education to military students through online classes. A further group is minority
students and students of different genders, for whom the research seeks a connection
between retention in online classes and factors such as ethnicity and gender. The study
used descriptive statistics and multivariate analysis to analyze the gathered data. Two
questions guide the results the study focuses on: what kind of student enrolls at an
online institution, and what factors make students stay in online classes.

The study first collected demographic data and grades, downloaded from the students’
applications or from the institution’s enrollment data (the APUS warehouse).

According to the results, 87% of APUS undergraduates applied to transfer to a different
college or university, and more than 82% of the undergraduates intended to transfer their
academic credit hours. Of the students, 1% had graduated, 25% remained active or were
still pursuing their course, and 74% were disenrolled. One reason for disenrollment is
students’ part-time jobs: working adults may extend their studies beyond the usual
duration because of their jobs, and some eventually disenroll because continuing is
impractical (lack of finances due to family responsibilities, etc.). Other reasons are
that students do not have time to complete the program and that the program is too
difficult, given that it is purely online with no physical contact with the professor.
Some students also disenroll in order to transfer: after earning high marks at APUS, they
decide to move to a more recognized university.

APUS disenrolls a student who has been inactive for 12 consecutive months. GPA data were
gathered from the APUS warehouse. Military students are assisted by the military’s Tuition
Assistance plan, which requires students to earn grades above F; a student who earns an F
must reimburse the tuition to the institution.

According to the survey, some students do not stay in an online class because they applied
to and enrolled in two or more institutions, which is possible because the classes are
online and a student can still participate even when residing far from the institution.
Struggling to cope with both courses or institutions, the student sometimes has to
sacrifice one. (Boston, Ice, & Gibson, 2015)

2.2 Related Studies

2.2.1 Foreign Studies

Six Instructional Best Practices for Online Engagement and Retention

The first is to build community. The instructor needs to make students feel that they are
part of the class, giving each one a sense of belonging as a person and as a student. The
instructor can acknowledge the contributions each student makes to the class and show
that their insights are valued. By doing so, the student feels that he/she truly belongs
to the class and is not an outsider or unwanted. The instructor can also offer students
virtual, real-time ways of communicating with their peers, helping them engage in the
class even if only virtually.

The instructor also needs to communicate effectively with the students, not only while the
class is in session but also before it begins and after it ends. This gives instructor and
students, and students among themselves, regular communication and a better understanding
before, during, and after the class. It also brings out different insights on current or
past lessons, helping the instructor clarify any unclear topic in the lesson.

The second is clarifying the instructor’s online course expectations and objectives. The
first step is to create a syllabus that includes course objectives, learning outcomes,
assignments, the evaluation method, student participation requirements, textbook
information, etc., which will serve the students as a guideline on what to expect during
the class. The instructor also needs to help students prepare for the course by advising
them on what to do while taking it in order to perform well and earn good grades.
Students should also be allowed to ask as many questions as they want before the course
starts; this lets them share their personal opinions or misconceptions about the class
and helps them form a clear view of the course. For this to happen, instructors need to
communicate with the students, offer help, and encourage them to ask.

The third is identifying the best online tools for interacting with the students. In a
synchronous class, large-group and small-group discussions are important in helping
students engage with the course content, so the instructor needs to find the best and
most applicable software and tools for completing the course. Students should be allowed
and encouraged to voice their ideas through audio recordings or chat so that they can
contribute during discussions. The instructor also needs to provide guide questions in
small-group activities to keep students’ attention on the material. Forums are another
channel the instructor can provide for students to express insights about the course.
Requiring students to participate in discussion posts and respond to classmates’ posts
helps the class have a healthy discussion and peer-to-peer learning.

The fourth is promoting the exchange of ideas, whether teacher-to-student or
student-to-student. The professor can keep discussions, lessons, projects, and assignments
accessible 24/7 so that students can continue working on them. This gives students more
time to understand the lessons and readings and to prepare before any activity. Most
students in a class share similar ideas about a specific topic, so they may post these in
the discussion forum, supporting or contradicting each other while keeping the discussion
healthy. The instructor can point out that students should not simply agree with each
other but should also add ideas and perceptions, such as real-life applications and
personal reflections on the topic, to keep the conversation going. By doing so, the
instructor can better understand the different perspectives of every student.

The fifth is providing timely, relevant, and actionable feedback, with concrete and
specific examples and steps that help students improve their knowledge and skills for the
task. Feedback should be positive, complimenting the parts where the students did a great
job, and should also include constructive criticism that helps students find the wrong,
missing, or lacking parts of their work. This can be done by commenting, asking follow-up
questions for deeper understanding, pointing to evidence and explanations of why the work
is incorrect, and summarizing the overall context of the activity. By doing so, the
student can easily improve the parts that the instructor has criticized.

The sixth is creating a student-centered environment: not just holding students to a
strict schedule but also anticipating unexpected events or situations (tragedies, illness,
etc.). Adult students might have full-time jobs, family demands, and other personal
responsibilities that require flexible schedules. Instructors must therefore show
understanding, flexibility, and sensitivity to each student’s situation.

To summarize, student-to-student and faculty-to-student interaction and communication are
important in promoting students’ retention in an online class. Instructors need to find
and promote effective ways of communicating with the class by using existing systems.
Forums are essential for expressing one’s ideas, so the instructor can use them to learn
individual perceptions of the topic and to ensure that everyone participates and provides
the necessary support to the class. (Poll, Widen, & Weller, 2014)

Use Data Mining to Improve Student Retention in Higher Education - A Case Study

Online classes still show serious retention issues that need to be addressed. Several
studies have been conducted specifically to observe when and why students withdraw from
their courses. When students are not used to or acquainted with the online educational
delivery system, they are more apt to be frustrated by the disparities between their
long-term memories of face-to-face courses and the new realities of online learning that
they are forced to face. Students tend to communicate with their instructors more to get
help with a problem and less to seek actual guidance that facilitates their learning.
Another factor in this equation is that many online classes follow constructivist models
of teaching, in which learners are given props and aids to learning but are left to
resolve complex problems on their own. As a result, the online environment can become
less guidance-oriented, which in turn may not be conducive to retention (Bawa, 2016).

Determining and measuring student perception can be used as one way of identifying the
variables that are most important to students (Herbert, n.d.). To provide meaningful
analysis, data mining techniques may be a solution that yields insights beyond the
information explicitly stored. Data mining can be oriented to individual students, and it
can also be applied to course data. It could find the modules that are critical for a
particular course because they cause more students to drop out; this may help the
university evaluate module suitability and prepare programs and courses so that students
have the best probability of success, both personally and academically (Oussena, Kim, &
Clark, 2014).

Staying the Course Online Education in the United States, 2008

During the fall of 2007, an estimated 12.9 million students were enrolled in online
classes, and rapid growth in online enrollment was observed. Most of the students were
undergraduates, while others were taking graduate-level or other for-credit courses. The
rising number of unemployed individuals prompts universities to offer more online
courses, because a decrease in the number of good jobs encourages people to seek
education, and universities see this as an opportunity that benefits both the school and
the student.

Students who choose online classes help reduce fuel costs, since the budget intended for
commuting to face-to-face classes is reduced or eliminated entirely. Given these
advantages, online classes become more desirable. Classes that accept working adults also
make institutions optimistic about potential enrollment growth.

Public institutions favored online classes and saw them as the most important element of
their long-term strategy in education, while most private and baccalaureate institutions
described online education as a critical way of educating over the long term.

Online classes have limitations in each field, although some disciplines view them as a
critical strategy in education. According to the survey results for public institutions
during the fall of 2007, the discipline with the highest penetration of online classes
was Business, followed by Liberal Arts and Sciences, General Studies, and Humanities;
Health Professions and Related Sciences; Education; Computer and Information Sciences;
Social Sciences and History; Psychology; and lastly Engineering, which had the lowest
penetration of online classes among public institutions.

Broken down by institutional control, still for the fall of 2007, the order was the same:
Business had the highest penetration, followed by Liberal Arts and Sciences, General
Studies, and Humanities; Health Professions and Related Sciences; Education; Computer and
Information Sciences; Social Sciences and History; Psychology; and lastly Engineering.

In summary, the report indicates that public institutions are the most likely to provide
online courses to students and lead all institutions in online offerings. There is also a
relationship between the size of the institution (overall enrollment) and its opinions
about the online classes and activities it provides: bigger institutions began online
classes sooner, have a more positive view of them, and provide more courses than other
institutions.

Online classes allow institutions to expand their geographic reach. Students can reside
anywhere (locally or regionally) as long as an internet connection is available, so
institutions can provide services and reach students even outside their local areas.
According to the study, private institutions are more likely to serve students outside
their local and regional areas, while public institutions remain local. (Allen & Seaman,
2008)

Retention, Progression and the Taking of Online Courses


This study focuses on three modes of classes: purely face-to-face classes, purely online
classes, and a combination of the two. The researchers chose five on-ground community
colleges (213,056 student records), five on-ground four-year universities for purely
face-to-face classes (113,036 student records), and four institutions for online classes
(330,166 student records). The study also examines the status, gender, and age of the
students to explore whether these are connected to students’ retention in their classes.
Data were gathered between September 2009 and December 2012 using the Predictive
Analytics Reporting (PAR) Framework. PAR provides tools and resources for identifying
factors that may put students at risk, which may help instructors find solutions that
improve students’ chances of success.

Data included in PAR are demographic information (age, gender, ethnicity, etc.), prior
academic information (High school GPA, earned college credits, etc.), latest academic
information (majors pursued), and students’ taken course information.

Students were considered retained in their course if they were enrolled during the
12-to-18-month period. Student progress over the first 8 to 9 months was included as a
credit ratio based on the credits earned.

The study tries to answer three research questions:
1. Do college students in online courses have a poorer completion rate, and is their
   retention rate lower than in on-ground classes?
2. Does the delivery mode greatly affect the retention rate of students?
3. Is there a difference in retention rates between the different ways of delivering
   lessons in online classes?

To arrive at better results, exploratory analysis was used in the study. Because the study
is not a controlled experiment (the institutions differ and are located in different
places), the researchers were concerned that differences between delivery modes might
simply reflect inherent differences between institutions. The researchers therefore
focused not on student success itself but on the significant variables that help students
stay in their classes, and for this a Logistic Regression Model (LRM) was used. The effect
of the delivery mode was added to the LRM to determine the relationship between students’
retention and the course delivery mode.

For purely online classes, the study considered the following institutions: a public
community college in the Southwest, a public four-year university on the East Coast, and
two for-profit universities. Their enrollments ranged from 68,000 to almost 100,000
students, the share of students receiving Pell Grants ranged from 20% to 80%, and on
average more women than men were enrolled. The five primarily on-ground community
colleges were public institutions located in Florida, Ohio, Texas, Washington, and
Hawaii.

According to the results, purely online courses had the highest average number of
enrollees, ahead of blended and purely on-ground courses. Following their first
enrollment, however, 58% of the students taking both online and on-ground classes were
retained, compared with 51% of the students taking only on-ground courses and 30% of
those taking purely online courses. The delivery modes were also compared using odds
ratios. Courses in blended mode (online and on-ground) had the highest value at 19.2,
followed by purely on-ground (16.8) and purely online (10.2). This means that the
combination of online and on-ground classes performed best, suggesting that using both
modes helps students more in their classes. Purely online courses had the lowest value,
which means that online classes alone are not very helpful to some students.

The second question asks whether the delivery mode affects particular groups of college
students, again using retention over the 12-to-18-month period. Among students without
Pell Grants, retention was 53% for blended, 47% for fully on-ground, and 22% for fully
online. Among students with Pell Grants, it was 62% for blended, 57% for fully on-ground,
and 42% for fully online. The results indicate that students who receive Pell Grants (a
type of financial aid granted only to some students) have a higher average retention
than those who do not, meaning that students need some financial support to stay in
their classes. Institutions should also consider this factor in order to help students.

As for retention by gender, the results show that women had a higher retention rate than
men, and women (51%) were more likely than men (43%) to take online classes. The study
nevertheless indicates that the delivery mode is not affected by gender. For retention by
age, participants were divided into those younger than 25 and those older than 25.
Students older than 25 had a higher average retention rate in online classes than
students younger than 25, whereas students younger than 25 had a higher average retention
in on-ground courses. The study suggested that younger students taking online classes
are at some risk, as are students without grants regardless of delivery mode (blended,
fully online, or fully on-ground/face-to-face). (Scott, Swan, & Daston, 2016)

Designing Online Courses to Promote Student Retention

Retention means pursuing a course from the beginning until it is completed. The rate of
students dropping out is increasing, and the most common reasons are family or personal
obligations, work, marriage, lack of connection with peers, limited experience as a
college student, limited experience with online courses, and sometimes courses that fail
to interest the student. The study focuses on four groups of factors: personal variables,
which include the demographic information of each student; individual variables, which
pertain to student skills and goals, past educational experiences, etc.; institutional
variables, which pertain to social factors, vision, mission, and policies, or any other
factors related to the institution or school; and circumstantial variables, which touch
the student’s life outside school, such as work and family obligations. To carry out the
study, the researchers used the Quality Matters Standards. Across the 11 courses offered,
the retention rate was 95.5%, so the study was developed to give other institutions
examples that may help them come up with better strategies for designing courses.

The study focuses mainly on the instructional materials used in designing the courses,
which may have a better impact on students and provide better ways of achieving each
subject’s goals throughout the semester. The elements provided give a general view of the
technical, academic, and student support services that serve as the main features of
online courses.

The study was conducted mainly with Psychology students as respondents. The activities
provided were designed to be accessible to students with different skills, disabilities,
etc.

After being given the materials needed for the lectures (PowerPoint presentations,
recorded video lessons, etc.), the students were required to complete at least three
activities, such as reviewing lectures, watching videos related to the lecture, and
engaging in interactive crossword puzzles. After that, the students were required to
answer four questions: What important concept/s did you find or gain after completing the
activity? Why do you think this idea is important? How can you apply the concept you
gained to your life? What idea or concept from the activity raised questions or curiosity
in your mind? Instructors accepted answers through e-mail attachments (PDF files,
photographs, etc.), fax, and written work.

The instructor needs both instructor-student interaction and student-student interaction
to make the course more engaging for learners: instructor-student interaction for
lectures, raising queries, etc., and student-student interaction for peer discussions and
group work. (Dietz, Han, & Fisher, 2019)

2.2.2 Local Studies

Sentiments Analysis on Synchronous Online Delivery of Instruction due to Extreme Community
Quarantine in the Philippines caused by COVID-19 Pandemic

COVID-19 hit the whole world, including the Philippines, and disrupted education. As a
result, students and teachers face problems in how to hold discussions and deliver
instruction. Through online discussions, teachers can still give online activities to their
students, but not all students, especially those of the College of Business and Public
Administration, can afford an internet connection. Still, they are trying their best to adapt
to the situation, since it is new to all of us and virtual classes have become the new norm of
learning for students and teachers. The researchers therefore prepared a survey questionnaire
with open-ended questions, created in Google Forms and distributed through Messenger group
chats to students of the College of Business and Public Administration at Pangasinan State
University, to reveal the students’ sentiments about the synchronous online delivery of
classes.

According to the survey, more than half of the students in Northern Luzon have access to
internet connectivity, with 58.71% reporting so. The majority of the Pangasinan State
University students (66.55%) have a negative sentiment about the synchronous delivery mode,
for the reason that they might face problems with it. Another 29.39% of the students are
neutral, meaning they are not sure whether they will face a problem, and lastly 12 students
have a positive sentiment and may not face problems with online classes. The negative results
show that most of the respondents cannot adapt to the new norm of online classes and that the
students of the College of Business and Public Administration are not yet ready for
synchronous classes. (Pastor, 2020)

Barriers to Online Learning in the Time of COVID-19: A National Survey of Medical Students in
the Philippines

During the early part of 2020, COVID-19 disrupted undergraduate medical education worldwide,
including in the Philippines. It affected over a billion students around the world, who now
face difficulties with the new norm of learning. Virtual classes have become the new norm for
students in the Philippines, with no face-to-face classes. When the disease became known, no
treatment or vaccine was available and it was highly contagious, so medical schools suspended
classes and removed medical students from their clinical placements. To sustain medical
education, it became necessary to pivot to online learning, e-learning, or internet-based
learning as the primary means of curriculum delivery. Online learning offers medical students
advantages such as easy access to information, cost-effectiveness, ease of standardizing and
updating content, and enhancement of the learning process.

The survey was sent to medical students from the University of the Philippines, following
announcements of the changes being implemented due to the COVID-19 crisis. Using background
data, the researchers developed a 23-item questionnaire with a 4-point Likert scale (strongly
disagree, disagree somewhat, agree somewhat, or strongly agree) that collected demographics,
medical school information, access to technological resources, study habits, current living
conditions, and views on online learning.

The survey received a total of 3,813 responses. Of these, 75 (2%) were removed because of
duplicate email addresses or student numbers, and 68 (2%) were from respondents who were not
medical students. Hence, 3,670 responses (96%) were included in the data analysis,
representing 15% of the estimated 25,000 medical students in the Philippines. By year level,
1,153 (31%) were first-year, 1,015 (28%) second-year, 863 (24%) third-year, and 639 (17%)
fourth-year students. The mean age was 23.8±2.4 years. Females (n=2468, 67%) outnumbered
males (n=1109, 30%) at a ratio of 2.2:1, and 39 (1%) identified as non-binary. On device
ownership, 93% of the medical students have a smartphone, 63% have a tablet, and 83% have a
laptop or desktop computer. (Baticulon et al., 2021)

Online Classes and Learning in the Philippines during the COVID-19 Pandemic

The COVID-19 pandemic brought a lot of disruption around the world. Millions of people were
affected by the disease, and lives were put in danger because it is contagious. The Philippine
Government therefore advised people to stay at home and implemented health protocols for
everyone, under which those below 17 years old and above 60 years old were to remain at home
to stop the virus from spreading. The pandemic affected both offline and online delivery modes
of education. Not all students can afford an internet connection; some have a weak connection
because of where they live, and some do not have a smartphone or other gadget to use.

Interviews and surveys were conducted across 29 institutions: 22 private schools,
universities, and colleges, and 7 state colleges and universities and/or local colleges and
universities, located in Metro Manila, Batangas, Cavite, Laguna, Bulacan, and Lanao del Norte.
Of the respondents, 62% were educators and 38% were students. Inputs were gathered on how the
educators communicate with the students and on the tools used by the institutions. For the LMS
(Learning Management System), they used free versions of tools such as Google Classroom,
Moodle, Edmodo, Blackboard LMS, and Schoology. For video conferencing, they used Zoom, Google
Meet, and Collaborate. For communication, students and educators used Facebook Messenger, SMS,
Google Hangouts, Facebook, Viber, and WhatsApp.

Some of the institutions opted to hold face-to-face virtual classes using video conferencing
and to post activities, quizzes, and lectures through the LMS. With this setup, students and
educators face many problems: poor internet connection, students who do not own a smartphone
or laptop for online classes, lack of knowledge in using the technologies, and many more
concerns. (Ignacio, 2021)

Chapter 3
Methodology

3.1 Research Locale


The study will be conducted at Western Mindanao State University. This place was selected
because the focus of the study is the students of the said institution. The study will
cover both the Computer Science and IT students of the Institute of Computer Studies.

3.2 Population and Sampling

The study will be directed to the students of the Institute of Computer Studies. The
researchers will first use as a basis the 1st-year CS and IT students of School Year
2018-2019. By training the machine on patterns in the grades of these first samples during
the 1st semester of that school year and on the subsequent enrollment records from SY 2018
to the most recent, the machine will perform the retention prediction of the students.
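
As a minimal sketch of the retention prediction described above, the following Python code fits a logistic regression by plain gradient descent on first-semester grades. The toy grades and labels, the function names, the learning rate, and the number of epochs are illustrative assumptions only; they are not the study's data or its final implementation.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    # Standardize each feature so gradient descent behaves well.
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = []
    for j in range(d):
        var = sum((row[j] - means[j]) ** 2 for row in X) / n
        stds.append(math.sqrt(var) or 1.0)  # guard against constant columns
    Z = [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for zi, yi in zip(Z, y):
            err = sigmoid(sum(wj * zj for wj, zj in zip(w, zi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * zi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return {"w": w, "b": b, "means": means, "stds": stds}

def predict_proba(model, x):
    """Estimated probability that a student is retained (class 1)."""
    z = [(xj - m) / s for xj, m, s in zip(x, model["means"], model["stds"])]
    return sigmoid(sum(wj * zj for wj, zj in zip(model["w"], z)) + model["b"])

def predict(model, x, threshold=0.5):
    """Return 1 (retained) or 0 (not retained)."""
    return 1 if predict_proba(model, x) >= threshold else 0

# Hypothetical toy data: one feature, the first-semester grade average
# (assuming 1.0 is the best grade and 3.0 the lowest passing grade).
X_train = [[1.25], [1.5], [1.75], [2.0], [2.5], [2.75], [3.0], [3.0]]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = retained, 0 = not retained

model = fit_logistic(X_train, y_train)
print(predict(model, [1.25]), predict(model, [3.0]))
```

In practice the feature vector would hold one entry per subject grade rather than a single average, and the labels would come from the enrollment records.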

3.3 Research Instrument


The study will be conducted using the profile of the respondents (grades).

Interval Scale of Measurement

Variable Name     Measurement Values
Grades            1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3

3.4 Validity of the Instrument
The researchers will ask permission from the Admin of the Department for the data of the
1st-year students in the batch of 2018-2019, and will follow the proper guidelines for
handling the data with utmost care and security.

3.5 Data Gathering Procedure


The researchers are focusing on the Computer Science and IT students of the Institute of
Computer Studies. To track the behavior of the 1st-year college students throughout their
entire journey in college, the data will be based on the 1st-year college students of school
year 2018-2019. Students are considered retained in their course if they have finished the
entire 4-year course. The 1st semester of school year 2018-2019 is the main basis for data
collection, and the subsequent enrollment records are the data that will be used to determine
whether or not the students were retained in their course.
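
The labeling rule above can be sketched as follows. This is only an illustration under stated assumptions: the term labels, the assumption of two semesters per school year, the rule that a retained student appears in every expected term, and the student IDs are all hypothetical and are not taken from the ICS records.

```python
# Hypothetical term labels: two semesters per school year over a 4-year course
# starting in SY 2018-2019 (an assumption for illustration).
EXPECTED_TERMS = [
    f"{year}-{year + 1}/S{sem}" for year in range(2018, 2022) for sem in (1, 2)
]

def label_retention(enrolled_terms, expected_terms=EXPECTED_TERMS):
    """Return 1 if the student appears in every expected term, else 0."""
    return 1 if set(expected_terms) <= set(enrolled_terms) else 0

# Made-up enrollment records keyed by an anonymous student ID.
records = {
    "S001": list(EXPECTED_TERMS),  # enrolled in all eight semesters
    "S002": EXPECTED_TERMS[:3],    # stopped after three semesters
}
labels = {sid: label_retention(terms) for sid, terms in records.items()}
print(labels)
```

A stricter or looser rule (for example, allowing a leave of absence) would change only `label_retention`; the rest of the pipeline would stay the same.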

After seeking approval from the thesis adviser and Admins of the Institute of Computer
Studies Department, the researchers will now begin data gathering.

The researchers will gather data from the ICS Department over approximately 1 month
(depending on the situation, especially during the pandemic), focusing on the first-year
students of the 1st semester of school year 2018-2019 and on the enrollment records from
SY 2018 to the most recent.

The researchers will collect raw data on approximately 200 Computer Science and IT students
from the ICS Department, based on the enrollment records of the 1st semester of SY 2018-2019.
The researchers will ask permission from the Admins of the Department, and if allowed, they
will handle the data with utmost care and security.

The data will be used for thesis purposes only, and the researchers are responsible for its
security and for preventing data leakage. To protect the respondents' data, the researchers
will minimize the use of personal data, specifically the names of the respondents. To do this,
the researchers will replace the respondents' personally identifying information with
different identifiers so that the personal records are kept confidential and anonymous.

After collecting the data, the researchers will select the data most relevant to the study.
To reduce the possibility of keeping non-relevant data, the researchers will follow a clear
selection and review process for the gathered data.

3.6 Research Design


An Exploratory Research Design will be used for the implementation of the study. The
study will explore and build a deeper understanding of the different techniques for
predicting the retention of students in the Institute of Computer Studies. Different
tools and algorithms are being considered to help the researchers obtain accurate and
more reliable results at the end of the study.

3.6.1 Flowchart
Figure 2 Flowchart
The researchers will first prepare the raw data for filtering. Filtering is used to
find null values in the records. Because the records and the population are very
limited, the researchers cannot delete data, so they will replace the nulls with a
summary value (mean, median, or mode).

They will then save the data to the database and classify it as either demographic
information or answers from the formulated survey. The researchers will analyze the
data and, from the analysis, provide a visualization of each result. After that, they
will use the data to train the logistic regression algorithm, process the data, and
produce a result of 0 for a student who is not retained in their course and 1 for a
student who is retained in their course.

3.6.2 ERD
Figure 3: ERD

For the students' grades, the student ID will be the main basis for identifying the student,
and for security and privacy reasons respondents' IDs will be used instead of names. The
variables that will be used in the study are the students' grades.
These variables will be the main basis for the model that the researchers will create to
evaluate and predict whether a student is likely to be retained or not.

3.6.3 Activity Diagram

Figure 4 Activity Diagram

Assuming that the proposal and the request for the necessary information have been
approved by the thesis instructor and the administrators of the department, the researchers
will extract the students' grades from the database of the ICS department. The students who
enrolled in the first semester of school year 2018-2019 are the main basis for data
gathering. They are considered retained in their courses if they have been continuously
enrolled throughout their 4-year course, and those who have shifted, transferred, or dropped
are considered not retained in their course.
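
The labeling rule described above can be sketched as follows. This is only an illustration under stated assumptions: the semester codes, the student IDs, and the function name label_retention are hypothetical and do not come from the ICS records.

```python
# Hypothetical semester codes for the tracked enrollment periods (assumption).
SEMESTERS = ["2018-1", "2018-2", "2019-1", "2019-2", "2020-1", "2020-2", "2021-1"]


def label_retention(enrolled_semesters, tracked=SEMESTERS):
    """Return 1 if the student is enrolled in every tracked semester, else 0."""
    return 1 if all(sem in enrolled_semesters for sem in tracked) else 0


# Illustrative records keyed by anonymized student ID (made-up data).
records = {
    "S001": set(SEMESTERS),          # continuously enrolled
    "S002": {"2018-1", "2018-2"},    # stopped, shifted, or transferred
}
labels = {sid: label_retention(sems) for sid, sems in records.items()}
print(labels)
```

A returnee who skips a semester is labeled 0 under this strict reading of "continuously enrolled"; a looser rule would need a different check.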
3.7 Exploratory Data Analysis

3.7.1 Univariate Analysis


A thorough analysis of the data is needed so that the model can be trained on clean
inputs. The following tools are used to test and analyze individual variables: the
Mean, Median, and Mode, which are the most common tools for analyzing individual
variables and help us examine each variable more thoroughly. The Standard Deviation is
a measure of the spread of the data around the mean. It will help us determine whether
a value is expected or unusual for the study. Skewness and Kurtosis through Pearson's
calculation will help the researchers identify which variables contribute most to
achieving the best performance in predicting retention.
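
A minimal sketch of these summary statistics, using only the Python standard library. It assumes that "Pearson's calculation" refers to Pearson's second skewness coefficient, 3 * (mean - median) / standard deviation, and to the moment-based kurtosis m4 / m2^2; the grade values are made up for illustration.

```python
import statistics


def describe(values):
    """Summarize one numeric variable (assumes the values are not all equal)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return {
        "mean": mean,
        "median": median,
        "mode": statistics.mode(values),
        "std": sd,
        "skewness": 3 * (mean - median) / sd,  # Pearson's second coefficient
        "kurtosis": m4 / m2 ** 2,              # Pearson's moment coefficient
    }


grades = [1.0, 1.5, 1.5, 2.0, 2.5, 3.0, 5.0]  # illustrative values only
summary = describe(grades)
print(summary)
```

A positive skewness here means a few large grade codes (such as 5 for Failed) pull the mean above the median.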

3.7.2 Visualization
The following plots will be used to give a clear visual representation of the
results:

• Histogram Plot

Figure 5: Example of Histograms (KHUSHIS, Exploratory Analysis Using Univariate, Bivariate,
and Multivariate Analysis Techniques, 2021)

• Bar Chart

Figure 6: Example of Bar Chart (KHUSHIS, Exploratory Analysis Using Univariate, Bivariate, and
Multivariate Analysis Techniques, 2021)

3.7.3 Bivariate Analysis


To establish how one variable relates to another, the researchers will perform
Bivariate Analysis using the correlation coefficient. This technique will help the
researchers identify how each variable correlates with the target, and whether variables
such as behavior and other features are connected to the retention of the student in
the class.

3.7.2.4 Visualization
 Correlation Map Scatterplot
Figure 7 Example of Correlation Map Scatterplot (Glen)

A positive correlation will help the researchers identify which variables move together:
if one variable increases, the other variable also increases. No correlation will help them
identify which variables are not related to each other at all. A negative correlation will
help the researchers identify which variables increase as other variables decrease. This
visualization will help the researchers see the connections.
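
The correlation coefficient discussed above can be computed as in the sketch below. It implements the standard Pearson formula by hand; the sample values are made up, and the variable names gpa and retained are assumptions used only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative data: average grade per student and the retention label (1 or 0).
gpa = [1.25, 1.5, 2.0, 2.75, 3.0, 5.0]
retained = [1, 1, 1, 1, 0, 0]
r = pearson_r(gpa, retained)
print(round(r, 3))
```

In this made-up sample r is negative, because larger grade codes (for example 5 for Failed) go together with a retention label of 0.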

3.7.4 Dealing with Missing Data


To deal with missing data, the researchers will replace nulls with a summary
value. Specifically, they will use the median to fill in the missing values in the
gathered data. In this way, the researchers can fill in the null values needed for
training the model.
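
A minimal sketch of median imputation, assuming missing grades are represented as None; the function name impute_median and the sample column are illustrative only.

```python
import statistics


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


grades = [1.5, None, 2.0, 3.0, None, 2.5]  # illustrative column with two nulls
filled = impute_median(grades)
print(filled)
```

The median is computed from the observed values only, so the filled entries do not influence the fill value itself.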

3.7.5 Handling Categorical Data

The researchers will use the following data:

• Grades of the Student

All the data will be sorted and grouped according to the method of interpretation
that will be used. For categorical items with two options, such as the preference for
projects/schoolwork to be done by group or individually, the researchers will convert
the answers to numbers, with 1 and 0 representing the two (2) options for each question.

3.7.6 Normalizing Data

Z-scores will be used to normalize the data. The researchers will use the Z
formula and table to standardize the inputs, which will help the model read the
data more easily and accurately, especially the GPA of the students. Z-scores will
also help the researchers normalize inputs with larger values so that the model can
read the data without misinterpreting it.
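
The Z formula is z = (x - mean) / standard deviation. A minimal sketch follows, assuming the population standard deviation is used; the input values are illustrative.

```python
import statistics


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


z = z_scores([1.0, 2.0, 3.0])
print([round(v, 4) for v in z])
```

When the data are split into training and test sets, the mean and standard deviation would normally be computed on the training records only and then reused for the test records.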

3.8 Data Splitting

Our target sample is 150 records. For training and testing, the Cross-Validation
technique will be implemented to produce clear, consistent, and accurate results, to
support good generalization, and to avoid overtraining (overfitting) the model.

3.8.1 Train Data

3.8.1.1 Validation Data

The target sample for training is 150 records, which will be the first data used to train
the model.

3.8.2 Test Data


50 records are the target for testing data.

3.9 Build and train the Model

3.9.1 Algorithm

Logistic Regression is the Machine Learning algorithm that the researchers
will use for the study, with a sample size of around 150. The researchers chose
Logistic Regression because it is well suited to small sample sizes and to
classification data.
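
To make the algorithm concrete, here is a minimal from-scratch sketch of binary logistic regression trained with batch gradient descent. It is not the researchers' implementation: the learning rate, the number of epochs, the single standardized grade feature, and all function names are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return 1 (retained) when the predicted probability reaches the threshold."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
        for xi in X
    ]


# Illustrative data: one standardized grade feature per student (made-up values).
X = [[-1.2], [-0.8], [-0.3], [0.4], [0.9], [1.5]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_logistic(X, y)
preds = predict(X, w, b)
print(w, b, preds)
```

The learned weight is negative in this toy sample, meaning a larger (worse) standardized grade lowers the predicted probability of retention.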

3.9.2 Hyperparameter Tuning

Bayesian Optimization will be used. In this approach, a belief (prediction)
about which hyperparameter values perform well is formed beforehand from past
evaluations, and that belief is updated during training. After thorough
research, the researchers decided to use this technique because it lets them
not only monitor the progress but also track the changes and improvements.

3.9.3 Cross-Validation

The following techniques will be used to help the researchers identify the best
model for predicting students' retention in online classes during a pandemic:

• Four-Fold Cross-Validation: the data is divided into 4 blocks, and each block is used
once for testing.
• Leave-One-Out Cross-Validation: the researchers will test on each sample in turn, one
sample at a time.
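
Both techniques above amount to different ways of splitting record indices into training and testing parts. The sketch below generates those splits for the target sample of 150 records; the function name k_fold_indices is hypothetical, and the records are assumed to have been shuffled beforehand.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


four_fold = list(k_fold_indices(150, 4))   # Four-Fold Cross-Validation
loocv = list(k_fold_indices(150, 150))     # Leave-One-Out is k equal to n

print([len(test) for _, test in four_fold])
print(len(loocv))
```

Leave-One-Out is simply the special case where the number of folds equals the number of records, so the model is trained 150 times.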

4.0 Evaluate Performance


Evaluating performance after the analysis of the data is important because it
shows how effective the model is at predicting, and it helps the researchers choose
the techniques that give higher accuracy. The researchers will use the following
measures:

• Recall and Precision: recall is the share of actually retained students that the model
identifies, and precision is the share of students predicted as retained who are
actually retained. These guide the choice of the analysis whose outputs are most accurate.
• Confusion Matrix: some predictions will be false negatives or false positives, and
this technique will help the researchers easily identify those cases and provide
solutions for them.
• Specificity: the researchers will also identify students who are most likely to
transfer, change course, or drop out. This means that students who are not likely to
be retained in the course are also considered and identified by the researchers.
• Accuracy: the target is 75% classification accuracy.
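
The measures listed above can all be derived from the four cells of the confusion matrix. The following sketch assumes the label 1 means retained and 0 means not retained; the true and predicted labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating label 1 (retained) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall, and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        return num / den if den else 0.0  # avoid division by zero

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


y_true = [1, 1, 1, 1, 0, 0, 0, 1]  # illustrative ground truth
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]  # illustrative model output
scores = evaluate(y_true, y_pred)
print(scores)
```

In this made-up example the accuracy is 0.75, which would just meet the 75% target, while specificity (about 0.67) shows how well non-retained students are identified.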

5.0 Deployment

The system will be deployed only as a Model Object. It will serve as one of the
functions of a system, and not as a whole system. The system will be programmed
using the Python programming language.
If the faculty or the Department Head of the ICS finds this project useful for the
Department, the researchers can integrate it into the Department's system, depending on
the language preferred by the client.
Chapter 4
Results and Discussions

4.1 Results and Discussions

The data we gathered come from the 1st Year Computer Science and IT students of batch
2018-2019. There are 7 semesters: 1st Sem School Year 2018-2019, 2nd Sem School Year
2018-2019, 1st Sem School Year 2019-2020, 2nd Sem School Year 2019-2020, 1st Sem School
Year 2020-2021, 2nd Sem School Year 2020-2021, and 1st Sem School Year 2021-2022. We
excluded the Summer of School Year 2020-2021 and the 2nd Sem of School Year 2021-2022
because the data (grades) were unavailable.

Figure 8. Enrollment Records

The illustration shows the enrollment records from the 1st Sem of School Year 2018-2019 to
the 1st Sem of School Year 2021-2022. Blue represents the number of students enrolled in each
school year, and orange represents those who dropped out, shifted, and the like. (Currently
the specific status of each such student, whether drop-out or shifted, is not known, because
the gathered data had no specific identifier for it.)

The results give a clear view of how the number of students decreased over time: 68% of
the 207 students enrolled at the beginning were retained in their course until their 4th Year
level. This shows that not all students who enroll in their respective courses remain
enrolled until their 4th year level in college. Some shifted to other courses, some stopped,
perhaps because of financial or personal reasons, and some transferred to other schools,
among other reasons. Some students returned to school (returnee students), but the graph
indicates that while some students return, others stop or are not retained in their course
at the same time. Roughly equivalent numbers of students stop and return, which is why the
graph only goes from a high point to a low point, with no increase in the number of students
enrolled between any semesters.

Several kinds of grades were used when the grades were encoded: the passing grades (1.0,
1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, and 3), 5 for Failed, 4 for Incomplete (INC), 6 for
Deferred Grade (DG), and 7 for Drop (DRP). We experimented with the grades, and the
following are the results:

Figure 9. Second semester 2018-2019

According to this graph, 133 students were enrolled in subject CC100 during the 2nd
sem of SY 2018-2019. Of these 133 students, 81% have a high probability of being retained
in this class and continuing their course, while the remaining 19% can still be retained in
their course.

This graph shows the 138 students enrolled in this subject. Of these, 78% have a
probability of being retained in this course, while the remaining 22% can still be
retained in their course.

Figure 10. First semester 2019-2020


According to this graph, 76% of the students enrolled in this subject have a high
possibility of being retained, while the remaining 24% still have a chance of being
retained in this class.

This graph shows the probability of retention for the students enrolled in this subject:
75% have a high possibility of being retained, and 25% still have a chance of being
retained.

This graph shows that 77% of the students have a high probability of being retained in
their course, while 23% still have a chance of being retained and can continue their
course throughout the years.

Figure 11. Second semester 2019-2020

This graph shows that the students enrolled in subject CS123 during the 2nd sem of
2019-2020 are most likely to be retained in their course. In this semester there were no
students who dropped out, stopped, or returned.

This graph shows that the students enrolled in subject MATH104 have a high probability
of being retained in their course. In this semester there were also no students who
dropped out or stopped.

According to this graph, in subject CC103 during the 2nd semester of 2019-2020 there
were no students who dropped out or stopped. The students have a high probability of
being retained in their course.

According to this graph, the students enrolled in this subject have a probability of
being retained in their course. Although one student dropped out, he/she can still be
retained in his/her course.

This graph shows that in the 2nd semester of 2019-20 the students of the CS121 class
have a high possibility of being retained and can continue their course. Some students
dropped this class, but this does not affect their retention and they can still be
retained.

According to this graph, the students enrolled in this class have a high possibility of
being retained and can continue their course. Even a student who got a 4 (INC) grade can
still be retained.

Figure 12. First semester 2020-2021

This graph shows that 98% of the students have a high probability of retention. Even a
student who got a 6 (Deferred Grade) can still be retained in his/her course, while 2% of
the students were not retained in their course.

This graph shows that, of the students enrolled in CS120 during the 1st sem of 2020-2021,
98% have a high chance of being retained and 2% would not be retained in their course.

According to this graph, of the students enrolled in subject CS124 during the 1st sem of
2020-2021, 98% have a high possibility of being retained and 2% would not be retained.

This graph shows that, of the students enrolled in CS126 during the 1st sem of 2020-2021,
98% have the highest probability of being retained in their course and 2% were not retained
this year.

According to this graph, 98% of the students enrolled in subject CS104 during the 1st
sem of 2020-2021 have a high probability of being retained, and 2% of the students were not
retained this semester.

Figure 13. Second semester 2020-2021


According to this graph, 60 students were enrolled in CS105 during the 2nd sem of
2020-2021. 92% of the students have a high possibility of being retained and 2% of the
students were not retained this semester.

This graph shows that 95% of the students enrolled in CS131 were retained; four students
got an INC grade but were still retained in their subject, and 5% of the students were not
retained.

According to this graph, 36 students were enrolled in subject CS135. Of these 36
students, 94% were retained in their subject and 6% were not retained.

According to this graph, 36 students were enrolled in subject CS137. Of these 36
students, 94% were retained in their subject and 6% were not retained.

Figure 14. First semester 2021-2022

This graph shows that 95% of the students enrolled in CS134 were retained; four students
got an INC grade but were still retained in their subject, and 5% of the students were not
retained.

According to this graph, 2% of the students enrolled in CS136 were not retained and 98%
of the students were retained in their course.

This graph shows the students enrolled in CS130 during the 1st semester of 2020-2021.
There are 37 students enrolled in this subject. Of these 37 students, 92% were retained and
8% were not retained in this subject.

This graph shows the students enrolled in CS132 during the 1st semester of 2020-2021.
There are 37 students enrolled in this subject. Of these 37 students, 92% were retained and
8% were not retained in this subject.
Figure 15. Second Semester 2018-2019

According to this graph, students who got INC in 2 subjects tend not to be retained in
the next semester, while students who got INC in 1 subject tend to be retained and can
continue to the next semester.

This graph shows the Computer Science students enrolled in the second semester of school
year 2018-2019. A student with more than 2 dropped subjects tends not to be retained in
the next semester, while a student with 1 dropped subject can still be retained and
continue the semester.
Figure 16. First semester 2019-2020

According to this graph, among the students enrolled in the 1st semester of 2019-2020,
those with 3 dropped subjects have a high possibility of not being retained and cannot
continue to the next semester, while students with 1 or 2 dropped subjects can still be
retained.

According to this graph, students with 1 INC subject can be retained and can still
continue their course, while two students were not retained in the next semester.

This graph shows that in the 1st semester of 2019-2020, Computer Science students with
1 failed subject have a high possibility of being retained and can continue to the next
semester, while some students were not retained this semester.
Figure 17. Second semester 2019-2020

This graph shows that Computer Science students with 1-2 INC subjects can still be
retained in their course and continue to the next semester.

According to this graph, in the second semester of 2019-2020 there were no Computer
Science students who dropped out. Students with 1-2 dropped subjects still have a high
possibility of being retained.

This graph shows that 4 students had 1 failed subject in the second semester of
2019-2020, but they were still retained and continued to the next semester.
Figure 18. First semester 2020-2021

According to this graph, students who got DG grades in 1-2 subjects still have a high
possibility of being retained and can continue to the next semester.

Figure 19. Second semester 2020-2021


This graph shows that the students enrolled in the second semester of 2020-2021 were
still retained and continued to the next semester even though they had 2-3 INC subjects.

This graph shows that only one student dropped, and he/she was retained and continued
his/her subjects in the next semesters.
Figure 20. First Semester 2021-2022

According to this graph, Computer Science students with 2 INC subjects can still be
retained and can continue to the next semester, while students with more than 2 INC
subjects were not retained this semester and did not continue to the next semester.

Figure 21. Enrollment Records of IT Students


The illustration shows the enrollment records from the 1st Sem of School Year 2018-2019 to
the 1st Sem of School Year 2021-2022. Blue represents the number of students enrolled in each
school year, and orange represents those who dropped out, shifted, and the like. (Currently
the specific status of each such student, whether drop-out or shifted, is not known, because
the gathered data had no specific identifier for it.)

Figure 22. Second semester 2018-2019

According to this graph, 94 IT students were enrolled in the 1st semester of 2018-2019,
but in the 2nd semester of 2018-2019 only 73 students were enrolled and 21 stopped. 22% of
the IT students were not retained this semester, while 78% were retained and continued
their subjects.

This graph shows the number of students who stopped and failed in this semester. Some
students who got a failing mark were still retained and some were not; students who got
marks of 6-7 were not retained and did not continue to the next semester.

According to this graph, among the students enrolled in IT137 during the second semester
of 2018-2019, three students were not retained in the next semester, while 78% were
retained and continued to the next semesters.

Figure 23. First semester 2019-2020

According to this graph, 63 students were enrolled in IT114 during the 1st semester of
2019-2020. 67% were retained and can continue to their next semesters, while 33% were not
retained; 3 students got a failing mark and were not retained.

This graph shows the IT students enrolled during the 1st semester of 2019-2020. Two
students dropped this subject and were not retained, while 67% of the students have a
high possibility of being retained and continuing to the next semesters.

According to this graph, 3 students dropped this subject during the 1st semester of
2019-2020 and 1 got a failing mark and was not retained, while 67% were retained; 2 students
got a failing mark but were retained and continued to their next semesters.

This graph shows that during the 1st semester of 2019-2020, 1 student dropped the subject
Math103 and was not retained, while 4 students were retained and continued to the next
semesters.

According to this graph, during the 1st semester of 2019-2020 four students were not
retained, while 67% of the students enrolled in subject IT137 were retained and can continue
to their next semesters.

Figure 24. Second semester 2019-2020


According to this graph, none of the students enrolled in Math104 during the 2nd
semester of 2019-2020 failed to be retained; all of the students enrolled this semester
have a high possibility of being retained and can continue to the next semesters.

This graph shows that there were no IT students who were not retained. All of the
students enrolled in IT123 during the 2nd semester of 2019-2020 were retained.

This graph shows that all students enrolled in CC103 have a high rate of retention and
can continue to take the next subjects in the next semesters. One student got a failing
mark but was still retained.

This graph shows that all of the IT students enrolled in IT123 during the second
semester of 2019-2020 were retained and are continuing to their next semester. One student
got a 4 (INC) grade in this subject but was still retained.

According to this graph, the students enrolled in IT125 during the second semester of
2019-2020 were all retained and can continue to their next semesters. Two students got a
4 (INC) grade in this subject but can still be retained.

This graph shows that during the second semester of 2019-2020 the students enrolled in
IT144 were all retained and can continue to their next semester.

Figure 25. First semester 2020-2021


According to this graph, during the first semester of 2020-2021 the IT students enrolled
in CC104 were all retained. Two students got a 6 (Deferred Grade) in this subject but were
still retained and can continue to the next semester.

This graph shows that during this semester all the students enrolled in this subject
were retained and can continue to the next semester.

This graph shows that all the students enrolled in IT124 during the 1st semester of
2020-2021 have a high possibility of being retained and can proceed to the next semester.

According to this graph, the students enrolled in this subject during the 1st semester
of 2020-2021 were all retained and can proceed to the next semester.
Figure 26. Second semester 2020-2021

This graph shows that all the students enrolled in IT131 during the 2nd semester of
2020-2021 were retained. Some students got a 4 (INC) grade in this subject, but they were
still retained and continued to the next semesters.

According to this graph, during the 2nd semester of 2020-2021 all of the students of
CC105 got a grade of 3 and were still retained in their course.

According to this graph, among the students enrolled in IT137 during the 2nd semester
of 2020-2021, five students were not retained this semester, while three students got a
4 (INC) grade in this subject but were still retained and continued to the next semester.

According to this graph, all of the IT students enrolled in IT133 during the 2nd
semester of 2020-2021 were retained, even though two of the students got a 4 (INC) grade in
this subject.

This graph shows the students enrolled in IT135 during the 2nd semester of 2020-2021.
One student was not retained this semester while the rest were retained, and one student
got a 7 (Dropped) but was still retained this semester.

Figure 27. First semester 2021-2022

This graph shows that all the IT students enrolled in IT132 during the 1st semester of
2021-2022 were retained. Some students got a 4 (INC) grade in this subject, but they were
still retained and continued to the next semesters.

According to this graph, 64% of the students enrolled in IT134 during the 1st semester of
2021-2022 were retained, while 36% were not retained this semester.

This graph shows that 36% of the students enrolled this semester were not retained,
while 64% were retained. Most of the students got a 4 (INC) grade in this subject, but they
were still retained and continued to the next semester.

Figure 28. Second semester 2018-2019

According to this graph, during this semester only one student got a 4 (INC) and was
still retained, while three students were not retained this semester.

According to this graph, during the 2nd semester of 2018-2019 three IT students failed
but were still retained, while four students were not retained and did not continue to the
next semesters.

This graph shows that during the 2nd semester of 2018-2019 two students were retained,
and 22% of the students were not retained and did not continue during this semester.

Figure 29. First semester 2019-2020

According to this graph, during the 1st semester of 2019-2020 four IT students got
4 (INC) grades; three were retained, while one student was not retained this semester.

According to this graph, during the 1st semester of 2019-2020 five students were
retained while four students were not retained.

This graph shows that during the 1st semester of 2019-2020 three students were not
retained this semester, while the students who had no dropped, failed, or INC grades
this semester were retained and continued to the next semesters.

Figure 30. Second semester 2019-2020

This graph shows that during the 2nd semester of 2019-2020 all of the IT students were
retained; no students failed to be retained this semester.

According to this graph, during the 2nd semester of 2019-2020 only one student got a
failing mark, but he/she was still retained this semester and can continue to the next
semester.

Figure 31. First semester 2020-2021


This graph shows that during the 1st semester of 2021-2022, 36% of the IT students
enrolled this semester were not retained, while 64% were retained this semester and
continued to the next semesters.

Figure 32. Second semester 2020-2021

This graph shows that during this semester only one student dropped, but he/she was
still retained during this school year and continued to the next semester.

This graph shows the number of students who got 4 (INC) grades during the 2nd semester
of 2020-2021 but were still retained and continued to the next semesters.
Chapter V

5.1 Conclusion and Recommendation

5.1.1 Conclusion

One of our objectives was to predict the retention of the students of the Institute
of Computer Studies after gathering and analyzing the data, and to implement a system
that provides 75% accuracy. The researchers were able to implement a system that
provides reliable results and that can be used through any internet provider. The
researchers also implemented features through which the user can view the inputted
records in a data table, export the data, and see recommendations or tips about the
result, that is, whether the student has a high possibility of retention or is likely
to fail the subjects of the course in which he/she is enrolled. The system also has a
feature that lets users search for specific data through the search bar. The
researchers conclude that this research was able to provide a solution to the stated
issue.

5.1.2 Recommendation

For future work, the researchers recommend the following:

• Extend the scope so that other colleges or universities can use the system
• Add variables such as the income of the students' parents, sex, grants, and
full-time or part-time worker status
References

4 Simple Steps to do Qualitative Analysis. (2018, August 16). Retrieved from bangthetable.com:
https://www.bangthetable.com/blog/4-simple-ways-to-do-qualitative-analysis/

Allen, E., & Seaman, J. (2008, November). Staying the Course: Online Education in the United
States, 2008. United States of America: Sloan-C. Retrieved from ERIC:
https://eric.ed.gov/?id=ED529698

Bautista, L. M. (2020, March 25). CNN Philippines. Retrieved from CNN Philippines:
https://www.cnnphilippines.com/regional/2020/3/25/Zamboanga-City-reports-first-
COVID-19-case-.html

Boston, W. E., Ice, P., & Gibson, A. M. (2015, January 9). Comprehensive Assessment of Student
Retention in Online Learning Environments. Retrieved from ResearchGate:
https://www.researchgate.net/publication/264838853

Cennimo, D. J. (2021, April 19). Coronavirus Disease 2019 (COVID-19 Q&A). Retrieved from
Medscape: https://www.medscape.com/answers/2500114-197402/how-did-the-
coronavirus-outbreak-start

Dietz, B., Han, A., & Fisher, A. (2019, June 13). Designing Online Courses to Promote Student
Retention. Retrieved from ResearchGate:
https://www.researchgate.net/publication/249233937

Glen, S. (n.d.). Correlation Coefficient: Simple Definition, Formula, Easy Steps. Retrieved from
statisticshowto: https://www.statisticshowto.com/probability-and-statistics/correlation-
coefficient-formula/

Herbert, M. (2006). Staying the Course: A Study in Online Student Satisfaction and Retention.
Retrieved from CiteSeer:
https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.566.1253&rank=1&q=Staying%20the%20Course:%20A%20Study%20in%20Online%20Student%20Satisfaction%20and%20Retention&osm=&ossid=

Kandola, A. (2020, June 30). Coronavirus cause: Origin and how it spreads. Retrieved from
Medical News Today: https://www.medicalnewstoday.com/articles/coronavirus-causes

KHUSHIS. (2021, April 19). Exploratory Analysis Using Univariate, Bivariate, and Multivariate
Analysis Techniques. Retrieved from analyticsvidhya:
https://cdn.analyticsvidhya.com/wp-content/uploads/2021/04/image10.png

KHUSHIS. (2021, April 19). Exploratory Analysis Using Univariate, Bivariate, and Multivariate
Analysis Techniques. Retrieved from analyticsvidhya:
https://cdn.analyticsvidhya.com/wp-content/uploads/2021/04/image11.png

Laato, S., Lipponen, E., Vilppu, H., Murtonen, M., & Lehtinen, E. (2020, November 11).
Retention of University Teachers and Doctoral Students in UNIPS Pedagogical. Retrieved
from ResearchGate: https://www.researchgate.net/publication/345718962

Murphy, L., Eduljee, N. B., & Croteau, K. (2020, July 17). College Student Transition to
Synchronous Virtual Classes during. Retrieved from pedagogicalresearch:
https://www.pedagogicalresearch.com/article/college-student-transition-to-synchronous-virtual-classes-during-the-covid-19-pandemic-in-8485

Oussena, S., Kim, H., & Clark, T. (2014, May 22). Use Data Mining to Improve Student Retention
in Higher Education - A Case. Retrieved from ResearchGate:
https://www.researchgate.net/publication/220708516_Use_Data_Mining_to_Improve_Student_Retention_in_Higher_Education_-_A_Case_Study

PMC. (2020, April 14). Retrieved from NCBI:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7154063/#:~:text=On%20January%2030%2C%20SARS%2DCoV,fever%2C%20cough%2C%20and%20chills.

Poll, K., Widen, J., & Weller, S. (2014, May 1). Six Instructional Best Practices for Online
Engagement and Retention. Retrieved from Loyola eCommons:
https://ecommons.luc.edu/english_facpubs/30/

Scott, J., Swan, K., & Daston, C. (2016, June). Retention, Progression and the Taking of Online
Courses. Retrieved from ResearchGate:
https://www.researchgate.net/publication/305266527_Retention_Progression_and_the_Taking_of_Online_Courses
Appendix
Appendix A
