BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE WITH SPECIALIZATION IN
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
Submitted by:
21BCS10293 Dhruv Aggarwal
September, 2022
Abstract
High dropout rates in educational institutions remain a major concern in the modern
educational landscape. The goal of this project is to use machine learning to develop a
robust, practical solution for improving learning processes and reducing dropout rates.
This research aims to identify students at risk of dropping out through a methodology
that covers data collection from multiple sources, data preprocessing, and the
construction of machine learning models. The models will help educators and
administrators address the issue proactively, not only by forecasting dropout risk but
also by providing actionable insights and suggested interventions.
Plans for this project include personalizing learning pathways, integrating with learning
management systems, and addressing the difficulties of online and distance learning as
education evolves. The project also emphasizes privacy, fairness, and transparency, and
acknowledges the ethical issues of using AI in education.
Table of Contents
Title Page i
Abstract ii
1. Introduction
1.1 Problem Definition
1.2 Project Overview
1.3 Hardware Specification
1.4 Software Specification
2. Literature Survey
2.1 Existing System
2.2 Proposed System
2.3 Literature Review Summary
3. Problem Formulation
4. Research Objective
5. Methodologies
6. Conclusion
7. Tentative Chapter Plan for the proposed work
8. Reference
1. INTRODUCTION
1.1 Problem Definition
High dropout rates remain a chronic problem across educational institutions, from elementary and
secondary schools to higher education.
Dropping out disrupts a student's education and frequently results in inadequate qualifications,
restricting future employment options. Because dropouts often have lower earning potential and a
higher chance of unemployment, the issue contributes to cycles of poverty and social injustice.
Contributing factors include educational disruption, academic and personal difficulties, and a lack of
early intervention, all of which need to be addressed.
2. LITERATURE SURVEY
Zheng et al. (2020) propose FWTS-CNN, a convolutional neural network model that integrates
feature weighting and behavioral time series. It extracts continuous behavioral features from
the learner's log of learning activities, filters key features and ranks them by importance
using a decision tree, weights the continuous behavioral features by that importance, and
finally builds a convolutional neural network model on the behavioral time series and the
weighted features.
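The feature-weighting step described above can be illustrated with a minimal sketch. It assumes scikit-learn and synthetic data; the feature matrix, the tree depth, and the `weight_features` helper are hypothetical and are not taken from the FWTS-CNN paper, and the convolutional network itself is omitted.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def weight_features(X, y, random_state=0):
    """Rank features by decision-tree importance and scale each column by it.

    Hypothetical helper: illustrates only the weighting idea, not the CNN.
    """
    tree = DecisionTreeClassifier(max_depth=4, random_state=random_state)
    tree.fit(X, y)
    importances = tree.feature_importances_      # non-negative importances
    ranking = np.argsort(importances)[::-1]      # most important feature first
    X_weighted = X * importances                 # broadcast across columns
    return X_weighted, importances, ranking


# Synthetic stand-in for per-learner behavioral features (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 2] > 0).astype(int)                    # only feature 2 drives the label

X_weighted, importances, ranking = weight_features(X, y)
```

In this toy setup the label depends on a single column, so that column should rank first; with real activity logs the importances would be spread across several behaviors.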
Neural Networks:
In the stacking ensemble proposed by Niyogisubizo et al. (2022), raw data with messy and
irregular features is processed by multiple classification models, which extract valid
features. Stacking's learning ability stems primarily from this feature representation,
which mirrors the structure of a neural network (NN): the first layer in stacking is
analogous to the first N-1 layers of an NN, while the second layer in stacking is analogous
to the final output layer of an NN.
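A two-layer stacking ensemble of this kind might be sketched as follows. This is an illustration under stated assumptions, not the cited authors' configuration: it uses scikit-learn's `StackingClassifier`, synthetic data from `make_classification`, two tree-based base learners, and a logistic regression meta-learner chosen here for simplicity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for student records (assumption, not project data).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# First layer: base learners that turn raw features into predictions.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# Second layer: a meta-learner trained on the base learners' out-of-fold outputs.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
```

The `cv=5` setting means the meta-learner only sees predictions made on folds the base learners were not trained on, which limits leakage between the two layers.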
This project proposes a hybrid model that will predict dropout risk, recommend interventions, and
document each student's progress. It will provide a user-friendly interface so that educators and
students can access it with ease.
2.3 Literature Review Summary
Entry 1
Year and Citation: Rodríguez, P., Villanueva, A., Dombrovskaia, L. and Valenzuela, J.P., 2023. A
methodology to design, develop, and evaluate machine learning models for predicting dropout in
school systems: the case of Chile. Education and Information Technologies, pp.1-47.
Article/Author: Patricio Rodríguez, Alexis Villanueva, Lioubov Dombrovskaia & Juan Pablo Valenzuela
Tools/Software: Python
Technique: SHAP (SHapley Additive exPlanations)
Source: Google Scholar
Entry 2
Year and Citation: Niyogisubizo, J., Liao, L., Nziyumva, E., Murwanashyaka, E. and Nshimyumukiza,
P.C., 2022. Predicting student's dropout in university classes using two-layer ensemble machine
learning approach: A novel stacked generalization. Computers and Education: Artificial
Intelligence, 3, p.100066.
Technique: ... Boosting (GB), and Feed-forward Neural Networks (FNN)
3. PROBLEM FORMULATION
The problem is to provide a machine learning solution that uses data-driven insights to address
high student dropout rates in educational institutions. This involves gathering data, preparing it,
developing and evaluating models, deploying and interpreting them, and making continual
improvements while upholding ethical standards and keeping records for transparency.
Data Collection:
Collect comprehensive data from various sources, including student demographics, academic
performance records, socioeconomic indicators, behavioral data, historical dropout data, and
instructor evaluations.
Data Preprocessing:
Process and prepare the collected data by addressing missing values, outliers, and encoding
categorical variables. This step ensures that the data is suitable for machine learning analysis.
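The preprocessing step could look like the sketch below. It assumes pandas and scikit-learn; the column names and the four example rows are invented for illustration, and outlier handling is left out to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical student records; column names are illustrative only.
students = pd.DataFrame(
    {
        "gpa": [3.1, np.nan, 2.4, 3.8],
        "attendance_rate": [0.92, 0.75, np.nan, 0.98],
        "residence": ["urban", "rural", np.nan, "urban"],
    }
)

numeric_cols = ["gpa", "attendance_rate"]
categorical_cols = ["residence"]

numeric_steps = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),   # fill missing numbers
        ("scale", StandardScaler()),                    # zero mean, unit variance
    ]
)
categorical_steps = Pipeline(
    [
        ("impute", SimpleImputer(strategy="most_frequent")),  # fill missing labels
        ("encode", OneHotEncoder(handle_unknown="ignore")),   # one column per category
    ]
)

preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ]
)

features = preprocess.fit_transform(students)
```

Wrapping the steps in a single transformer means the same imputation values and category encoding learned on training data are reused when new student records are scored.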
Model Development:
Develop machine learning models that can perform the following tasks:
Classification:
Predict which students are at risk of dropping out based on available data.
Regression:
Identify factors and patterns contributing to dropout rates.
Time-Series Analysis:
Analyze temporal trends and patterns related to dropout.
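As a rough illustration of the classification and factor-identification tasks above, a single logistic regression can serve both purposes. This is a sketch assuming scikit-learn; the two features, the synthetic data-generating rule, and its coefficients are made up for the example, and time-series analysis is not covered.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumption): two illustrative features per student.
rng = np.random.default_rng(42)
attendance = rng.uniform(0.4, 1.0, size=300)
gpa = rng.uniform(1.0, 4.0, size=300)
X = np.column_stack([attendance, gpa])

# Made-up ground truth: low attendance and low GPA raise the odds of dropout.
logit = 6.0 - 5.0 * attendance - 1.0 * gpa
y = (rng.uniform(size=300) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Classification view: estimated probability that each student drops out.
risk = model.predict_proba(X)[:, 1]

# Factor view: the sign of each coefficient shows the direction of its effect.
coefficients = dict(zip(["attendance", "gpa"], model.coef_[0]))
```

A negative coefficient here indicates that higher values of the feature are associated with lower dropout risk in the fitted model.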
Model Evaluation:
Evaluate the performance of the machine learning models using appropriate evaluation metrics, such
as accuracy, precision, recall, F1-score, and ROC AUC. Employ cross-validation techniques to
ensure model robustness.
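The threshold-based metrics named above can be worked through by hand on a small example. The sketch below uses plain Python with made-up labels for ten students; ROC AUC and cross-validation are omitted because they require predicted scores and repeated train/test splits.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = dropout)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # correctly flagged dropouts
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # retained students flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # dropouts that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # correctly cleared students

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels (assumption): 1 = dropped out, 0 = retained.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
```

With three true positives, one false positive, one false negative, and five true negatives, accuracy is 0.8 while precision, recall, and F1 are each 0.75. For dropout prediction, recall is often the metric to watch, since a missed at-risk student is usually costlier than an unnecessary check-in.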
Deployment:
Implement the machine learning model into an accessible platform or dashboard that allows
educators and administrators to input student data and receive real-time predictions and
recommendations.
Interpretability:
Apply interpretability techniques to make the model's predictions and recommendations
understandable to end-users, fostering trust and usability.
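One simple interpretability technique is permutation importance, sketched below with scikit-learn. It is used here as a lightweight stand-in and is not SHAP; the feature names and the synthetic labeling rule are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

feature_names = ["attendance", "gpa", "noise"]   # illustrative names (assumption)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = (X[:, 0] + 0.5 * X[:, 1] < 0).astype(int)    # "noise" has no effect by design

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Shuffle one column at a time and measure how much the model's score drops.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Sort features from most to least influential for display to end-users.
ranked = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
```

A ranked list like this can be shown next to each prediction so that educators see which inputs the model relies on most.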
Continuous Improvement:
Create provisions for continuous improvement, allowing the model to adapt and evolve as new data
becomes available. Regular updates and retraining should be scheduled to maintain model
effectiveness.
Documentation:
Provide comprehensive documentation for all project phases, including data collection,
preprocessing, model development, and deployment, to enable transparency and knowledge transfer.
4. OBJECTIVES
The project's main goal is to apply machine learning techniques to the problem of high
dropout rates in educational institutions. It seeks to forecast dropout risk and extract
useful information from student data. That information will support decisions that
enhance the learning process and enable real-time intervention, delivered through a
user-friendly interface.
5. METHODOLOGY
Data Collection:
Collect comprehensive data from various sources, including student demographics, academic performance
records, socioeconomic indicators, behavioral data, historical dropout data, and instructor evaluations.
Data Preprocessing:
Process and prepare the collected data by addressing missing values, outliers, and encoding categorical
variables. This step ensures that the data is suitable for machine learning analysis.
Model Development:
Develop machine learning models that can perform the following tasks:
Classification:
Predict which students are at risk of dropping out based on available data.
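Regression:
Identify factors and patterns contributing to dropout rates.
Time-Series Analysis:
Analyze temporal trends and patterns related to dropout.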
Model Evaluation:
Evaluate the performance of the machine learning models using appropriate evaluation metrics, such as
accuracy, precision, recall, F1-score, and ROC AUC. Employ cross-validation techniques to ensure model
robustness.
Deployment:
Implement the machine learning model into an accessible platform or dashboard that allows educators and
administrators to input student data and receive real-time predictions and recommendations.
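The step from a model's prediction to a recommendation shown on a dashboard might be sketched as follows. The risk tiers, cut-off values, and suggested actions are hypothetical placeholders; the model that produces the probability is assumed to exist elsewhere.

```python
from dataclasses import dataclass


@dataclass
class RiskReport:
    student_id: str
    dropout_probability: float
    tier: str
    recommendation: str


# Hypothetical cut-offs and messages; a real deployment would tune these with educators.
TIERS = [
    (0.7, "high", "Schedule a one-to-one advising session this week."),
    (0.4, "medium", "Notify the course instructor and monitor attendance."),
    (0.0, "low", "No action needed; continue routine monitoring."),
]


def build_report(student_id: str, dropout_probability: float) -> RiskReport:
    """Map a predicted dropout probability to a risk tier and a suggested action."""
    if not 0.0 <= dropout_probability <= 1.0:
        raise ValueError("dropout_probability must be between 0 and 1")
    for threshold, tier, recommendation in TIERS:
        if dropout_probability >= threshold:
            return RiskReport(student_id, dropout_probability, tier, recommendation)
    raise AssertionError("unreachable: the last tier has threshold 0.0")


report = build_report("S-001", 0.82)
```

Keeping the tiers in a single table makes it straightforward for administrators to review and adjust the thresholds without touching the prediction code.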
Interpretability:
Apply interpretability techniques to make the model's predictions and recommendations understandable to
end-users, fostering trust and usability.
User Interface:
Develop a user-friendly interface that lets educators and administrators interact with the model easily
and access its predictions and insights with minimal effort.
Continuous Improvement:
Create provisions for continuous improvement, allowing the model to adapt and evolve as new data
becomes available. Regular updates and retraining should be scheduled to maintain model effectiveness.
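A retraining schedule can be expressed as a small set of triggers. The sketch below is one possible policy, not a prescribed one: the `should_retrain` helper and all of its thresholds (model age, volume of new records, minimum recall) are assumptions chosen for illustration.

```python
from datetime import date, timedelta


def should_retrain(
    last_trained: date,
    today: date,
    new_records: int,
    recent_recall: float,
    max_age_days: int = 180,
    min_new_records: int = 500,
    min_recall: float = 0.70,
) -> bool:
    """Return True when any hypothetical retraining trigger fires.

    Triggers (all thresholds are illustrative assumptions):
    - the model is older than `max_age_days`;
    - at least `min_new_records` labeled records have arrived since training;
    - recall on recent labeled data has fallen below `min_recall`.
    """
    too_old = (today - last_trained) > timedelta(days=max_age_days)
    enough_new_data = new_records >= min_new_records
    degraded = recent_recall < min_recall
    return too_old or enough_new_data or degraded


decision = should_retrain(
    last_trained=date(2024, 1, 10),
    today=date(2024, 3, 1),
    new_records=120,
    recent_recall=0.64,
)
```

In the example call the model is recent and little new data has arrived, but recall on recent data is below the assumed floor, so retraining is triggered.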
Documentation:
Provide comprehensive documentation for all project phases, including data collection, preprocessing,
model development, and deployment, to enable transparency and knowledge transfer.
6. CONCLUSION
Finally, the project is an important step in tackling the pressing problem of high student dropout
rates in educational institutions. This study has successfully created prediction models that can
recognise at-risk pupils and offer useful information for educators and administrators by utilising
machine learning approaches. The initiative helps educational institutions to make data-driven
decisions and also aspires to create a more encouraging and inclusive learning environment through
a user-friendly interface, real-time interventions, and a dedication to ethical issues. This initiative
aims to improve the educational experience for students from all backgrounds and institutions while
reducing dropout rates, with opportunities for ongoing improvement and a commitment to
transparency.
7. TENTATIVE CHAPTER PLAN FOR THE PROPOSED WORK
CHAPTER 1: INTRODUCTION
The use of technology in classrooms has grown during the past several years in the context of
education. Although there have been improvements, dropout rates in educational institutions
continue to be a major concern. This initiative seeks to use machine learning (ML) methods to
enhance the educational experience and, ultimately, lower dropout rates.
CHAPTER 2: LITERATURE SURVEY
Considerable research has addressed dropout prediction and informs this work, including the
FWTS-CNN neural network model and a hybrid ensemble built on Random Forest (RF), among others.
CHAPTER 3: OBJECTIVE
CHAPTER 4: METHODOLOGIES
Data collection
Data preprocessing
Model development
Model Evaluation
Deployment
Interpretability
User-Interface
Documentation
CHAPTER 5: CONCLUSION
Finally, the project is an important step in tackling the pressing problem of high student dropout
rates in educational institutions. This study has successfully created prediction models that can
recognise at-risk pupils and offer useful information for educators and administrators by utilising
machine learning approaches. The initiative helps educational institutions to make data-driven
decisions and also aspires to create a more encouraging and inclusive learning environment
through a user-friendly interface, real-time interventions, and a dedication to ethical issues. This
initiative aims to improve the educational experience for students from all backgrounds and
institutions while reducing dropout rates, with opportunities for ongoing improvement and a
commitment to transparency.
REFERENCES
1. Rodríguez, P., Villanueva, A., Dombrovskaia, L. and Valenzuela, J.P., 2023. A methodology to
design, develop, and evaluate machine learning models for predicting dropout in school systems:
the case of Chile. Education and Information Technologies, pp.1-47.
2. Zheng, Y., Gao, Z., Wang, Y. and Fu, Q., 2020. MOOC dropout prediction using FWTS-CNN
model based on fused feature weighting and time series. IEEE Access, 8, pp.225324-225335.
3. Niyogisubizo, J., Liao, L., Nziyumva, E., Murwanashyaka, E. and Nshimyumukiza, P.C., 2022.
Predicting student's dropout in university classes using two-layer ensemble machine learning
approach: A novel stacked generalization. Computers and Education: Artificial Intelligence, 3,
p.100066.
4. Tan, M. and Shao, P., 2015. Prediction of student dropout in e-Learning program through the use of
machine learning method. International Journal of Emerging Technologies in Learning, 10(1).
5. Tamada, M.M., de Magalhães Netto, J.F. and de Lima, D.P.R., 2019, October. Predicting and
reducing dropout in virtual learning using machine learning techniques: A systematic review. In
2019 IEEE Frontiers in Education Conference (FIE) (pp. 1-9). IEEE.