
VISVESVARAYA TECHNOLOGICAL UNIVERSITY Jnana Sangama, Belagavi-590018
A Project Work Phase-1 Intermediate Report on

“ALZHEIMERS DISEASE DIAGNOSIS USING DEEP LEARNING APPROACH AND COGNITIVE TESTING”
Submitted in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF ENGINEERING
In
Department of Artificial Intelligence and Machine Learning
Submitted By
Aravind Suresh 1BY20AI009
M S Kaushik 1BY20AI024
Sandeep Arockia Samraj X 1BY20AI044
Raj Powell 1BY21AI402
Under The Guidance Of

DR. CHANDRASHEKHAR B N
Assistant Professor
Dept. of AI&ML
BMS INSTITUTE OF TECHNOLOGY AND MANAGEMENT
(Autonomous Institute Affiliated to VTU)
Avalahalli, Yelahanka, Bengaluru – 560064.
2023-2024 (Odd Sem)
CERTIFICATE
This is to certify that the Project Work Phase-1 entitled “ALZHEIMERS DISEASE DIAGNOSIS USING DEEP LEARNING APPROACH AND COGNITIVE TESTING” has been carried out by Aravind Suresh (1BY20AI009), M S Kaushik (1BY20AI024), Sandeep Arockia Samraj X (1BY20AI044), and Raj Powell (1BY21AI402), bona fide students of BMS Institute of Technology and Management, Autonomous Institute Affiliated to VTU, in partial fulfillment of the VII semester B.E. project work in the Department of Artificial Intelligence and Machine Learning, Visvesvaraya Technological University, Belagavi, during the academic year 2023-2024 (odd semester). It is certified that all corrections/suggestions indicated for assessment have been incorporated in the report deposited in the department library. The project report has been approved as it satisfies the academic requirements in respect of the Project Work Phase-1 (18AIP77) prescribed for the said degree.
---------------------
Dr. Chandrashekhar B N, Assistant Professor, AI&ML, BMSIT&M
---------------------
SPARC Members, AI&ML, BMSIT&M
---------------------
Dr. Anupama H S, Prof. & HOD, AI&ML, BMSIT&M
Name and USN Signature
Aravind Suresh (1BY20AI009) ………………….
M S Kaushik (1BY20AI024) ………………….
Sandeep Arockia Samraj X (1BY20AI044) ………………….
Raj Powell (1BY21AI402) ………………….
DECLARATION
We, Aravind Suresh [1BY20AI009], M S Kaushik [1BY20AI024], Sandeep Arockia Samraj X [1BY20AI044], and Raj Powell [1BY21AI402], students of VII Semester BE in the Department of Artificial Intelligence and Machine Learning, BMS Institute of Technology and Management, Autonomous Institute Affiliated to VTU, hereby declare that the Project Work Phase-1 (18AIP77) entitled “ALZHEIMERS DISEASE DIAGNOSIS USING DEEP LEARNING APPROACH AND COGNITIVE TESTING” has been carried out by us and submitted in partial fulfillment of the requirements for the Bachelor of Engineering in the Department of Artificial Intelligence and Machine Learning during the academic year 2023-24.
Date:
Place: Yelahanka, Bangalore

Name and USN Signature
Aravind Suresh (1BY20AI009) ………………….
M S Kaushik (1BY20AI024) ………………….
Sandeep Arockia Samraj X (1BY20AI044) ………………….
Raj Powell (1BY21AI402) ………………….
BMS INSTITUTE OF TECHNOLOGY, BANGALORE-560064
Students Project Review and Assessment Committee
Intermediate Report-Phase I
BMSIT&M, AIML 2023

Batch No: Guide Name: Submission Date:

5 Dr. Chandrashekhar B N
Alzheimers Disease Diagnosis Using Deep Learning Approach and Cognitive Testing
Sl No USN Name
1 1BY20AI009 Aravind Suresh
2 1BY20AI024 M S Kaushik
3 1BY20AI044 Sandeep Arockia Samraj X
4 1BY21AI402 Raj Powell
Project Execution Place BMS Institute of Technology and Management, Yelahanka, Bengaluru
Project Category Application Project
Signature of HoD Signature of the Guide SPARC
BMS INSTITUTE OF TECHNOLOGY & MANAGEMENT
Yelahanka, Bangalore – 560 064
Department of Artificial Intelligence and Machine Learning
Synopsis for the Project work
“ALZHEIMERS DISEASE DIAGNOSIS USING DEEP LEARNING APPROACH AND COGNITIVE TESTING”
Submitted By:
Aravind Suresh, 1BY20AI009
M S Kaushik, 1BY20AI024
Sandeep A X, 1BY20AI044
Raj Powell, 1BY21AI402
Under the Guidance of
Dr. Chandrashekhar B N
2023-2024
ACKNOWLEDGEMENT
We are happy to present this Project Work Phase-1 (18AIP77) report after completing it successfully. This final year project would not have been possible without the guidance, assistance and suggestions of many individuals. We would like to express our deep sense of gratitude and indebtedness to each and every one who has helped us make this a success.
We heartily thank our Principal, Dr. MOHAN BABU. G.N, BMS Institute of Technology & Management, Autonomous Institute Affiliated to VTU, for his constant encouragement and inspiration in taking up this Project Work Phase-1.
We heartily thank our Professor & Head of the Department, Dr. Anupama H S, Department of Artificial Intelligence and Machine Learning, BMS Institute of Technology & Management, Autonomous Institute Affiliated to VTU, for her constant encouragement and inspiration in taking up this Project Work Phase-1.
We gratefully thank our Project Work Phase-1 Guide, Dr. Chandrashekhar B N, Assistant Professor, Dept. of AI&ML, for his guidance, support and advice.
We would also like to thank our Student Project Assessment and Review Committee (SPARC) Members, Dr. Bharathi Malakreddy A, Dr. Niranjanamurthy M and Dr. Chandrashekhar B N, for the help and support they provided in carrying out the project and completing it successfully.
Special thanks to all the staff members of the Artificial Intelligence and Machine Learning department for their help and kind cooperation.
Lastly, we thank our parents and friends for the support and encouragement given to us in completing this Project Work Phase-1 successfully.
ABSTRACT
This project aims to develop a predictive model for the early detection of Alzheimer's disease by combining machine learning techniques with cognitive testing. Alzheimer's disease is a complex neurodegenerative disorder, and early diagnosis is crucial for timely intervention and improved patient outcomes. Using a dataset of cognitive test results and neuroimaging data, our approach combines the strengths of traditional cognitive assessments with machine learning algorithms.
The project begins with the collection and preprocessing of a dataset that includes cognitive test scores and relevant demographic information. Neuroimaging data, such as MRI scans, further enriches the dataset by providing insights into brain structure and function. To enhance the predictive power of the model, Convolutional Neural Networks (CNNs) and ensemble methods are employed. CNNs are suited to processing spatial information in neuroimaging data, allowing the extraction of patterns associated with Alzheimer's disease.
The integration of cognitive testing, a widely used diagnostic tool, ensures a holistic approach to Alzheimer's prediction. Machine learning models are trained on this dataset to recognize subtle cognitive patterns and neuroimaging biomarkers indicative of early-stage Alzheimer's. The proposed framework aims not only to improve accuracy but also to provide interpretable insights into the cognitive features contributing to the model's predictions.
TABLE OF CONTENTS
CHAPTER NO. TITLE
ABSTRACT
1 INTRODUCTION
1.1 Overview
1.2 Purpose of the problem
1.3 Scope of the project
2 PROBLEM DEFINITION
2.1 Problem statement
3 LITERATURE SURVEY
3.1 "Heart Disease Prediction using Random Forest Classifier"
3.2 "Heart Disease Prediction using Machine Learning"
3.3 "Heart Disease Prediction using Machine Learning Algorithms"
3.4 "Heart Disease Prediction System"
3.5 "Cardiovascular Disease Detection using Ensemble Learning"
3.6 "Significance of Visible Non-Invasive Risk Attributes for the Initial Prediction of Heart Disease Using Different Machine Learning Techniques"
3.7 "Machine Learning Prediction in Cardiovascular Diseases: A Meta-Analysis"
3.8 "An Artificial Intelligence Model for Heart Disease Detection using Machine Learning Algorithms"
3.9 "Using Machine Learning for Predicting CVDs"
3.10 "Early Heart Disease Detection Using Machine Learning"
3.11 "Heart disease prediction using supervised machine learning algorithms: Performance analysis and comparison"
3.12 "Effectively Predicting the Presence of Coronary Heart Disease Using Machine Learning Classifiers"
3.13 "Cardiac disease prediction using AI algorithms with SelectKBest"
3.14 "Study of cardiovascular disease prediction model based on random forest in eastern China"
4 REFERENCES
LIST OF FIGURES
FIGURE NAME
Machine Learning
Healthy vs. Diseased Heart
CHAPTER 1 INTRODUCTION
1.1 Overview
Alzheimer's disease, a prevalent neurodegenerative disorder, poses significant challenges for early diagnosis and intervention. This project aims to develop a predictive model using machine learning (ML) and cognitive testing to detect Alzheimer's at an early stage. By integrating cognitive assessments and advanced ML algorithms, the project seeks to improve accuracy and provide interpretable insights into the disease.
1.2 Purpose of the problem
The primary goal is to enhance Alzheimer's detection through a comprehensive approach that combines traditional cognitive assessments with state-of-the-art ML techniques. Early
diagnosis enables timely interventions, potentially improving patient outcomes and facilitating personalized treatment plans.
1.3 Scope of the project
The project scope encompasses the development of a predictive model leveraging a diverse dataset of cognitive test results and neuroimaging data. The integration of Convolutional
Neural Networks (CNNs) and ensemble methods enhances the model's capability to recognize subtle patterns associated with Alzheimer's.
1.4 Definitions
Alzheimer's Disease: A progressive neurodegenerative disorder affecting memory, thinking, and behavior.
Cognitive Testing: Assessments evaluating cognitive abilities, often used in Alzheimer's diagnosis.
Convolutional Neural Networks (CNNs): Deep learning models designed for spatial data, suitable for neuroimaging analysis.
Ensemble Methods: Techniques combining multiple models to improve overall performance.
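To make the CNN definition above concrete, the following minimal Python sketch shows the sliding-window operation at the core of a convolutional layer. The 4x4 grid and the edge-detecting kernel are hypothetical illustrations, not data or filters from this project; a real CNN would learn its kernels from training scans.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning frameworks, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 "scan" whose right half is bright.
toy_scan = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(toy_scan, edge_kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the intensity changes, which is the sense in which a convolutional filter picks up a spatial pattern.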
CHAPTER 2 PROBLEM DEFINITION

2.1 Problem statement
The problem of Alzheimer's diagnosis has been extensively studied, with cognitive testing being the conventional approach. However, the limitations of traditional methods necessitate the integration of ML for enhanced accuracy.
The project addresses the need for accurate and early prediction of Alzheimer's disease and cognitive decline, using deep learning algorithms and cognitive testing data to enhance diagnostic precision and facilitate timely interventions for improved patient care.
2.2 Objectives of the proposed project
CHAPTER 3 LITERATURE SURVEY
3.1 Heart Disease Prediction using Random Forest Classifier [April 2020] [1]
Summary: The paper titled "Heart Disease Prediction using Random Forest Classifier" (April 2020) addresses the task of predicting the likelihood of heart disease. The authors use the UCI machine learning dataset, comprising 14 attributes such as age, sex, and RestingBP. After preprocessing, the data is fed into a Random Forest Classifier, a robust machine learning algorithm widely used for classification tasks.
The primary objective is to classify individuals as high or low risk for heart disease based on the provided attributes. To evaluate the model's performance, the authors use several metrics, including precision, recall, F1 score, and the confusion matrix. Together, these metrics assess the model's accuracy and its ability to correctly identify instances of heart disease.
The Random Forest Classifier, implemented in Python, achieves an accuracy of 90.16%, which suggests that the chosen methodology is effective at predicting heart disease risk from the selected attributes.
Despite the overall success of the model, the paper acknowledges potential challenges related to the dataset. Issues such as missing values or outliers might reduce the model's accuracy and its ability to generalize to unseen data. Acknowledging these challenges is important for future improvements to the predictive model.
Technology/Methodology used: Random Forest Classifier implemented using Python
Issues faced: One potential issue might be related to the dataset itself, such as missing values or outliers that could impact the accuracy and generalizability of the model.
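The evaluation metrics named in this summary all derive from the four cells of a binary confusion matrix. The sketch below, in plain Python, shows that relationship; the label lists are made-up examples and are not results from the surveyed paper.

```python
def binary_metrics(y_true, y_pred):
    """Compute the confusion matrix and derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: actual, cols: predicted
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = heart disease, 0 = no heart disease.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone hides the split between missed cases and false alarms, which is why the confusion matrix is reported alongside it.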
3.2 Heart disease prediction using Machine learning [May 2021] [2]
Summary: The paper titled "Heart Disease Prediction using Machine Learning" (May 2021) delves into a comprehensive analysis of the performance of various
machine learning models for predicting heart disease. The research is centered on identifying the most effective algorithm for a dataset encompassing crucial heart-
related attributes, including blood pressure and the number of heartbeats per minute. The UCI ML Cleveland dataset, featuring 14 such attributes, serves as the
foundation for their investigation.
The study evaluates the efficacy of different machine learning models, specifically the Random Forest Classifier, Support Vector Machine (SVM), and Artificial Neural Network (ANN). Each model is trained and tested on the same dataset to gauge its accuracy in predicting heart disease risk. The Random Forest Classifier emerges as the most accurate among the tested models, which is attributed to its resistance to overfitting.
The technology and methodology employed involve the implementation of Random Forest Classifier, Support Vector Machine, and Artificial Neural Network using
Python. Python's versatility in machine learning applications makes it a suitable choice for implementing and assessing the performance of these models.
However, the paper acknowledges potential challenges in the form of overfitting, particularly emphasizing the need for careful consideration of model parameters to
prevent overfitting issues that could compromise accuracy and generalizability. Overfitting occurs when a model learns the training data too well, capturing noise and
hindering its ability to perform well on new, unseen data.
Technology/Methodology used: Random Forest classifier, Support Vector Machine, Artificial Neural Network implemented using Python.
Issues faced: Overfitting might be an issue with certain models or parameters within the algorithms chosen, affecting the accuracy and generalizability of the models.
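Overfitting as described above can be seen in a deliberately small example. The sketch below uses a 1-nearest-neighbour rule, chosen only because it memorises its training set completely; the one-dimensional data and the two noisy labels are hypothetical and unrelated to the surveyed paper.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x (1-nearest neighbour)."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def accuracy(train, samples):
    """Fraction of (x, label) samples that the 1-NN rule labels correctly."""
    correct = sum(1 for x, label in samples if predict_1nn(train, x) == label)
    return correct / len(samples)


# Hypothetical 1-D data: the true rule is "label 1 when x > 5".
# The training labels at x=3.0 and x=8.0 are deliberately noisy.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 0), (9.0, 1)]
held_out = [(1.5, 0), (2.9, 0), (3.2, 0), (6.5, 1), (7.9, 1), (8.1, 1)]

train_acc = accuracy(train, train)     # every training point is its own neighbour
test_acc = accuracy(train, held_out)   # memorised noise misleads nearby new points
print(f"train accuracy={train_acc:.2f}, held-out accuracy={test_acc:.2f}")
```

The gap between the two numbers is the practical symptom of overfitting, and it is the reason for evaluating on data the model has not seen.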
3.3 Heart disease prediction using machine learning algorithms [October 2020] [3]
Summary: The paper titled "Heart Disease Prediction using Machine Learning Algorithms" (October 2020) responds to the escalating concern surrounding the
prevalence of heart disease by introducing a predictive system. The study employs machine learning algorithms, specifically logistic regression and K-Nearest Neighbors
(KNN), to evaluate a patient's susceptibility to heart disease based on their medical history. This approach represents a notable advancement over previous classifiers,
such as naive Bayes, demonstrating enhanced prediction accuracy.
The heart of the research lies in the implementation of logistic regression and KNN using Python. Python's suitability for machine learning applications ensures efficient
execution and facilitates the assessment of these algorithms in the context of heart disease prediction. Logistic regression is known for its simplicity and interpretability,
while KNN relies on the proximity of data points to make predictions.
Remarkably, the findings indicate that KNN outperforms other algorithms, achieving an accuracy of 88.52%. This accuracy underscores the efficacy of KNN in
discerning patterns within the medical history data, emphasizing its potential as a valuable tool in predicting heart disease risk.
However, the paper acknowledges potential challenges stemming from the choice of hyperparameters in both KNN and logistic regression. Issues such as suboptimal
model performance or underfitting may arise if the hyperparameters are not appropriately tuned. Addressing these challenges becomes crucial for ensuring the reliability
and generalizability of the predictive model to unseen data.
Technology/Methodology used: Logistic regression and KNN implemented using Python
Issues faced: Potential issues could arise from the choice of hyperparameters in KNN or logistic regression leading to underfitting or suboptimal model performance.
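The effect of the KNN hyperparameter k can be illustrated with a short sketch. The two-feature points below are invented, pre-scaled values (not patient data from the paper), and `knn_predict` is a hypothetical helper written for this illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair each Euclidean distance with its label, then sort nearest first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalised features (for example scaled age and blood pressure).
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
           (0.80, 0.90), (0.90, 0.80), (0.50, 0.50)]
train_y = [0, 0, 0, 1, 1, 1]  # 1 = at risk, 0 = not at risk

query = (0.45, 0.45)
prediction_k1 = knn_predict(train_X, train_y, query, k=1)
prediction_k3 = knn_predict(train_X, train_y, query, k=3)
print(prediction_k1, prediction_k3)
```

With k=1 the single closest point decides the label, while k=3 lets two more distant neighbours outvote it, so the same query receives different predictions. This sensitivity is why k is normally chosen by validation and not fixed in advance.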
3.4 Heart Disease Prediction System [March 2020] [4]
Summary: The paper titled "Heart Disease Prediction System" (March 2020) introduces a comprehensive Heart Disease Prediction System (HDPS) developed through
data mining techniques, specifically employing Naive Bayes and Decision Tree algorithms. The study focuses on utilizing 15 medical parameters, including age, sex,
blood pressure, cholesterol, and obesity, to effectively predict the risk of heart disease. This approach aims to establish meaningful knowledge and relationships between
various medical factors and patterns associated with heart disease.
The HDPS is designed to be a diagnostic tool providing valuable insights for healthcare decision-making. By leveraging data mining algorithms, it contributes to the
understanding of complex relationships within medical datasets, enhancing the accuracy of predictions related to heart disease risk. The Decision Tree algorithm, known
for its ability to represent decision rules graphically, is particularly valuable in deciphering patterns within the dataset.
The results of the study affirm the effectiveness of the HDPS in predicting the risk level of heart disease. The insights derived from the system have the potential to
significantly impact healthcare decision-making processes, enabling proactive measures for individuals identified at higher risk.
In terms of technology and methodology, the implementation of Naive Bayes and Decision Tree algorithms is carried out using Python. Python's versatility in data
analysis and machine learning applications facilitates the effective development and assessment of the HDPS.
Technology/Methodology used: Naïve Bayes and Decision Tree implemented using Python
Issues faced: Potential issues could relate to the assumptions made by Naïve Bayes regarding independence among features, which might not hold true in this medical
context, affecting the predictive performance.
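The independence assumption mentioned above can be written out explicitly. The following is the general Naive Bayes rule, stated here for reference and not taken from the surveyed paper, for a class $y$ and features $x_1, \dots, x_n$:

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The product form is exact only when the features are conditionally independent given the class. Medical parameters such as blood pressure and obesity may be correlated, in which case the factorisation is only an approximation.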
3.5 Cardiovascular Disease Detection using Ensemble Learning [August 2022] [5]
Summary: The paper titled "Cardiovascular Disease Detection using Ensemble Learning" (August 2022) introduces an innovative approach to predict the risk of
cardiovascular disease through the utilization of ensemble learning, incorporating both machine learning (ML) and deep learning (DL) models. The study employs six
categorization methods to forecast cardiovascular disease, showcasing a comprehensive and integrative approach to enhance prediction accuracy.
The technology and methodology employed involve four prior ML classifiers, namely Random Forest (RF), K-Nearest Neighbors (KNN), Decision Tree (DT), and
Extreme Gradient Boosting (XGBoost). This ensemble strategy harnesses the strengths of each classifier, providing a more robust and nuanced prediction model. The
inclusion of ensemble learning is a noteworthy aspect as it seeks to capitalize on the complementary strengths of diverse algorithms.
The experimental findings reveal that the ML ensemble model achieves a commendable illness prediction accuracy of 88.70%. This result underscores the effectiveness
of the ensemble approach in improving predictive performance compared to individual classifiers, showcasing the potential of combining diverse models for more
accurate risk assessment.
However, the paper acknowledges potential challenges associated with ensemble learning, emphasizing the importance of selecting optimal strategies to ensure synergy
among the classifiers. Fine-tuning hyperparameters is also recognized as a critical consideration to prevent overfitting while maximizing overall accuracy. These
challenges highlight the need for careful optimization to harness the full potential of ensemble learning in cardiovascular disease prediction.
Technology/Methodology used: Four prior ML classifiers are used: RF, KNN, DT, and Extreme Gradient Boosting
Issues faced: Potential challenges might include the selection of optimal ensemble strategies and fine-tuning hyperparameters to prevent overfitting while maximizing accuracy.
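One common way to combine classifiers is hard (majority) voting. The summary does not state which combination rule the paper uses, so the sketch below is an assumption made for illustration, and the per-model predictions are invented.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    # zip(*...) regroups the predictions so each tuple holds one sample's votes.
    for sample_votes in zip(*predictions_per_model):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


# Hypothetical ground truth and predictions; each model makes one different mistake.
truth    = [1, 0, 1, 1, 0]
rf_pred  = [1, 0, 1, 0, 0]   # wrong on sample 3
knn_pred = [1, 1, 1, 1, 0]   # wrong on sample 1
dt_pred  = [0, 0, 1, 1, 0]   # wrong on sample 0

ensemble = majority_vote([rf_pred, knn_pred, dt_pred])
print(ensemble)
```

Because the three models err on different samples, each error is outvoted by the other two. The benefit disappears when the models make the same mistakes, which is why diversity among the base classifiers matters.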
3.6 Significance of Visible Non-Invasive Risk Attributes for the Initial Prediction of Heart Disease Using Different Machine Learning Techniques [February 2022] [6]
Summary: The paper titled "Significance of Visible Non-Invasive Risk Attributes for the Initial Prediction of Heart Disease Using Different Machine Learning
Techniques" (February 2022) responds to the escalating heart disease mortality rate by proposing an effective, low-cost, and reliable heart disease risk evaluation model.
The research focuses on leveraging significant non-invasive risk attributes identified with the assistance of medical domain experts. These attributes include age, systolic
blood pressure, diastolic blood pressure, BMI, hereditary factors, smoking, alcohol consumption, and physical inactivity. The study aims to assess the reliability of these
attributes in predicting heart disease through various feature selection techniques.
The technology and methodology involve the application of specific investigated techniques, including Random Forest, Naïve Bayes, Decision Tree, Support Vector
Machine (SVM), and K-Nearest Neighbors (KNN). The implementation is carried out using the Jupyter Notebook web application, providing a versatile and interactive
environment for developing and testing machine learning models.
The paper emphasizes the potential issues associated with the interpretability of selected attributes. While these non-invasive risk factors are considered significant, their
interpretability and real-world applicability may pose challenges that need to be carefully addressed. Additionally, the paper acknowledges the possibility of bias in the
dataset, which could impact the robustness of the predictive model. Addressing potential biases becomes crucial for ensuring the model's reliability and generalizability
to diverse populations.
Technology/Methodology used: Applying specific investigated techniques like random forest, Naïve Bayes, decision tree, support vector machine, and K nearest
neighbor developed using the Jupyter Notebook web application
Issues faced: Potential issues could involve the interpretability of the selected attributes and the potential bias in the dataset, affecting the robustness of the predictive
model.
3.7 Machine learning prediction in cardiovascular diseases: a meta-analysis [September 2020] [7]
Summary: The paper titled "Machine Learning Prediction in Cardiovascular Diseases: A Meta-Analysis" (September 2020) conducts a thorough review of 344 studies,
exploring the efficacy of machine learning algorithms in predicting cardiovascular diseases. The meta-analysis identifies promising algorithms, with a particular
emphasis on Support Vector Machines (SVM) and boosting algorithms, showcasing their potential in accurate disease prediction. However, the review acknowledges the
existence of heterogeneity among the various algorithms, underscoring the necessity for thoughtful selection based on the characteristics of specific datasets.
The technology and methodology encompass the implementation of SVM, Convolutional Neural Networks (CNN), and boosting algorithms, executed using Python.
Python's versatility in machine learning applications facilitates the systematic analysis of diverse algorithms, contributing to the comprehensiveness of the meta-analysis.
While the findings highlight the potential of machine learning algorithms in cardiovascular disease prediction, the paper conscientiously addresses potential issues
associated with model interpretability. In healthcare contexts, especially, the ability to provide clear and understandable explanations for predictions is crucial for
widespread adoption. The challenges linked to interpretability emphasize the importance of bridging the gap between the intricate workings of machine learning models
and the need for transparent, interpretable results in clinical decision-making.
In conclusion, this meta-analysis not only consolidates existing knowledge on machine learning applications in cardiovascular disease prediction but also underscores the
nuanced challenges associated with algorithm selection and interpretability. The identification of promising algorithms and acknowledgment of potential issues
contribute valuable insights to the ongoing efforts in utilizing machine learning for improved cardiovascular health assessment and prevention.
Technology/Methodology used: SVM, CNN, and boosting algorithms implemented using Python.
Issues faced: Potential issues may involve the challenges associated with model interpretability, especially in healthcare contexts where clear explanations for
predictions are crucial for adoption.
3.8 An artificial intelligence model for heart disease detection using machine learning algorithms [2022] [8]
Summary: The paper titled "An Artificial Intelligence Model for Heart Disease Detection using Machine Learning Algorithms" (2022) is centered around the
development of an AI-based system for predicting heart disease through the application and comparison of various machine learning algorithms. Noteworthy algorithms
such as logistic regression, random forest classifier, and K-Nearest Neighbors (KNN) are implemented and systematically evaluated. Among these, the random forest
classifier emerges as the most accurate, achieving an accuracy rate of approximately 83%.
The technology and methodology involve the implementation of machine learning algorithms, including logistic regression, KNN, and the random forest classifier,
utilizing the versatile programming language Python. Python's suitability for machine learning tasks allows for efficient development and assessment of these algorithms,
contributing to the robustness of the AI-based system.
While the results showcase the efficacy of the developed system, the paper conscientiously acknowledges potential challenges, particularly those associated with data
imbalance. Imbalances in the distribution of classes within the dataset might impact the model's predictive performance, especially if certain classes are
underrepresented. Addressing issues of data imbalance becomes crucial for ensuring the model's reliability and its ability to generalize well to diverse scenarios.
Technology/Methodology used: Machine learning algorithms like logistic regression, KNN, random forest classifier implemented in Python.
Issues faced: Potential issues could include data imbalance, which might affect the model's predictive performance, especially if certain classes are underrepresented.
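A common mitigation for class imbalance is to weight each class inversely to its frequency during training. The sketch below uses the widely used "balanced" heuristic, n_samples / (n_classes * class_count); it is offered as an illustration and is not a technique the surveyed paper is stated to use. The label list is invented.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in counts.items()
    }


# Hypothetical imbalanced dataset: 8 healthy (0) and 2 diseased (1) records.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
print(weights)
```

The rarer class receives the larger weight, so misclassifying one of its records costs the model more during training.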
3.9 Using Machine Learning for predicting CVDs [2021] [9]
Summary: The paper titled "Using Machine Learning for Predicting CVDs" (2021) is dedicated to the prediction of heart disease through the application of data
analytics and machine learning techniques. The research employs a systematic approach, commencing with feature selection via a correlation matrix, followed by the
application of machine learning algorithms, including neural networks, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), on datasets of varying sizes.
Notably, neural networks emerge as the standout performer, providing the highest and most stable accuracy, reaching an impressive 93%.
The technology and methodology encompass feature selection using a correlation matrix, a common technique in data analytics, coupled with the implementation of
machine learning algorithms. These algorithms, such as neural networks, SVM, and KNN, are implemented using versatile programming languages like Python, R, or
other applicable tools. This multilanguage approach ensures flexibility and accessibility in the implementation of machine learning models.
While celebrating the success of neural networks, the paper conscientiously addresses potential issues. Concerns about overfitting, particularly with smaller datasets, are
acknowledged. Overfitting occurs when a model learns noise in the training data, hindering its performance on new, unseen data. Additionally, ensuring the stability and
generalizability of the chosen models across varying dataset sizes is recognized as a crucial consideration for the robustness of the predictive models.
Technology/Methodology used: Feature selection using a correlation matrix and machine learning algorithms (neural networks, SVM, KNN) implemented in Python/R/other tools.
Issues faced: Potential issues might revolve around overfitting concerns with neural networks, especially with smaller datasets, and ensuring the stability and
generalizability of the chosen models across different dataset sizes.
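Correlation-based feature selection, as described in this summary, can be sketched in plain Python. The feature names, values, and the 0.5 threshold below are all hypothetical; the paper's actual features and cut-off are not given in this summary.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.5):
    """Keep the features whose absolute correlation with the target meets the threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical toy data: 1 = disease present, 0 = absent.
target = [0, 0, 0, 1, 1, 1]
columns = {
    "age": [40, 45, 50, 60, 65, 70],
    "resting_bp": [120, 130, 125, 140, 150, 145],
    "noise": [1, 2, 3, 1, 2, 3],
}

selected = select_features(columns, target)
print(selected)
```

Pearson correlation only captures linear relationships, so a feature with a strong non-linear effect could still be dropped by this filter.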
3.10 Early Heart Disease Detection Using Machine Learning [2020] [10]
Summary: The paper titled "Early Heart Disease Detection Using Machine Learning" (2020) presents an intelligent predictive system designed for the early prediction
and diagnosis of heart disease. The evaluation of this system is conducted on Cleveland (S1) and Hungarian (S2) heart disease datasets, utilizing both full and optimized
feature sets. A comprehensive array of ten classification algorithms, including K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), Naive Bayes
(NB), Support Vector Machine (SVM), AdaBoost (AB), Extra Trees (ET), Gradient Boosting (GB), Logistic Regression (LR), and Artificial Neural Network (ANN), is
employed. Additionally, four feature selection methods, namely Fast Correlation-Based Filter (FCBF), minimum Redundancy Maximum Relevance (mRMR), Least
Absolute Shrinkage and Selection Operator (LASSO), and Relief, are utilized. Performance metrics encompassing accuracy, sensitivity, specificity, Area Under the
Curve (AUC), F1-score, Matthews Correlation Coefficient (MCC), and Receiver Operating Characteristic (ROC) curve are employed for thorough evaluation.
Notably, the Extra Trees algorithm achieves a remarkable accuracy of 94.41% when coupled with Relief feature selection, while Gradient Boosting reaches 93.36%
accuracy with FCBF.
The technology and methodology encompass a diverse set of machine learning algorithms, feature selection methods, and performance evaluation metrics, implemented
through data analysis and processing techniques. However, the paper acknowledges potential challenges, including dataset quality, feature selection bias, and the risk of
model overfitting or underfitting due to the extensive array of algorithms and features tested.
Technology/Methodology used: Machine learning algorithms (KNN, DT, RF, NB, SVM, AB, ET, GB, LR, ANN), Feature selection methods (FCBF, mRMR, LASSO,
Relief), Performance evaluation metrics, Data analysis and processing techniques.
Issues faced: Possible issues might relate to dataset quality, feature selection bias, and model overfitting or underfitting due to an extensive array of algorithms and
features tested. Additionally, the generalizability of the findings to diverse populations or datasets could be a concern.
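Of the metrics listed in this summary, sensitivity, specificity, and the Matthews Correlation Coefficient (MCC) can all be computed directly from confusion-matrix counts. The counts in the sketch below are invented for illustration and are not results from the paper.

```python
import math


def sensitivity_specificity_mcc(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of diseased cases that are detected
    specificity = tn / (tn + fp)   # share of healthy cases that are cleared
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts for a 10-patient test set.
sens, spec, mcc = sensitivity_specificity_mcc(tp=4, fp=1, tn=4, fn=1)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, MCC={mcc:.2f}")
```

MCC uses all four counts, so it stays informative when one class is much rarer than the other, a case where accuracy alone can look misleadingly high.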

3.11 Heart disease prediction using supervised machine learning algorithms: Performance analysis and comparison [2021] [11]
Summary: This study addresses the challenging task of developing accurate and efficient early-stage heart disease prediction using machine learning and data mining
approaches, aiming to support clinical decision-making with digital patient records. The research focuses on identifying the most accurate machine learning classifiers
for diagnostic purposes, given the lack of cardiovascular expertise in many countries and a notable rate of incorrectly diagnosed cases. Several supervised machine-
learning algorithms, including k-nearest neighbor (KNN), decision tree (DT), and random forests (RF), were applied and compared for performance and accuracy in
heart disease prediction.
The study estimates feature importance scores for every algorithm except multilayer perceptron (MLP) and KNN. All features were ranked based on importance scores
to identify those contributing to high accuracy in heart disease predictions. Utilizing a heart disease dataset from Kaggle, the RF method achieved remarkable results,
attaining 100% accuracy, sensitivity, and specificity in a three-classification scenario. This finding suggests that a relatively simple supervised machine learning
algorithm can yield heart disease predictions with very high accuracy and significant potential utility in clinical settings.
Technology/Methodology Used: Supervised machine-learning algorithms, including KNN, DT, and RF, were employed for heart disease prediction using a dataset
from Kaggle.
Issues Faced: The study does not provide feature importance scores for MLP and KNN, and the dataset used might have its limitations, potentially affecting the
generalizability of the findings. Further research and validation on diverse datasets are essential to confirm the robustness and applicability of the proposed machine
learning approach in real-world clinical scenarios.

3.12 Effectively Predicting the Presence of Coronary Heart Disease Using Machine Learning Classifiers [2022][12]
Summary: Coronary heart disease stands as a leading global cause of mortality, posing a significant challenge in terms of prediction within the realm of clinical data
analysis. Harnessing the potential of machine learning (ML) in diagnostic assistance has become integral for informed decision-making and predictive modelling using
healthcare data on a global scale. Numerous research studies have explored ML applications in disease prediction, particularly focusing on heart diseases. This paper
contributes to this body of knowledge by employing eleven ML classifiers to identify key features crucial for enhancing heart disease predictability.
In crafting the prediction model, a variety of feature combinations and established classification algorithms were utilized. The study yielded promising results, achieving
a commendable 95% accuracy with gradient boosted trees and multilayer perceptron in the heart disease prediction model. Notably, the Random Forest algorithm outperformed the others, reaching an accuracy of 96%.
Methodology/Technology Used: The study leverages machine learning classifiers and explores various feature combinations to enhance heart disease predictability.
Specifically, eleven ML classifiers were employed, with a focus on gradient boosted trees, multilayer perceptron, and Random Forest. The combination of these classifiers aimed to discern key features crucial for accurate heart disease predictions.
Issues Faced: While the study showcases promising results, it is essential to consider potential challenges. The paper does not delve into the specific features identified
by each classifier as crucial for heart disease prediction, hindering a comprehensive understanding of the model's interpretability. Additionally, the dataset used may have
limitations, and further validation on diverse datasets is necessary to ascertain the model's robustness and generalizability. Future research should address these issues to
enhance the applicability and reliability of the proposed ML-based heart disease prediction model in real-world clinical scenarios.

3.13 Cardiac disease prediction using AI algorithms with SelectKBest [2023] [13]
Summary: Atherosclerotic cardiovascular disease (ASCVD), encompassing coronary heart disease (CHD) and ischemic stroke, stands as the foremost global cause of
mortality. The European Society of Cardiology reports a staggering 26 million people worldwide afflicted by heart disease, with an annual diagnosis rate of 3.6 million.
Timely detection is crucial in mitigating the mortality associated with heart diseases. Current research in heart disease prediction using artificial intelligence (AI)
grapples with issues such as insufficient diversity in training data and challenges in comprehending complex AI models.
In response to these challenges, this paper proposes cardiac disease prediction using AI algorithms with the integration of SelectKBest. To enhance feature relevance,
standardization, balance, and selection techniques like StandardScaler, SMOTE, and SelectKBest are employed. A comprehensive assessment involves various machine
learning models (support vector machine, K-nearest neighbor, decision tree, logistic regression, adaptive boosting, naive Bayes, random forest, and extra tree) and deep
learning models (vanilla long short-term memory, bidirectional long short-term memory, stacked long short-term memory, and deep neural network) using diverse
datasets.
Methodology/Technology Used: The paper adopts AI algorithms, emphasizing the integration of SelectKBest for feature selection. StandardScaler and SMOTE
techniques are applied to standardize and balance features. A spectrum of machine learning and deep learning models is evaluated using Alizadeh Sani, combined
(Cleveland, Hungarian, Switzerland, Long Beach VA, and Statlog), and Pakistan heart failure datasets.
Issues Faced: The study acknowledges the challenge of insufficient diversity in training data, a common issue in AI-based heart disease prediction research. The
interpretability of complex AI models is also recognized as a hurdle. While the proposed approach showcases impressive results, the paper does not delve deeply into
potential limitations, such as the generalizability of findings to diverse populations.
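The standardization and feature selection steps described above can be illustrated with a small standard-library sketch. It is an illustration under stated assumptions, not the paper's pipeline: `standardize` mirrors the z-scoring that StandardScaler performs, `select_k_best` is a hypothetical stand-in for SelectKBest that scores each feature by its absolute Pearson correlation with the label, the SMOTE balancing step is omitted, and the toy data is invented.

```python
import statistics

def standardize(column):
    """Z-score a column: subtract the mean, divide by the population std dev."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std if std else 0.0 for v in column]

def abs_correlation(column, labels):
    """Absolute Pearson correlation between a feature column and the labels."""
    zx = standardize(column)
    zy = standardize(labels)
    return abs(sum(a * b for a, b in zip(zx, zy)) / len(column))

def select_k_best(features, labels, k):
    """Return the names of the k highest-scoring features (assumed scoring rule)."""
    scores = {name: abs_correlation(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data for illustration only (not drawn from any of the cited datasets).
features = {
    "age": [63, 45, 58, 39, 67, 41],
    "chol": [230, 228, 233, 229, 231, 232],
    "max_hr": [110, 170, 120, 180, 105, 175],
}
labels = [1, 0, 1, 0, 1, 0]

scaled_age = standardize(features["age"])
selected = select_k_best(features, labels, k=2)
print("selected features:", selected)
```

In this toy example the weakly related `chol` column is dropped, which is the intended effect of keeping only the k most label-relevant features before training.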
3.14 Study of cardiovascular disease prediction model based on random forest in eastern China [2020] [14]
Summary: In this study, 29,930 subjects at high risk of CVD were selected from 101,056 people in 2014, and regular follow-up was conducted using an electronic health record system. Logistic regression analysis showed that nearly 30 indicators were related to CVD, including male sex, old age, family income, smoking, drinking, obesity, excessive waist circumference, abnormal cholesterol, abnormal low-density lipoprotein, abnormal fasting blood glucose, and others.
Several methods were used to build the prediction model, including a multivariate regression model, classification and regression tree (CART), Naïve Bayes, bagged trees, AdaBoost, and Random Forest. The multivariate regression model served as the benchmark for performance evaluation (area under the curve, AUC = 0.7143). The results showed that Random Forest was superior to the other methods, with an AUC of 0.787, a significant improvement over the benchmark.
The paper thus provides a prediction model for 3-year risk assessment of CVD. It is based on a large population at high risk of CVD in eastern China and uses the Random Forest algorithm, which could serve as a reference for CVD prediction and treatment in China.
Technology/Methodology used: The comparison study uses a multivariate regression model as the benchmark; the other models are CART, Naïve Bayes, bagged trees, AdaBoost, and Random Forest.
Issues faced: The dataset used for this comparison covers only a very small subset of the population, so results may vary when the entire population is considered.
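To make the AUC comparison above concrete, the sketch below computes AUC as the fraction of (positive, negative) pairs that a model ranks correctly, and then works out the gain of Random Forest over the benchmark from the two AUC values reported in the summary. The function name `roc_auc` and the toy labels and scores are hypothetical; only 0.7143 and 0.787 come from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy labels and risk scores for illustration only.
labels = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2, 0.5, 0.3]
auc = roc_auc(labels, scores)
print(f"toy AUC: {auc:.2f}")

# AUC values reported in the summary above.
baseline_auc = 0.7143   # multivariate regression benchmark
rf_auc = 0.787          # Random Forest
absolute_gain = rf_auc - baseline_auc
relative_gain = absolute_gain / baseline_auc
print(f"absolute gain: {absolute_gain:.4f}, relative gain: {relative_gain:.1%}")
```

On the reported figures, Random Forest improves AUC by about 0.07 in absolute terms, roughly a 10% relative gain over the regression benchmark.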

CHAPTER 4 REFERENCES
[1] M Sai Teja, K Thanuja, Nadella Mani Deep, P Ravindra Reddy, & O Likhith Kumar Reddy. (2023). Prediction and Analysis of Alzheimer’s Disease using Deep
Learning Algorithms. International Journal of Computational Learning & Intelligence, 2(2), 48–57. https://doi.org/10.5281/zenodo.7920940
[2] Oh, K., Chung, YC., Kim, K.W. et al. Classification and Visualization of Alzheimer’s Disease using Volumetric Convolutional Neural Network and Transfer
Learning. Sci Rep 9, 18150 (2019).
https://doi.org/10.1038/s41598-019-54548-6
[3] Ayisha Shamna K K, Jamsheera K, Shameena P P. CNN Based Landmark Detection and Alzheimer’s Diagnosis Using Landmark Feature. International Journal of
Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.
[4] Raees, P & Thomas, Vinu. (2021). Automated detection of Alzheimer’s Disease using Deep Learning in MRI. Journal of Physics: Conference Series. 1921. 012024.
10.1088/1742-6596/1921/1/012024.
[5] Kavitha C., Mani Vinodhini, Srividhya S. R., Khalaf Osamah Ibrahim, Tavera Romero Carlos Andrés. "Early-Stage Alzheimer's Disease Prediction Using Machine
Learning Models." Frontiers in Public Health, 10 (2022), https://www.frontiersin.org/articles/10.3389/fpubh.2022.853294, DOI:10.3389/fpubh.2022.853294, ISSN:
2296-2565.
[6] Al-Shoukry, Suhad & Rassem, Taha & Makbol, Nasrin. (2020). Alzheimer’s Diseases Detection by Using Deep Learning Algorithms: A Mini-Review. IEEE Access.
PP. 1-1. 10.1109/ACCESS.2020.2989396
[7] Sanjay, V & Swarnalatha, P.. (2022). Deep Learning Techniques for Early Detection of Alzheimer’s Disease: A Review. International Journal of Electrical and
Electronics Research. 10. 899-905. 10.37391/ijeer.100425.
[8] Gamal, Aya & Elattar, Mustafa & Selim, Sahar. (2022). Automatic Early Diagnosis of Alzheimer’s Disease Using 3D Deep Ensemble Approach. IEEE Access. PP.
1-1. 10.1109/ACCESS.2022.3218621
[9] Arsah A, Karolin Kiruba R, Kishan I, Neekitha C, Padmapriya M. International Journal for Research in Applied Science & Engineering Technology
(IJRASET), ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538, Volume 11, Issue V, May 2023.
[10] Nair Bini Balakrishnan, P.S. Sreeja, Jisha Jose Panackal. Alzheimer’s Disease Diagnosis using Machine Learning: A Review. Volume 71, Issue 3, 120-129,
March 2023. ISSN: 2231-5381. https://doi.org/10.14445/22315381/IJETT-V71I3P213
[11] Sheng Liu, Arjun V. Masurkar, Henry Rusinek, Jingyun Chen, Ben Zhang, Weicheng Zhu, Carlos Fernandez-Granda & Narges Razavian.
Generalizable deep learning model for early Alzheimer’s disease detection from structural MRIs. Sci Rep 12, 17106 (2022). https://doi.org/10.1038/s41598-022-20674-x
[12] Alroobaea, Roobaea & Mechti, Seifeddine & Haoues, Mariem & Rubaiee, Saeed & Ahmed, Anas & Andejany, Murad & Bragazzi, Nicola & Sharma, Dilip &
Kolla, Bhanu & Sengan, Sudhakar. (2021). Alzheimer's Disease Early Detection Using Machine Learning Techniques. 10.21203/rs.3.rs-624520/v1.
[13] Lee Kuok Leong and Azian Azamimi Abdullah. Prediction of Alzheimer’s disease (AD) Using Machine Learning Techniques with Boruta Algorithm as Feature
Selection Method. IOP Publishing. doi:10.1088/1742-6596/1372/1/012065
[14] Muhanad Tahrir Younis, Younus Tahreer Younus, Jamal Naser Hasoon, Ali Hussain Fadhil, Salama A. Mostafa. An accurate Alzheimer's disease
detection using a developed convolutional neural network model. Vol. 11, No. 4, August, pp. 2005-2012. ISSN: 2302-9285. DOI: 10.11591/eei.v11i4.3659
[15] Odusami, M.; Maskeliūnas, R.; Damaševičius, R. An Intelligent System for Early Recognition of Alzheimer’s Disease Using Neuroimaging. Sensors
2022, 22, 740. https://doi.org/10.3390/s22030740
[16] A, Shiny & S .S, Suganyadevi & Rajasekaran, Arun & P, Satheesh & R, Suganthi & Ramalingam, Naveenkumar. (2023). Alzheimer’s Disease Diagnosis using
Deep Learning Approach. 1205-1209. 10.1109/ICEARS56392.2023.10085017.