
VISVESVARAYA TECHNOLOGICAL UNIVERSITY

“Jnana Sangama”, Belgaum-590 014, Karnataka.

A Mini Project Report on

“DEMENTIA DISEASE PREDICTION”


USING MACHINE LEARNING
Submitted in the fulfillment of the requirements for the award of the Degree of

BACHELOR OF ENGINEERING
IN
INFORMATION SCIENCE AND ENGINEERING

Submitted by
DINESH N
(1EW20IS026)
Under the Guidance of
Ms. PRATYAKSHA S
Asst. Professor, Dept of ISE

DEPARTMENT OF INFORMATION SCIENCE AND ENGINEERING

EAST WEST INSTITUTE OF TECHNOLOGY


BANGALORE - 560 091
2023-2024
EAST WEST INSTITUTE OF TECHNOLOGY
Sy. No.63, Off. Magadi Road, Vishwaneedam Post, Bangalore - 560 091
(Affiliated to Visvesvaraya Technological University, Belgaum)
DEPARTMENT OF INFORMATION SCIENCE AND ENGINEERING

CERTIFICATE
This is to certify that the Internship project work entitled “DEMENTIA DISEASE PREDICTION” was
presented by DINESH N (1EW20IS026), a bonafide student of EAST WEST INSTITUTE OF
TECHNOLOGY, Bangalore, in partial fulfilment for the award of Bachelor of Engineering in
Information Science and Engineering of Visvesvaraya Technological University, Belgaum, during
the year 2023-2024. It is certified that all corrections/suggestions indicated have been incorporated in
the report. The internship work has been approved as it satisfies the academic requirements in respect
of the internship project prescribed for the said degree.

Signature of Guide Signature of HOD Signature of Principal


Ms. Pratyaksha S Dr. Suresh M B Dr. K Channakeshavalu
Assistant Prof, Dept. of ISE Prof & Head, Dept. of ISE Principal
EWIT, Bangalore EWIT, Bangalore EWIT, Bangalore

External Viva

Name of the Examiners Signature with date

1.

2.
CERTIFICATE FROM THE ORGANIZATION

DECLARATION
I, DINESH N, student of Seventh Semester B.E. in the Department of Information Science and
Engineering, East West Institute of Technology, Bangalore, hereby declare that the internship entitled
"DEMENTIA DISEASE PREDICTION" using MACHINE LEARNING has been carried out by
me and submitted in partial fulfilment of course requirements for the award of degree in Bachelor of
Engineering in Information Science and Engineering discipline of Visvesvaraya Technological
University, Belgaum during the academic year 2023-2024. Further, the matter embodied in the internship
report has not been submitted previously by anybody for the award of any degree or diploma to any
other university.

Place: Bangalore NAME: DINESH N


Date: 22/11/2023 USN: 1EW20IS026
ABSTRACT

Dementia is a growing public health concern, and early detection of the disease is crucial for
effective intervention and care. The proposed system aims to predict whether a person is
demented or non-demented. We utilize a diverse dataset comprising various demographic,
clinical, and neuropsychological features to train the model. Data preprocessing techniques
such as data cleaning, normalization, and feature engineering are used to prepare the data for the
machine learning model. The model is trained on the preprocessed data, and various algorithms
such as Decision Tree, K-Nearest Neighbors [KNN], Random Forest and Naive Bayes are
implemented to compare model performance and accuracy. The Random Forest algorithm,
known for its robustness and ability to handle complex datasets, is employed for predictive
modeling. The results show promising accuracy, sensitivity, and specificity in identifying
individuals at risk of developing dementia. The proposed system contributes to the ongoing
efforts in dementia research and underscores the potential of machine learning techniques in
early disease detection and intervention.

ACKNOWLEDGEMENT
Any achievement, be it scholastic or otherwise, does not depend solely on individual efforts but on the
guidance, encouragement, and cooperation of intellectuals, elders, and friends. A number of
personalities, in their own capacities, have helped me in carrying out this project work. I would like
to take this opportunity to thank them all.

First and foremost, I would like to thank Dr. K Channakeshavalu, Principal, EWIT, Bangalore, for
his moral support in completing my project work.

I would like to thank Dr. Suresh M B, Professor and Head of Department of ISE, EWIT, Bangalore,
for his valuable suggestions and expert advice.

I express my sincere gratitude to my guide Ms. Pratyaksha S, Assistant Professor,
Department of ISE, EWIT, Bangalore, for her able guidance throughout the project work and for
guiding me to organize the report in a systematic manner.
I would like to express my sincere gratitude to my supervisor Mr. Mahesh for providing his invaluable
guidance, comments, and suggestions throughout the course of the internship. During the period of my
internship work, I have received generous help from many quarters; without their help, it would have
been impossible to finish my work.
I thank my parents and all the faculty members of the Department of Information Science &
Engineering for their constant support and encouragement.

Last, but not least, I would like to thank my peers and friends who provided me with valuable
suggestions to improve my mini-project.

DINESH N (1EW20IS026)

TABLE OF CONTENTS

ABSTRACT i
ACKNOWLEDGEMENT ii
LIST OF FIGURES iv

CHAPTERS
CHAPTER NO TITLE PAGE NO

1 INTRODUCTION 1

2 COMPANY PROFILE 2-5

3 SYSTEM ANALYSIS 6

4 SYSTEM REQUIREMENT 7-10

5 ARCHITECTURE 11-16

6 PHASE OF IMPLEMENTATION 17

7 SNAPSHOTS 18-23

8 CONCLUSION

9 REFERENCES

LIST OF FIGURES

FIGURE NO FIGURE NAME PAGE NO

5.1 WORKING OF RANDOM FOREST ALGORITHM 12

5.2 WORKING OF DECISION TREE CLASSIFIER 14

7.1 DISTRIBUTION OF GROUP BY GENDER 18

7.2 DISTRIBUTION OF EDUCATION FOR EACH GENDER AND GROUP 18

7.3 HEATMAP TO VISUALIZE CORRELATION OF ATTRIBUTES 19

7.4 DISTRIBUTION OF AGE FOR EACH GROUP 19

7.5 CLASSIFICATION OF EDUCATION 20

7.6 CLASSIFICATION OF AGE 20

7.7 CLASSIFICATION OF CLINICAL DEMENTIA RATING 21

7.8 CLASSIFICATION OF MMSE 21

7.9 HOME PAGE OF DEMENTIA PREDICTION 22

7.10 APPLICATION PAGE WITH INPUT VALUES 22

7.11 PREDICTED OUTPUT INDICATING THAT THE PERSON IS DEMENTED 23

7.12 PREDICTED OUTPUT INDICATING THAT THE PERSON IS NOT DEMENTED 23

Dementia Disease Prediction

CHAPTER 1
INTRODUCTION

Dementia is a broad category of neurodegenerative disorders characterized by a decline in cognitive
function that interferes with a person's daily life. Alzheimer's disease is one of the most common forms
of dementia, but there are several other types as well. Dementia is a syndrome that can be caused by a
number of diseases which over time destroy nerve cells and damage the brain, typically leading to
deterioration in cognitive function (i.e., the ability to process thought) beyond what might be expected
from the usual consequences of biological ageing. While consciousness is not affected, the impairment
in cognitive function is commonly accompanied, and occasionally preceded, by changes in mood,
emotional control, behavior, or motivation. Dementia has physical, psychological, social and economic
impacts, not only for people living with dementia, but also for their carers, families and society at
large. There is often a lack of awareness and understanding of dementia, resulting in stigmatization and
barriers to diagnosis and care.

In Dementia Disease Prediction, the first step is to collect the data from various hospitals and the
medical records of patients. A Python package such as ‘Pandas’ can be used to read the data from the
CSV file. Once the data is collected, it needs to be prepared for analysis. This involves cleaning,
formatting, and transforming the data; ‘Pandas’ supports these operations and performs them
efficiently. Exploratory Data Analysis (EDA) is an important step in data analysis. It helps to
understand the patterns and relationships in the data. Python provides packages such as ‘Matplotlib’
for data visualization. After EDA, statistical analysis can be performed on the data. Python provides
packages such as ‘Seaborn’ for statistical visualization. Machine learning algorithms can then be
applied to the data for predictive modeling, using a package such as ‘Scikit-learn’. Finally, the
results of the analysis need to be communicated effectively.
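
The collection and preparation steps described above can be sketched in a few lines of Pandas. This is a minimal illustration rather than the project's actual code: the inline CSV text, its values, the `Group` label column and the helper name `load_and_clean` are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the project's CSV file; the rows
# below are made up for illustration only.
SAMPLE_CSV = """Group,M/F,Age,EDUC,SES,MMSE
Nondemented,M,87,14,2,27
Demented,M,75,12,,23
Nondemented,F,88,18,3,28
Demented,F,80,12,4,
Nondemented,F,88,18,3,28
"""


def load_and_clean(csv_source):
    """Read a CSV, drop duplicate rows, fill gaps, and encode categories."""
    df = pd.read_csv(csv_source)
    df = df.drop_duplicates().reset_index(drop=True)
    # Fill missing numeric values with the column median (one simple choice).
    for column in ["SES", "MMSE"]:
        df[column] = df[column].fillna(df[column].median())
    # Encode the text columns as integers so a model can consume them.
    df["M/F"] = df["M/F"].map({"M": 0, "F": 1})
    df["Group"] = df["Group"].map({"Nondemented": 0, "Demented": 1})
    return df


clean_df = load_and_clean(io.StringIO(SAMPLE_CSV))
print(clean_df)
```

In practice `pd.read_csv` would be given the path of the real dataset instead of an in-memory string. Median imputation is only one possible way to handle missing values.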

FEATURES:
• Data Collection
• Data cleaning and transformation
• Exploratory data Analysis
• Statistical Analysis
• Machine Learning

Dept. Of ISE, EWIT 2023-2024 Page-1



CHAPTER 2
COMPANY PROFILE

It is a pleasure to introduce “Karunadu Technologies Private Limited”, a leading IT software
solutions and services company focusing on quality standards and customer values. It is also a leading
Skills and Talent Development company that is building a manpower pool for global industry
requirements.

The company offers a broad range of customized software applications powered by concrete technology
and industry expertise. It also offers end-to-end embedded solutions and services. It deals with a broad
range of product development along with customized features, ensuring utmost customer satisfaction,
and also empowers individuals with knowledge, skills and competencies that help them grow as
integrated individuals with a sense of commitment and dedication.

MISSION
• Provide cost effective and reliable solutions to customers across various latest technologies.

• Offer scalable end-to-end application development and management solutions.

• Provide cost effective highly scalable products for varied verticals.

• Focus on creating sustainable value growth through innovative solutions and unique partnerships.

• Create, design and deliver business solutions with high value and innovation by leveraging
technology expertise and innovative business models to address long-term business objectives.

VISION
To empower unskilled individuals with knowledge, skills and technical competencies in the field
of Information Technology and Embedded Engineering, which help them grow as integrated
individuals contributing to the company’s and the nation’s growth.

OBJECTIVES
• To develop software and Embedded solutions and services focusing on quality standards and
customer values.
• Offer end to end embedded solutions which ensure the best customer satisfaction.
• To build Skilled and Talented manpower pool for global industry requirements.
• To develop software and embedded products which are globally recognized.
• To become a global leader in Offering Scalable and cost-effective Software solutions and services
across various domains like E-commerce, Banking, Finance, Healthcare and much more.
• To generate employment for skilled and highly talented youth of our Country INDIA.

COMPANY PRODUCTS AND SERVICES OFFERED


PRODUCTS

1. KECMS – Karunadu Enterprise Content Management System

Karunadu Enterprise Content Management System is a one-stop solution for enterprise content
management, covering digital asset management, document imaging, workflow systems and
records management systems. Increasing digitalization has led to an exponential growth in business
content, and managing this sea of unstructured data is tedious work.

2. KEMS – Karunadu Education Management System

Manages diversified data relating to education management on the cloud. Educational data on
students and staff is gathered over the years and contains information from admission/appointment until
leaving the institution. Statistical reports for the college/school can be generated along with admission
tracking and result analysis to keep track of the progressive improvement of both students and staff.


3. KASS – Karunadu Advanced Security System

A complete one-stop embedded solution for large apartments. It is a security system which monitors door
breakage, window breakage, gas leakage, motion detection and various other features, and which can be
operated and maintained by a centralized monitoring system. This embedded solution enhances the
security measures of the apartment/building and protects individuals from
unintended intervention and unauthorized access.

SERVICES

1. IT SOLUTIONS AND SERVICES

Karunadu Technologies is a Bangalore based IT Training and Software Development center with
exclusive expertise in the area of IT Services and Solutions. Karunadu Technologies Pvt. Ltd. also
has expertise in Web Designing and Consulting Services.

2. EMBEDDED DESIGN AND DEVELOPMENT

Karunadu Technologies Pvt. Ltd. has expertise in Design and development of embedded products and
offers solutions and services in field of Electronics.

3. ACADEMIC PROJECTS

Karunadu Technologies Pvt. Ltd. helps students in their academics by bringing industrial experience
into projects so that students can strive for excellence. Karunadu Technologies Pvt. Ltd. encourages
students to implement their own ideas in projects, keeping in mind "A small seed sown upfront will be
nourished to become a large tree one day”, thereby focusing on future entrepreneurs. They have a wide
range of IEEE projects for B.E, MTech, MCA, BCA and DIPLOMA students of all branches in each and every
domain.

4. INPLANT TRAINING

Karunadu Technologies Pvt. Ltd. provides Inplant training for students according to their
interests, keeping in mind the current technology and the academic benefit one obtains after completing the
training. Students are supported and trained throughout with practical experience.
Students are exposed to industrial standards, which boosts their career.


Students become acquainted with various structural partitions such as labs, workshops, assembly units,
stores, administrative units and machinery units. The training helps students to understand their functions,
applications, and maintenance. Students are trained from the initial stage, that is, from collection of
Project Requirements through Project Planning, Designing, implementation, testing, deployment and
maintenance, thereby helping them to understand the business model of the industry. The entire project
life cycle is demonstrated with hands-on experience. Students are also trained in management skills
and team building activities. The company assures that by the end of Inplant training students will
enhance their communication skills, acquire technical skills, employability skills and start-up skills,
become aware of risks in industry, and gain management and other skills which are helpful to professional
engagement.

5. SOFTWARE COURSES

Karunadu Technologies Pvt. Ltd. provides courses for students according to their interests,
keeping in mind the current technology, and assists them towards future employment. The company
provides various courses such as C, C++, VB, DBMS, Dot Net, Core Java and J2EE along with live
projects.

CONTACT DETAILS

Contact Address: #17, ATK complex, 4th Floor, Acharya College Main Road, Beside Karur Vysya
Bank, Guttebasaveshwaranagar, Chikkabanvara, Bengaluru, Karnataka- 560090
Email Address: support@karunadutechnologies.com


CHAPTER 3
SYSTEM ANALYSIS

EXISTING SYSTEM
Existing systems for dementia disease prediction employ a variety of approaches, combining medical
and technological advancements. These systems often utilize machine learning algorithms trained on
extensive datasets that include patient demographics, genetic information, and cognitive assessments.
Advanced imaging techniques, such as MRI or PET scans, are also integrated to analyze brain structures
and detect potential abnormalities. Additionally, some systems leverage wearable devices to monitor
daily activities and behavioral patterns that may indicate early signs of dementia. The goal is to create
comprehensive models capable of accurate and early prediction, enabling timely interventions and
personalized care for individuals at risk of developing dementia. Ongoing research and technological
advancements continue to refine and enhance these prediction systems, contributing to the broader field
of neurodegenerative disease detection and management.

PROPOSED SYSTEM
A proposed system for dementia disease prediction aims to integrate cutting-edge technologies for
more accurate and early detection. The system envisions a holistic approach, incorporating machine
learning algorithms trained on diverse and extensive datasets encompassing not only traditional
medical data but also lifestyle factors, environmental influences, and social interactions. It emphasizes
the use of advanced neuroimaging techniques, such as functional MRI and diffusion tensor imaging,
to capture subtle changes in brain structure and connectivity. Wearable devices equipped with sensors
would continuously monitor various biomarkers, including sleep patterns, physical activity, and vital
signs. Additionally, the proposed system places a strong emphasis on incorporating genetic information
to enhance personalized risk assessments. Real-time data analysis and continuous learning mechanisms
are integral components, ensuring adaptability and responsiveness to individual variations. Overall,
this comprehensive and multidimensional approach aims to revolutionize dementia prediction by
providing a more nuanced and proactive understanding of an individual's risk profile.


CHAPTER 4:
SYSTEM REQUIREMENTS

HARDWARE REQUIREMENTS:

Operating System (OS): Windows 10
Processor: Intel i5, 2.1 GHz
Storage: 100 GB
RAM: Minimum 4 GB

SOFTWARE REQUIREMENTS:

Programming Language: Python 3.x
Front End or Web Technologies: HTML5, CSS, Bootstrap 4
Web Framework: Django 2.x
IDE (Integrated Development Environment): PyCharm IDE Community Edition 2021.2.3
APIs: NumPy, Pandas, Scikit-learn, Matplotlib
Technology used: Machine Learning

HTML

HTML (HyperText Markup Language) is used to create websites and web pages. It is a tag-based
language that describes the structure of a web page. Alongside HTML, Cascading Style Sheets (CSS)
define the styling of the pages, such as fonts, colors and animations, and JavaScript is used for
validation. A web browser fetches the HTML file from a web server and renders it, so the page can be
viewed in any browser.

CSS

CSS stands for Cascading Style Sheets. It is a simple style-based language used to add styling to
web pages in an easy way and to make a website attractive.


BOOTSTRAP 4
Bootstrap is an open-source framework used to develop responsive web applications and responsive
designs. Responsive means that the application should also run well on smaller screens such as mobile
phones and tablets: the elements of the HTML document are stacked when the page gets smaller or is
minimized. By default, the Bootstrap grid divides each row into 12 columns of equal width. The default
layout can be altered to suit the requirements by assigning the grid column classes (for example,
col-md-6) to page elements. Bootstrap provides grid tiers for different kinds of devices, such as
large, medium and small screens, which helps the app run on every device. It also provides
styled buttons, forms, tables and so on. Bootstrap 4 adds features such as a flexbox-based grid
compared to previous versions. In this project Bootstrap 4 is used for front-end development
along with the Django framework.

MACHINE LEARNING
Machine Learning is the field of study that gives computers the capability to learn without being
explicitly programmed. ML is one of the most exciting technologies that one could come
across. As is evident from the name, it gives the computer an ability that makes it more similar to
humans: the ability to learn. Machine learning is actively being used today, perhaps in many more
places than one would expect.

Machine learning algorithms build a model based on sample data, known as training data, in order
to make predictions or decisions without being explicitly programmed to do so. Machine learning
algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech
recognition, agriculture, and computer vision, where it is difficult or unfeasible to develop
conventional algorithms to perform the needed tasks.

The typical machine learning project life cycle involves defining the problem, building a solution,
and measuring the solution's impact. However, before getting started with any
machine learning project, it is essential to realize how prevalent exploratory
data analysis (EDA) is in such projects. A large share of a data scientist’s job, often quoted as
around 80%, is to explore and understand raw data and to generate insights by cleaning, wrangling, and
analyzing it. If the EDA is absent or insufficient, the team’s knowledge of the data is incomplete.
Without sufficient understanding of the data, calibrating analytical algorithms and ML models, or
creating a compelling product or solution, becomes extremely unreliable.


It is clear how significant the manual study and analysis of data is for data scientists, machine
learning engineers, AI researchers, and data science students. The same motivation extends
to the analysis of clinical data on dementia, which matters especially to doctors and patients. With
the motivation for the Dementia Disease Prediction project using machine learning in Python
established, the following sections describe the tools used for data analysis and
prediction.

PYTHON LANGUAGE
Python is a powerful general-purpose programming language that is widely used for data analysis
and machine learning. Python provides a wide range of libraries and packages that can be used for
data analysis, visualization, and modelling.

1. Data Collection: The first step is to collect the data from various hospitals and the medical
records of patients. A Python package such as ‘Pandas’ can be used to read the data from the
CSV file.

2. Data Preparation: Once the data is collected, it needs to be prepared for analysis. This involves
cleaning, formatting, and transforming the data. ‘Pandas’ supports data cleaning and
transformation and performs these operations efficiently.

3. Exploratory Data Analysis: Exploratory Data Analysis (EDA) is an important step in data
analysis. It helps to understand the patterns and relationships in the data. Python provides
packages such as ‘Matplotlib’ that can be used for data visualization.

4. Statistical Analysis: After EDA, statistical analysis can be performed on the data. Python
provides packages such as ‘Seaborn’ that can be used for statistical visualization.

5. Machine Learning: Machine learning algorithms can be applied to the data for predictive
modeling. Python provides packages such as ‘Scikit-learn’ that can be used for machine learning.
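
The exploratory and statistical steps listed above could look roughly as follows. The records are invented for illustration and the column names are assumptions. The sketch stays with Pandas and leaves the actual plotting to Matplotlib or Seaborn.

```python
import pandas as pd

# Hypothetical records; values are invented for illustration.
df = pd.DataFrame(
    {
        "Group": ["Demented", "Nondemented", "Demented", "Nondemented"],
        "Age": [75, 87, 80, 88],
        "MMSE": [23, 27, 22, 28],
        "nWBV": [0.70, 0.74, 0.69, 0.71],
    }
)

# Summary statistics per group, the tabular counterpart of a distribution plot.
group_means = df.groupby("Group")[["Age", "MMSE"]].mean()
print(group_means)

# Pairwise Pearson correlation of the numeric attributes; this matrix is
# what a heatmap (for example seaborn.heatmap) would display.
correlation = df[["Age", "MMSE", "nWBV"]].corr()
print(correlation.round(2))
```

Group-wise means and a correlation matrix are usually the first numbers inspected, because they show which attributes separate the two groups and which ones move together.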

DJANGO
Django is a high-level web framework for Python which is developed and maintained by the DSF (Django
Software Foundation). Django is widely used nowadays because of its many built-in functionalities.
Companies and applications often cited as users of Django for their websites include Google,
Instagram, Disqus, Spotify and YouTube.


It is used for web development in Python. It supports templates and static files: HTML pages can be
rendered easily by placing all the HTML files in a directory called ‘templates’, and, similarly,
all the files related to styling and scripting, such as CSS and JS, are placed inside a directory called
‘static’. In this project Django is used to serve the front end of the application. Django also
provides the features listed below.
• Built-in localhost server.

• Built-in administration facility.

• High security.

• Rapid development.

• Outstanding documentation.

PANDAS
Pandas is a library for the Python language which is used for understanding and analyzing data. It
provides data structures such as the DataFrame and Series for loading, cleaning and summarizing data.


CHAPTER 5:
ARCHITECTURE
Dementia Disease Prediction architecture involves several different components that work together
to process and analyze the data. The Python language is used in various stages of this
architecture for Data Analysis, Training and Testing of the Model, and Data Visualization. Here are
some of the components of the Dementia Disease Prediction architecture:
1. DATA ANALYSIS: Collect the data from a CSV file. The dataset includes information about
Subject_ID, MRI_ID, Visit, Delay, M/F, Hand, Age, Education [EDUC], Mini-Mental State
Examination [MMSE], Clinical Dementia Rating [CDR], Atlas Scaling Factor [ASF], Estimated
Total Intracranial Volume [eTIV], Socio-Economic Status [SES] and Normalized Whole Brain Volume
[nWBV].

2. DATA SPLITTING: Split the data into training and testing data using Python libraries.

3. TRAINING THE MODEL: Dementia Disease Prediction uses a variety of machine learning
algorithms to analyze the data and predict whether patients are demented or
not.
4. MODEL PREDICTION: After training the model with the training data, the model is
tested on the testing data to predict the expected outcome. The model accuracy is also analyzed
to achieve better performance.
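
Steps 2 to 4 above can be sketched with Scikit-learn as follows. The data here is synthetic and only loosely imitates two of the listed attributes. The 75/25 split, the number of trees and the random seeds are assumptions made for the example, not settings taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset: two features loosely named after
# MMSE and nWBV, with labels 1 = demented, 0 = non-demented.
rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, size=n)
mmse = np.where(labels == 1, rng.normal(22, 2, n), rng.normal(28, 2, n))
nwbv = np.where(labels == 1, rng.normal(0.70, 0.02, n), rng.normal(0.75, 0.02, n))
features = np.column_stack([mmse, nwbv])

# Step 2: hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0, stratify=labels
)

# Step 3: train the model on the training split.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Step 4: predict on unseen rows and measure performance.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
tn, fp, fn, tp = confusion_matrix(y_test, predictions, labels=[0, 1]).ravel()
sensitivity = tp / (tp + fn)  # share of demented cases correctly flagged
specificity = tn / (tn + fp)  # share of non-demented cases correctly cleared
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

Reporting sensitivity and specificity alongside accuracy matters for a screening task, since a missed demented case and a false alarm carry different costs.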

ALGORITHMS USED:

1. RANDOM FOREST

Random Forest is a popular machine learning algorithm that belongs to the supervised learning
technique. It can be used for both Classification and Regression problems in ML. It is based on the
concept of ensemble learning, which is a process of combining multiple classifiers to solve a
complex problem and to improve the performance of the model. As the name suggests, "Random
Forest is a classifier that contains a number of decision trees on various subsets of the given dataset
and takes the average to improve the predictive accuracy of that dataset”.


Instead of relying on one decision tree, the random forest takes the prediction from each tree and
predicts the final output based on the majority vote of those predictions. A larger number of trees in the
forest generally improves accuracy and reduces the risk of overfitting, although the gain levels off
beyond a certain number of trees.

Fig 5.1 Working of the Random Forest algorithm.

HOW DOES RANDOM FOREST ALGORITHM WORK?

Random Forest works in two phases: the first is to create the random forest by combining N decision
trees, and the second is to make predictions with each tree created in the first phase. The working
process can be explained in the steps below:
1. Select random K data points from the training set.

2. Build the decision trees associated with the selected data points (Subsets).

3. Choose the number N for decision trees that you want to build.

4. Repeat Step 1 & 2.

5. For new data points, find the predictions of each decision tree, and assign the new data points
to the category that wins the majority votes.
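
The steps above can be sketched by hand, using Scikit-learn only for the individual trees. This is an illustrative simplification, not the library's own implementation: the helper names `fit_forest` and `predict_forest`, the toy data and the default of 25 trees are assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def fit_forest(X, y, n_trees=25, sample_size=None, seed=0):
    """Train n_trees decision trees, each on a random sample of the rows."""
    rng = np.random.default_rng(seed)
    sample_size = sample_size or len(X)
    trees = []
    for _ in range(n_trees):  # Steps 3-4: repeat for the chosen number N
        # Steps 1-2: draw K random rows (with replacement) and fit one tree.
        rows = rng.integers(0, len(X), size=sample_size)
        tree = DecisionTreeClassifier(
            max_features="sqrt", random_state=int(rng.integers(1_000_000))
        )
        trees.append(tree.fit(X[rows], y[rows]))
    return trees


def predict_forest(trees, X):
    """Step 5: collect each tree's vote and return the majority class."""
    votes = np.stack([tree.predict(X) for tree in trees]).astype(int)
    # votes has shape (n_trees, n_rows); count the votes column by column.
    return np.array([np.bincount(column).argmax() for column in votes.T])


# Toy data: the class is 1 when the two features sum to more than 1.
rng = np.random.default_rng(1)
X = rng.random((300, 2))
y = (X.sum(axis=1) > 1.0).astype(int)

forest = fit_forest(X[:200], y[:200])
forest_accuracy = (predict_forest(forest, X[200:]) == y[200:]).mean()
print(f"held-out accuracy: {forest_accuracy:.2f}")
```

Using an odd number of trees avoids tied votes in a two-class problem such as demented versus non-demented.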

APPLICATIONS OF RANDOM FOREST


Random Forest is mostly used in the following sectors:

• Banking: The banking sector mostly uses this algorithm for the identification of loan risk.

• Medicine: With the help of this algorithm, disease trends and risks of the disease can be
identified.
• Marketing: Marketing trends can be identified using this algorithm.


ADVANTAGES OF RANDOM FOREST


• Random Forest is capable of performing both Classification and Regression tasks.
• It is capable of handling large datasets with high dimensionality.
• It enhances the accuracy of the model and reduces the risk of overfitting.

DISADVANTAGES OF RANDOM FOREST


• Although Random Forest can be used for both classification and regression tasks, it is less
suitable for regression tasks.

2. NAIVE BAYES:

The Naïve Bayes algorithm is a supervised learning algorithm which is based on Bayes' theorem and is used
for solving classification problems. It is mainly used in text classification, which involves a
high-dimensional training dataset. The Naïve Bayes Classifier is one of the simplest and most effective
classification algorithms, and it helps in building fast machine learning models that can make quick
predictions. It is a probabilistic classifier, which means it predicts on the basis of the probability of an
object belonging to a class. Some popular applications of the Naïve Bayes algorithm are spam filtering,
sentiment analysis, and classifying articles.

WORKING OF NAIVE BAYES' CLASSIFIER:

The working of the Naïve Bayes' Classifier can be understood with the help of the following example.
Suppose we have a dataset of weather conditions and a corresponding target variable "Play". Using this
dataset, we need to decide whether or not we should play on a particular day according to the weather
conditions. To solve this problem, we follow the steps below:

1. Convert the given dataset into frequency tables.

2. Generate Likelihood table by finding the probabilities of given features.

3. Now, use Bayes theorem to calculate the posterior probability.
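
The three steps can be worked through on a small weather table in plain Python. The 14 observations below are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

# Hypothetical observations: (outlook, play) pairs, invented for illustration.
observations = (
    [("Sunny", "Yes")] * 3 + [("Sunny", "No")] * 2
    + [("Overcast", "Yes")] * 5
    + [("Rainy", "Yes")] * 2 + [("Rainy", "No")] * 2
)

# Step 1: frequency tables.
joint_counts = Counter(observations)                     # (outlook, play) -> count
play_counts = Counter(play for _, play in observations)  # play -> count
outlook_counts = Counter(outlook for outlook, _ in observations)
total = len(observations)


def posterior(play, outlook):
    """P(play | outlook) via Bayes' theorem."""
    # Step 2: likelihood and the two marginal probabilities.
    likelihood = joint_counts[(outlook, play)] / play_counts[play]  # P(outlook | play)
    prior = play_counts[play] / total                               # P(play)
    evidence = outlook_counts[outlook] / total                      # P(outlook)
    # Step 3: posterior = likelihood * prior / evidence.
    return likelihood * prior / evidence


p_yes = posterior("Yes", "Sunny")
p_no = posterior("No", "Sunny")
print(f"P(Yes | Sunny) = {p_yes:.2f}, P(No | Sunny) = {p_no:.2f}")
# prints: P(Yes | Sunny) = 0.60, P(No | Sunny) = 0.40
```

Since P(Yes | Sunny) is larger than P(No | Sunny), the classifier would predict "Yes" for a sunny day. With several features, the likelihood becomes a product of one such term per feature, which is where the "naïve" independence assumption enters.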

APPLICATIONS OF NAIVE BAYES CLASSIFIER:


• It is used for Credit Scoring.

• It is used in medical data classification.

• It can be used in real-time predictions because Naïve Bayes Classifier is an eager learner.


ADVANTAGES OF NAÏVE BAYES CLASSIFIER:

• Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a data point.
• It can be used for Binary as well as Multi-class Classifications.
• It performs well in multi-class predictions compared to many other algorithms.

• It is a popular choice for text classification problems.

DISADVANTAGES OF NAÏVE BAYES CLASSIFIER:

• Naive Bayes assumes that all features are independent or unrelated, so it cannot learn the
relationship between features.

3. DECISION TREE:

Decision Tree is a supervised learning technique that can be used for both classification and
regression problems, but it is mostly preferred for solving classification problems. It is a
tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the
decision rules, and each leaf node represents the outcome. It is a graphical representation for getting
all the possible solutions to a problem/decision based on given conditions. In a decision tree, there
are two types of nodes: the Decision Node and the Leaf Node. Decision nodes are used to make a
decision and have multiple branches, whereas Leaf nodes are the output of those decisions and do not
contain any further branches.

Fig.5.2 Working of Decision Tree Classifier.

WORKING OF DECISION TREE:

In a decision tree, to predict the class of a given record, the algorithm starts from the root node
of the tree. It compares the value of the root attribute with the corresponding attribute of the record
and, based on the comparison, follows the branch and jumps to the next node. At the next node, the
algorithm again compares the attribute value with the sub-nodes and moves further. It continues
this process until it reaches a leaf node of the tree. The complete process can be better understood
using the algorithm below:

 Begin the tree with the root node, say S, which contains the complete dataset.

 Find the best attribute in the dataset using an Attribute Selection Measure (ASM).

 Divide S into subsets that contain the possible values of the best attribute.

 Generate the decision tree node that contains the best attribute.

 Recursively make new decision trees using the subsets of the dataset created in the third step.

 Continue this process until a stage is reached where the nodes cannot be classified further, and
call the final node a leaf node.
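
The tree-building steps and the root-to-leaf prediction walk described above can be sketched in plain Python. Information gain is used here as the Attribute Selection Measure purely as an assumption (the Gini index is another common choice), and the function names and the six toy records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on attribute `attr`."""
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

def build_tree(rows, labels, attrs):
    """Recursively build a tree: leaves are class labels, decision nodes are dicts."""
    if len(set(labels)) == 1 or not attrs:
        # Cannot split further: return the majority class as a leaf node.
        return Counter(labels).most_common(1)[0][0]
    # Pick the best attribute according to the selection measure.
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    node = {"attr": best, "branches": {}}
    remaining = [a for a in attrs if a != best]
    # Divide the data into one subset per value of the best attribute and recurse.
    for value in set(row[best] for row in rows):
        pairs = [(r, lab) for r, lab in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [lab for _, lab in pairs]
        node["branches"][value] = build_tree(sub_rows, sub_labels, remaining)
    return node

def predict(tree, row):
    """Walk from the root to a leaf, following the branch that matches each attribute value."""
    while isinstance(tree, dict):
        tree = tree["branches"][row[tree["attr"]]]
    return tree

# Hypothetical toy records, for illustration only.
rows = [
    {"outlook": "Sunny", "windy": "No"},
    {"outlook": "Sunny", "windy": "Yes"},
    {"outlook": "Rainy", "windy": "No"},
    {"outlook": "Rainy", "windy": "Yes"},
    {"outlook": "Overcast", "windy": "No"},
    {"outlook": "Overcast", "windy": "Yes"},
]
labels = ["Yes", "Yes", "Yes", "No", "Yes", "Yes"]
tree = build_tree(rows, labels, ["outlook", "windy"])
```

On this toy data the root splits on `outlook`, and only the "Rainy" branch needs a further split on `windy`. The sketch handles categorical attributes only and raises a `KeyError` for attribute values not seen during training; a production implementation would need to handle both cases.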

APPLICATIONS:

1. Classification Problems: Decision trees are commonly used for classification tasks, where
the goal is to categorize input data into predefined classes or labels. This is applicable in fields
such as spam detection, sentiment analysis, and medical diagnosis.
2. Regression Analysis: Decision trees can be used for regression tasks, predicting a continuous
value instead of a categorical label. This is applied in scenarios like predicting house prices,
stock prices, or any other numerical outcome.
3. Customer Relationship Management (CRM): Decision trees help in customer segmentation
and targeting. They can identify customer groups with similar characteristics and behaviors,
aiding in personalized marketing and service strategies.
4. Medical Diagnosis: Decision trees are used in healthcare for diagnostic purposes. They assist
in identifying diseases or conditions based on patient symptoms, test results, and medical
history.
5. Credit Scoring: In finance, decision trees are applied to assess credit risk. They help
determine the likelihood of a borrower defaulting based on various financial and personal
factors.

6. Fraud Detection: Decision trees play a role in detecting fraudulent activities by analyzing
patterns and anomalies in financial transactions or user behavior.
7. Image Recognition: Decision trees are part of the ensemble methods used in image
recognition. They contribute to complex models that can effectively classify and recognize
objects in images.

ADVANTAGES

 It is simple to understand because it follows the same process that a human follows when making a
decision in real life.
 It can be very useful for solving decision-related problems.

 It helps to think through all the possible outcomes of a problem.

 It requires less data cleaning than many other algorithms.

DISADVANTAGES

 A decision tree can contain many layers, which makes it complex.

 It may overfit the training data, which can be mitigated using the Random Forest algorithm.

 With more class labels, the computational complexity of the decision tree may increase.

CHAPTER 6

PHASE OF IMPLEMENTATION
1. Open the Python Console in Visual Studio.

2. The code is executed in the sequence of steps given below.
 Import the essential packages.

 Read the dataset, which contains 15 columns and 373 entries. Some of the important columns
are MRI_ID, Gender, Age, Education, Socio-Economic Status [SES], Mini-Mental State
Examination [MMSE], Clinical Dementia Rating [CDR], Estimated Total Intracranial Volume
[eTIV] and Normalized Whole Brain Volume [nWBV].

 Replace ‘?’ with NaN (Not a Number) and plot a heatmap visualizing the number of NaNs in
the data.

3. We select the target values from the dementia_dataset.csv data and split the data into train and test
datasets.

4. Install and import the Naïve Bayes, Random Forest, and Decision Tree algorithms and train them to
classify the dataset.

5. The trained models are tested and used to predict the output, which is our main objective: predicting
whether the person is demented or not.

6. Finally, we plot a heatmap to visualize the correlation of attributes, and bar plots of different
attributes.
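
The preprocessing in steps 2 and 3 (replacing ‘?’ with NaN, counting the missing values that the heatmap visualizes, and splitting into train and test sets) can be sketched with the Python standard library alone. The report does not list its code, so the four-row sample, the chosen columns, and the function names below are all hypothetical; a real run would read dementia_dataset.csv and would more likely rely on a data-analysis library.

```python
import csv
import io
import math
import random

# Hypothetical four-row extract; the real input would be dementia_dataset.csv.
SAMPLE_CSV = """Group,Age,SES,MMSE
Nondemented,87,2,27
Demented,75,?,23
Nondemented,88,3,28
Demented,80,?,22
"""

def load_rows(text):
    """Read CSV text and replace every '?' cell with NaN."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({k: (math.nan if v == "?" else v) for k, v in row.items()})
    return rows

def is_missing(value):
    """True only for the NaN placeholder (all other cells are strings)."""
    return isinstance(value, float) and math.isnan(value)

def count_missing(rows):
    """Number of NaN cells per column, the quantity a missing-value heatmap shows."""
    return {col: sum(1 for r in rows if is_missing(r[col])) for col in rows[0]}

def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into train and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = load_rows(SAMPLE_CSV)
missing = count_missing(rows)
train, test = split_train_test(rows)
```

Here `missing` reports two NaN cells in the SES column and none elsewhere, and the split leaves three rows for training and one for testing. The fixed `seed` only makes the shuffle repeatable.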

CHAPTER 7
SNAPSHOTS

Fig. 7.1: Distribution of ‘group’ by ‘gender’.

The above figure illustrates the distribution of group by gender among the patients in the
dataset.

Fig. 7.2: Distribution of education for each ‘gender’ and ‘group’.

The above figure illustrates the distribution of education for each gender and group in the
patients’ data.

Fig. 7.3: Heatmap to visualize correlation of attributes.

The above figure illustrates the heatmap to visualize the correlation of attributes present in the
dataset.

Fig. 7.4: Distribution of age for each group.

The above figure illustrates the distribution of age for each group (Demented group / Non-
Demented group) given in the dataset.

Fig. 7.5: Classification of education.

The above figure illustrates the classification of education provided in the dataset.

Fig. 7.6: Classification of age.

The above figure illustrates the classification of the ages of the patients provided in the
dataset.

Fig. 7.7: Classification of Clinical Dementia Rating.

The above figure illustrates the classification of the Clinical Dementia Rating for each patient in the
dataset.

Fig. 7.8: Classification of Mini-Mental State Examination.

The above figure illustrates the classification of Mini-Mental State Examination scores of each
patient provided in the dataset.

Fig. 7.9: Home Page of Dementia Disease Prediction

The above figure illustrates the home page of dementia disease prediction application developed using
Django.

Fig. 7.10: Application page with Input Values.

The above figure illustrates the application page with user-entered values.

Fig. 7.11: Predicted Output indicating that the person is demented.

The above figure illustrates the model's predicted output indicating that the person is demented.

Fig. 7.12: Predicted Output indicating that the person is not demented.

The above figure illustrates the model's predicted output indicating that the person is not
demented.

CONCLUSION

In conclusion, the proposed system aimed to develop a predictive model for dementia
disease based on a comprehensive analysis of relevant data. We successfully gathered
a diverse dataset containing various personal and clinical features of individuals.
Extensive data preprocessing and cleaning were performed to ensure the quality and
reliability of the dataset. We used a Random Forest Classifier as the prediction model to
determine whether a person is affected by dementia or not. The ability to do so
provides a strong predictive tool that online healthcare platforms can use to better
understand their patients and improve their services accordingly. We hope that
the proposed system will be helpful for AI and ML researchers and medical practitioners
working in the domain of automated diagnostic systems for dementia prediction.
REFERENCES
 GitHub: the dataset and CSV files were downloaded from here.
 Concepts of Python were learnt from:
 https://www.w3schools.com/python/python_ml_getting_started.asp
 https://www.freecodecamp.org/learn/machine-learning-with-python
 https://www.geeksforgeeks.org/machine-learning-with-python/
 The code and commands were executed in Microsoft Visual Studio.
