
Early Detection of Alzheimer’s Disease

Jayant Harwalkar
Department of Computer Science, PES University, Bangalore, India
jayanthharwalkar@gmail.com

Hemankith Reddy M
Department of Computer Science, PES University, Bangalore, India
hemankith@gmail.com

Kamatchi Priya L
Department of Computer Science, PES University, Bangalore, India
priyal@pes.edu

Shravani
Department of Computer Science, PES University, Bangalore, India
shravanim1250@gmail.com

Neelesh S
Department of Computer Science, PES University, Bangalore, India
neeleshsamptur@gmail.com
Abstract—Alzheimer's disease (AD) is an incurable neurological disorder, although medication and management strategies exist. Machine learning models play an important role in the early diagnosis of Alzheimer's. Machine learning models require a huge dataset for better performance in terms of accuracy and computational complexity, and different pre-processing techniques are employed to improve both. This paper focuses on a pre-processing technique which trains the ML models with fewer samples and improves both accuracy and computational complexity.

Index Terms—Alzheimer's disease, machine learning, convolutional neural network, particle swarm optimisation, pre-processing

I. INTRODUCTION

Alzheimer's disease (AD), a brain disorder, is considered to be a form of dementia[2] that slowly destroys memory, thinking, and eventually, the ability to perform daily tasks. It is caused by the loss and degeneration of neurons in the brain, mostly in the cortex region. AD is caused by the formation of plaques, in which clumps of abnormal proteins form outside the neurons, blocking neuron connections and disrupting signals, which leads to impairment of the brain. AD can also be caused by tangles, in which a build-up of protein occurs inside the neuron and affects signal transmission. In AD, the brain starts to shrink: the gyri become narrow while the sulci widen. The risk of getting this disease increases with age, and it is mostly seen in older people, typically from around the age of 60[1].

Early detection improves the chances of effective treatment and the ability of the individual to participate in a wide variety of clinical trials. Treatment is effective if given in the early stages. Currently, there are no treatments that can cure the disease, as its exact cause is still unknown[2][3].

AD can be detected by performing scans such as magnetic resonance imaging (MRI), computed tomography (CT), or positron emission tomography (PET)[4]. Researchers use raw MRI brain scans, demographic images, and clinical data, comparing them with normal cognition by using machine learning and deep learning models which predict AD risk. The parameters can be found using techniques like support vector machines (SVM), Random Forest, voting classifiers and gradient boosting.[5]

Despite the currently available biomarkers, electronic healthcare data, health records, and the increasing digitization of data, there is not enough information on how to use this large-scale health data in the prediction of AD risk; however, a few studies demonstrate that potential AD risk can be predicted when these resources are combined with data-driven machine learning models.[6]

The subsequent sections are divided into literature review, proposed method, implementation details and conclusion. The proposed method describes the architecture and discusses the model. The implementation details include a brief description of the dataset and pre-processing, and conclude with a comparative analysis of the different models.

II. LITERATURE REVIEW

Considering the amount of research that has been done, machine learning techniques, a branch of artificial intelligence, are widely applied to this problem. Even with the existence of all these methods, there are no dedicated instruments for detection; however, certain physical, neurophysiological, psychological and neurological tests can be used for identification of this disease.[7]

SVM has been heavily researched for both feature selection and modelling. There are many variants of SVM used for classification; for example, classification can be done using SPECT perfusion imaging to separate healthy patients' images from those with AD. The approach is based on a linear programming formulation with a linear hyperplane which performs simultaneous feature selection and
classification. This method has a specificity of 90% and a sensitivity of 84%, and it has also proven to be better than Fisher Linear Discriminant (FLD) and statistical parametric mapping (SPM), and even better than human experts.[8] Using SVM to find atrophy patterns in AD for feature selection has also proven to be one of the best methods. This method classifies whether the patient has AD or not based on the anatomical MRI. Even though this approach provided good results on Cohort 1, the inter-cohort results were weaker, as the accuracy dropped to 74%. This showed that the selected regions of the considered refined atlas did not have good generalisation ability[9]. SVM was also used for binary classification using the LIBSVM toolbox under MATLAB (the scikit-learn library can also be used in Python to implement SVM via SVC, Support Vector Classification). SVM is less sensitive to the dimensionality of the problem and hence allows working with complex problems that involve a large number of variables. The Radial Basis Function kernel was chosen as it offers good asymptotic behaviour; however, the results in some conditions might be nontrivial[10].

Decision trees are also among the most popular methods. The ID3 decision tree, along with measures like entropy and information gain, has been used in this research. At each node in the decision tree, the attribute with the highest information gain is chosen as the splitting attribute[11].

PSO is one of the well-known methods for optimising feature extraction as well as classification. Optimisation algorithms such as GA (genetic algorithm) for feature selection, PSO (Particle Swarm Optimisation) for performance optimisation and ELM (Extreme Learning Machine) for classification, along with VBM (Voxel Based Morphometry) for feature extraction, can be used to identify the class of AD among the three classes. Training and testing accuracy were 94.57% and 87.23% respectively for the GA-ELM-PSO algorithm over 10 random trials[12]. PSO for feature reduction along with a decision tree classifier for classification achieved an accuracy of 91.24%, with a sensitivity of 91.24% and a specificity of 93.10%. In this method, feature reduction by PSO gave a reduced set of variables instead of the original data, and classification is finally done using the decision tree technique. Many parameters have to be found, which takes a lot of time and energy, and the process of noise reduction is difficult as the images were degraded.[13]

CNN has also been prominent as a prediction and classification model. Classification is done using two methods: the first is building the CNN architecture from scratch based on MRI scans with 2D and 3D convolutions, and the second is using transfer learning techniques such as a pre-trained VGG19 model. A standard CNN contains feature extraction, feature reduction and classification, so there is no need to do extraction manually. The weights in the initial layers act as feature extractors, and their values can be further improved by iterative learning[9]. CNN was also used as a feature extractor, after which decision tree, SVM, K-means clustering, and linear discriminant classifiers were applied. The classification accuracy was seen to improve when a pre-processing method was included before the CNN models[14].

A deep learning architecture with a SoftMax regression layer and stacked sparse auto-encoders was also used to develop an early diagnosis technique for Alzheimer's disease. The autoencoder learns the input representations. By selecting the highest predicted probabilities for each label, the SoftMax regression layer classifies instances. The accuracy, sensitivity and specificity of the model turned out to be 87.76%, 88.57%, and 87.22% for classification of AD vs NC, showing an increase in accuracy compared to conventional methods such as SVM[15]. Another approach implements feature extraction of brain voxels from grey matter and classifies them using a CNN. Voxels are enhanced with a Gaussian filter, and unnecessary tissues are deleted from the enhanced voxels. The CNN is then used for classification, and this method achieved an accuracy of 90.47%, a precision of 92.59%, and a recall of 86.66% in comparison to a system which uses physician decisions.[16]

III. PROPOSED METHODOLOGY

A. Architecture

As shown in Fig. 1, the MRI scans from the database are fed to a pipeline which consists of a series of pre-processing techniques. PSO performs feature selection on the pre-processed images. The resultant images are stored in a database and are used by PSO to obtain the optimal parameters of the Convolutional Neural Network. This produces an optimized architecture for the CNN. The CNN model is then trained, validated and tested.

B. CNN parameter optimisation using PSO

The training process is repetitive and continues until the stop criterion is met. The steps, with reference to Fig. 1, to optimize the CNN using PSO are:
1) Feed the pre-processed images as input to the CNN network. The images should be of the same size and characteristics. For example, they should be of the same dimensions, scale, colour gamma, etc.
2) Design of the PSO parameters, shown in Fig. 1 as well. The algorithm's particle population is generated. This involves setting the number of particles, the number of iterations, the inertia weight, the social constant, the cognitive constant, etc. Random values can be set, or values can be chosen according to some heuristic.
3) With the parameters obtained by the PSO, the parameters of the CNN are initialised (the parameters to be set are given in the table below). The CNN is now ready to be trained.
4) Training and validation of the CNN. The CNN reads, processes, validates and tests the input images. This step produces values for the objective functions, which are AIC and recognition rate. These values are returned to the PSO.
5) Calculate the objective function. The objective function is calculated by PSO to obtain the optimal values in the search space.
6) Update the PSO parameters. Both the position of the particles and the velocity that characterizes them are updated by considering each particle's own optimal position (Pbest) and the optimal position of the entire swarm in the search space (Gbest).
7) This process continues until the end criterion is met. The end criterion can be a number of iterations or a threshold value.
8) The optimal architecture is then determined. Here the Gbest particle represents the optimal architecture.

Fig. 1. Architecture
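Steps 1) through 8) describe a standard PSO loop. The snippet below is a minimal, self-contained sketch of that loop, with a toy two-dimensional objective standing in for the expensive train-and-evaluate cycle of steps 3 to 5 (in the actual method the objective combines AIC and recognition rate from CNN training). The bounds, swarm size and constants here are illustrative assumptions, not values from the paper.

```python
import random

# Toy stand-in for step 4: in the real pipeline this would decode the
# particle into a CNN configuration, train it, and return the combined
# AIC / recognition-rate score. Here: distance to an arbitrary optimum.
def objective(position):
    ideal = [4.0, 96.0]  # illustrative optimum
    return sum((p - t) ** 2 for p, t in zip(position, ideal))

def pso(n_particles=20, n_iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim, lo, hi = 2, 0.0, 128.0
    # Step 2: generate the particle population (random positions, zero velocity).
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                     # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]    # best of the whole swarm
    for _ in range(n_iters):                        # step 7: end criterion
        for i in range(n_particles):
            for d in range(dim):
                # Step 6: velocity update from inertia, Pbest and Gbest.
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])                 # steps 4-5: evaluate
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val                         # step 8: Gbest is the optimum

best, best_val = pso()
print(best, best_val)
```

Since Gbest only ever improves, the returned value is monotonically non-increasing over iterations, which matches the stopping options in step 7 (iteration budget or threshold on the objective).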
To elaborate further on how the algorithm works, an example is presented. The particle structure consists of 10 positions, as shown below. Each particle has these 10 positions, and each position is responsible for tuning one hyper-parameter. The hyper-parameters to be optimized are given in the table below.

TABLE I: Structure of Particle

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10

TABLE II: Example Particle Structure

Particle Coordinate   Hyper-Parameter             Search Space
X1                    Convolution layer no.       [1, 4]
X2                    Filter number (1st layer)   [32, 128]
X3                    Filter size (1st layer)     [1, 3]
X4                    Filter number (2nd layer)   [32, 128]
X5                    Filter size (2nd layer)     [1, 3]
X6                    Filter number (3rd layer)   [32, 128]
X7                    Filter size (3rd layer)     [1, 3]
X8                    Filter number (4th layer)   [32, 128]
X9                    Filter size (4th layer)     [1, 3]
X10                   Batch size                  [32, 128]
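As a concrete illustration of this encoding, the snippet below decodes a 10-coordinate particle into an architecture description. It follows the prose description of the mapping: X1 is the layer count, each subsequent pair is (filter count, filter-size code) with codes 1 to 4 mapping to 3x3 through 9x9 kernels, and X10 is the batch size. The function name and output format are illustrative, not from the paper.

```python
# Illustrative sketch: decoding a PSO particle into CNN hyper-parameters.
SIZE_CODE = {1: 3, 2: 5, 3: 7, 4: 9}  # filter-size code -> kernel width

def decode_particle(particle):
    """Turn a 10-coordinate particle into a CNN architecture description."""
    n_layers = particle[0]                    # X1: number of convolution layers
    layers = []
    for i in range(n_layers):
        n_filters = particle[1 + 2 * i]       # X2, X4, X6, X8: filter counts
        size_code = particle[2 + 2 * i]       # X3, X5, X7, X9: size codes
        layers.append((n_filters, SIZE_CODE[size_code]))
    return {"layers": layers, "batch_size": particle[9]}  # X10: batch size

# Decoding the example particle given later in Table III:
# 4 convolution layers, e.g. layer 1 has 100 filters of size 5x5.
arch = decode_particle([4, 100, 2, 64, 2, 64, 3, 96, 1, 32])
print(arch)
```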
The X1 coordinate controls the hyper-parameter for the number of convolution layers. If X1 = 4, it means that there will be 4 convolution layers. X2 and X3 control the hyper-parameters filter number and filter size respectively. If X2 = 32 and X3 = 2, it implies that there will be 32 filters of size 5x5 (1 is mapped to 3x3, 2 to 5x5, 3 corresponds to 7x7, and 4 implies 9x9). Similarly, X4 and X5 control the filter number and size for layer 2. The same goes for all the remaining coordinates. X10 represents the batch size for training.

TABLE III: Example particle generated by the algorithm

4 100 2 64 2 64 3 96 1 32

IV. IMPLEMENTATION DETAILS

A. Dataset

The data was obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI is a long-term study that uses indicators such as imaging, genetic, clinical, and biochemical markers to follow and detect Alzheimer's disease early. The ADNI data repository has imaging, clinical, and genetic data from around 2220 patients across four investigations (ADNI3, ADNI2, ADNI1 and ADNI GO). The image data (MRI scans) was used. ADNI provides researchers with this data as they work on the progression of Alzheimer's disease.

ADNI has a ground-breaking data-access policy, which makes data available to all scientists worldwide without restriction. We have acquired 2294 ADNI1 1.5T MRI scans, which are in the NIfTI format. The images are pre-classified into CN, MCI or AD. Each image is of shape 192 x 192 x 160.

B. Pre-processing

The pre-processing steps are:
(I) ADNI pipeline
(II) Registration
(III) Segmentation and Normalization
(IV) Skull Stripping
(V) Smoothing

(I) ADNI pipeline: The inputs to the pipeline are the MRI images in the dataset, which are obtained from MRI machines; the machines use magnetic waves and radio waves to produce the scans. Parameters such as the radio frequency, magnetic frequency and uniformity of the coil can cause variations in the MRI scans. To correct such variations in the images, the ADNI pipeline is used. The following is done on the MRI image as a part of this pipeline:
(i) Post-Acquisition Correction: Scanners with different acquisition parameters present considerable hurdles. Small changes in acquisition parameters for quantitative sequences have a significant impact on machine learning models, so rectifying these inconsistencies is critical.
(ii) B1 Intensity Variation: B1 errors are one of the problems in measuring MTR, which expands to magnetization transfer ratio, since the MTR value changes with changes in the magnetization transfer (MT) pulse amplitude. These errors can also be caused by nonuniformity in the radiofrequency and incorrect transmitter output settings when accounting for changing levels of RF coil loading. These mistakes need to be corrected in order to obtain images with no variations and no loss of crucial data.
(iii) Intensity Non-Uniformity: The quality of acquired data can be affected by intensity non-uniformity. The term "intensity non-uniformity" refers to anatomically unrelated intensity variance in the data. It can be caused by the radio-frequency coil used, the acquisition pulse sequence used, and the sample's composition and geometry. As a result, it is critical to correct this variation, and a variety of approaches have been offered to do so.
(II) Registration: The act of aligning the images to be analysed is called image registration, and it is a critical phase in which data from several images must be integrated. The images can be taken at various times, from various perspectives, and with various sensors. Registration in medical imaging allows data from multiple modalities, such as CT, MR, SPECT, or PET, to be merged to get a full picture of the patient. In our case, since the MRI scans are taken from different angles, it is the process of geometrically aligning all the images for further analysis. It can be used to create correspondence between features in a set of images and then infer correspondence away from those features using a transformation model.
(III) Segmentation and Normalization: The division of brain tissue, which can be split into tissue sections such as cerebrospinal fluid (CSF), which cushions the brain, grey matter (GM), where the actual processing is done, and white matter (WM), which provides communication between different GM areas, is the major focus of the brain magnetic resonance
imaging (MRI) image segmentation approach. Various significant brain regions that could be useful in identifying Alzheimer's disease are found and kept during image segmentation. Normalization is the process of shifting and scaling an image so that the pixels have zero mean and unit variance. By removing scale invariance, the model can converge faster.
(IV) Skull Stripping: Skull stripping is a process wherein the skull and the non-brain region of the image are removed and only the brain portion of the image is retained, as we deal with only this region for the analysis of Alzheimer's disease. Skull stripping is one of the first steps in the process of diagnosing brain disorders. In a brain MRI scan, it is the method for differentiating brain tissue from non-brain tissue. Even for experienced radiologists, separating the brain from the skull is a time-consuming process, with results that vary widely from person to person. This is a pipeline that only needs a raw MRI picture as input and should produce a segmented image of the brain after the necessary pre-processing.
(V) Smoothing: Smoothing involves removing redundant information and noise from the images. It helps in the easy identification of trends and patterns in the images. When an image is produced by an MRI machine, it contains different kinds of noise, which need to be removed in order to obtain a clean image without loss of any crucial information.

REFERENCES
[1] https://jamanetwork.com/journals/jamaneurology/article-abstract/584730
[2] https://molecularneurodegeneration.biomedcentral.com/articles/10.1186/s13024-019-0333-5
[3] https://www.alz.org/alzheimers-dementia/diagnosis/why-get-checked#:~:text=An%20early%20Alzheimer's%20diagnosis%20provides,and%20may%20provide%20medical%20benefits
[4] https://www.nia.nih.gov/health/how-alzheimers-disease-diagnosed#:~:text=Perform%20brain%20scans%2C%20such%20as,other%20possible%20causes%20for%20symptoms
[5] https://www.frontiersin.org/articles/10.3389/fpubh.2022.853294/full
[6] https://www.nature.com/articles/s41746-020-0256-0
[7] https://scihub.se/https://doi.org/10.1109/ICOEI48184.2020.9142975
[8] https://sci-hub.hkvisa.net/10.1007/s10115-006-0043-5
[9] https://link.springer.com/chapter/10.1007/978-3-540-79982-5_14#citeas
[10] https://pubmed.ncbi.nlm.nih.gov/12816571/
[11]
[12] https://ieeexplore.ieee.org/document/6583856
[13] https://ieeexplore.ieee.org/document/7019310
[14] https://arxiv.org/abs/2101.02876
[15] https://ieeexplore.ieee.org/document/6868045
[16] https://onlinelibrary.wiley.com/doi/abs/10.1002/ima.22553
[17]
