
Copyright 2023, Sunrise International Education Inc.

Publisher
Primedia E-launch LLC
PO Box 2727, Orlando, Florida, 32802, United States

ISBN 979-8-89238-890-0

The Horizon Academic Research Program


A Project of Sunrise International Education Inc.
Email: Contact@HorizonInspires.com
Website: www.horizoninspires.com

The Horizon Academic Research Journal is published by Sunrise
International Education Inc. Volume 4, No. 1 was released in 2023.

Copyright ©2023, by Sunrise International Education Inc., 641 S St NW, Ste. 300,
Washington, DC 20001. All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without the prior written permission of
the publisher.
TABLE OF CONTENTS

Image Classification of Stars and Galaxies Using Different Machine Learning Models
    Emma Leifer

Life and Death: The Effect of Biases and Heuristics on Medical Decision Making
    Rajeev Krishnamurthy

Mesenchymal Stem Cell-Derived Exosomes and Their Therapeutic Potential on Parkinson’s Disease
    Dorsa Arbabha

Life Challenges Faced By Chinese Workers In Africa
    Hengyi Chen

Unlocking Quercetin’s Therapeutic Potential: The Use of Innovative Drug Delivery Strategies to Remedy Neurodegenerative Disorders
    Andrew Lee

Art Therapy’s Effectiveness and its Role in Treating Neurological Conditions
    Simryn Patel

Detecting Causality by Using Alexander Quandles and Alexander-Conway Polynomial
    Nikhila Pasam

Community Detection in Dynamic Face-to-Face Interaction Networks: A Louvain Algorithm Approach
    Iroda Ibrohimova

Self-Supervised Dementia Prediction From MRI Scans With Metadata Integration
    Zile Huang

Sexual Dimorphic Nature of the Amygdala and its Contribution to Females’ Susceptibility to Depression
    Nardos Shewadeg Gebresenbet

Black Sea Grain Initiative: A Game-Theoretic Analysis
    Paari Dhanasekaran

Automated Pneumonia Detection From Chest X-ray Images Using Machine Learning
    Lurvïsh Polodoo

The Effects of Classical Music Intervention on the Neuropsychiatric and Cognitive Mechanisms of Alzheimer’s Disease Patients
    Nyneishia Janarthanan

Comparing the Effectiveness of Support Vector Classifier and Stochastic Gradient Descent in Hate-Speech Detection
    Dania Ali

The WHO Is Not The Global Health Government States Think It Is
    Dilay Kuyucak

Targetting the EGFR Pathway in Glioblastoma Multiforme: A Review of Current Pre-clinical and Clinical Trials with Tyrosine Kinase Inhibitors
    Ananya Bharathapudi

Detecting Distributed Denial Of Service Attacks (DDoS) Using Machine Learning Models
    Isha Singhal

Can Behavioural Economics Help Explain Gender Disparities in Labour Markets?
    Jumaina Fatima

Using Data-Efficient Image Transformers for Diabetic Retinopathy Severity Classification
    Veda Fernandes

Using Behavioral Economics Insights to Determine the Likely Causes of the High Rate of Unemployment in Refugee Camps and What Can Be Done to Alleviate It
    Baraka Muhoza

Are Champions Born Or Made?
    Yashvendra Singh


Foreword
Our world in 2023 is in many ways only beginning to reconnect, unlike only
a few years ago, when countries erected barriers to travel, trade, and large gatherings of
people. Yet in other ways, the world became more interconnected than ever during
this period of supposed isolation, as many thinkers both young and old took to
online tools and communities as they overcame barriers to professional growth and
intellectual exploration. Even in our social distance, we came closer together. This
new interconnectedness enables us to continually unlock the boundless potential of
human ingenuity through the power of collaboration and mentorship. It is in this spirit
that The Horizon Academic Research Program connects talented high school students
from all over the world with scholars and university faculty to conduct meaningful
academic research and rigorous intellectual inquiry. The Horizon Academic Research
Journal showcases a sampling of the high-quality work produced by high school
students in The Horizon Academic Research Program.
This journal issue represents only a fraction of the diversity and excellence
of our students from the past year. Students from over 25 countries, representing
languages and traditions from Vietnam, Qatar, Gambia, Austria, India, Turkey,
Canada, the United States, China, South Korea, and more, worked with researchers
from among the world’s foremost universities to produce scholarship on topics
ranging from Bioinformatics to International Relations. By conducting the program
online, we were able to cross physical boundaries that would have otherwise made
such work impossible, and connected curious minds that likewise never could have
found one another.
So far in 2023, nearly 5000 students have applied to join the Horizon Academic
Research Program. Of these applications, only 21 students’ research interests are
represented in this edition of our journal, a figure representing less than half of 1% of
our total applications. Some students produced original research, while others filled
gaps in the current literature using knowledge gained over the course of their time at
Horizon Academic.
Despite being written by high school students, these papers display a level
of thoughtfulness, analytical rigor, and ambition more typical of undergraduates.
Some papers are approachable introductions to fields that are difficult for the average
person to access, while others cause us to question our understanding of ideas with
which we may think we are familiar.
We are pleased to make this volume available to the public and hope that this
provides a window, not only into the skill and ability of each author, but also into the
abilities of high school students to do meaningful research without the use of physical
lab spaces. It is our pleasure and honor to share this work with you.

The Horizon Academic Research Program


Image Classification of Stars and Galaxies Using
Different Machine Learning Models
Emma Leifer∗†
October 25, 2023

Abstract
The correct classification of astronomical objects, such as stars and
galaxies, is essential to the field of astronomy. Today, however, with
the advent of powerful next-generation telescopes, the quantity of images
being collected far exceeds the amount that can be catalogued by
astronomers through their own observations and analyses alone. As just
one example, the Large Synoptic Survey Telescope (“LSST”), opening in
August 2024, will catalogue around 40 billion images of stars and galaxies.
Certain machine learning models have proven efficient and accurate in
classifying astronomical images. In this paper, we tested the ability of four
different machine learning models to classify images of stars and galaxies
accurately without inputs of additional measurements of the brightness,
size, or shape of stars or galaxies: a convolutional neural network (“CNN”)
model, a logistic regression model, a random forest classifier, and a small
neural network. We discuss and compare the architecture and performance
of each model. We found that our neural network model, trained
on a data set preprocessed using Principal Component Analysis (“PCA”),
performed the best, achieving an accuracy of 84 percent out of sample.
We thus demonstrate that such machine learning models can be an effective
way to classify images of stars and galaxies, substantially reducing the
time required to catalogue them.

∗ Advised by: Guillermo Goldsztein of the Georgia Institute of Technology
† Trinity School, New York, New York

1 Introduction
Sir William Herschel, the famous astronomer and composer, wrote in 1789,
“The method I have taken of analyzing the heavens, if I may so express myself,
is perhaps the only one by which we can arrive at a knowledge of their
construction” [Her89]. Herschel catalogued the objects he observed in the night
sky, providing “a short description of each nebula or cluster of stars, as well
as its situation with respect to some known object” in the hope that he would
“engage the attention of Astronomers,” and “induce them to undertake the
necessary observations” [Her89]. Herschel’s work led to the understanding that
the stars he observed make up the Milky Way [Cea20a]. Today, the correct
classification of stars, galaxies, and quasars remains essential to the field of
astronomy. The advent of powerful telescopes, however, has provided astronomers
with vast amounts of data on enormous numbers of astronomical objects, orders
of magnitude more than they can catalogue through their own observations and
analyses. As just one example, the Large Synoptic Survey Telescope (“LSST”),
opening in August 2024, will catalogue around 40 billion images of stars and
galaxies [Cea20a]. Moreover, classification of images of stars and galaxies based
on morphology alone – according to which “unresolved” point sources are
classified as stars, and “resolved” sources for which a shape can be determined are
classified as galaxies – has led to inaccurate classifications [Cea20a]. Using this
approach, images of quasars (galaxies with an extremely luminous central region),
particularly very distant ones, are sometimes mistaken for stars [Cea20a].
The point source/resolved image classification is also no longer sufficient given
that the newest telescopes capture more and more unresolved images of very
faint distant galaxies [Fea12].
To reduce misclassification of stars and galaxies in scenarios where images are
morphologically similar, differences in their spectral energy distributions (SEDs)
can be utilized. In particular, stellar emission spectra usually peak around a
particular frequency, while the spectra emitted by galaxies are more evenly
distributed [Facteb]. Methods based on SEDs have been effective but have
certain limitations, including that they do not incorporate information about
the data set, such as the expected relative numbers of stars and galaxies [Fea12].
The use of statistical models to distinguish stars from galaxies has emerged as a
very useful supplement to methodologies based on image morphology and SEDs.
Machine learning techniques can be used to construct statistical models based
on an existing dataset with known classifications [HR20, Imp23, Kim18, KB16,
Oea92]. Our study focuses on a subclass of machine learning techniques
known as “supervised machine learning,” which we use to classify stars and
galaxies. We show that it is an efficient and accurate method for analyzing the
reams of data generated by telescopes such as LSST to classify stars, galaxies,
and quasars [Iea19, Kim18, Sea15, Sowte].
In this paper, we describe several different machine learning algorithms that
can be used to classify images of stars and galaxies without prior feature
extraction. In other words, our models are trained using 3986 pixelated images
of stars and galaxies, each 64 x 64 pixels, without inputs of additional
measurements of the brightness, size, or shape of stars or galaxies. We built
four different models to use
as classifiers: a neural network model, a convolutional neural network (“CNN”)
model, a logistic regression model, and a random forest classifier. In our paper,
we discuss and compare the architecture and performance of each model. We
found that our neural network model trained on a data set preprocessed using
a technique known as Principal Component Analysis (“PCA”) performed the
best, achieving an accuracy of 84 percent out of sample.

1.1 Stars and Galaxies
A star is a massive ball of hot gas [oAfEtec, Geateb]. Typically, stars consist
primarily of hydrogen and helium gas, with small amounts of other elements
– many stars are about 73 percent hydrogen gas, 25 percent helium gas, and
2 percent other elements [Fac23]. Individual stars have their own unique life
cycles, which can range from a few million years to billions of years [Geateb,
ALMte]. Stars are categorized and differentiated based on factors such as their
mass, temperature, and brightness. The ages and compositions of stars in a
galaxy can provide information about the history, dynamics, age, and evolution
of the galaxy [Chi01]. A galaxy is a massive cluster of gas, dust, and billions of
stars with their respective solar systems, all bound together by gravity (a “solar”
or “star” system is a group of astronomical objects like planets or meteors that
orbit a star) [ED23, oAfEteb, Geatea].
The problem of classifying stars and galaxies is complicated by the fact
that images of certain galaxies, those with particularly bright regions at their
centers, resemble stars. The extremely bright centers of these galaxies, with a
brightness that cannot be accounted for by the presence of stars alone, are called
“active galactic nuclei” or “AGNs” [Hubte, Pogte, Tel21]. The brightest AGNs
are known as “quasars,” which have a massive black hole at their centers [KS98,
Pogte, ESAte, VBea01]. Gas and dust falling into the central black hole emit
electromagnetic radiation as they are subjected to the extreme gravitational pull
of the black hole [ESAte]. Quasars, which can be thousands of times brighter
than the entire Milky Way galaxy, are some of the brightest objects in the
universe [ESAte]. As a result, distant quasars can be mistaken for stars, even
using modern machine learning techniques [oVSOte,Cea20a,Sch63]. The images
below show quasars misidentified as stars in a recent machine learning study,
when the algorithm was applied to previously catalogued images in the test data
set [Cea20b].

Figure 1: Images of quasars from a recent machine learning study that were
misclassified as stars [Dea01].

1.2 Machine Learning


In this paper, we will explain machine learning models that we built to classify
images of stars and galaxies. Machine learning models can extract information
from a data set automatically, with less preprocessing of the data by the user.
In particular, machine learning models can carry out a subset of preprocessing
known as “feature extraction” or “feature generation.” For example, an image
might be pre-processed by running it through a set of filters to highlight the
boundaries or edges, before the image is fed into the model. For a machine
learning model, this preprocessing would not be necessary.
It is helpful to define several terms in introducing our machine learning
model. A data point, also known as an “example,” is a collection of input values,
known as “features,” and an output value, known as the “target variable” or
as a “label” in a classification problem, similar to an (x,y) coordinate, where
the “x” can have a number of values, i.e., (features (x), target variable (y)).
In our model, the input value is an image of a star or galaxy, the features are
the pixels in that image, and the target variable or label is the classification of
the image as a star or galaxy. The data points or “examples” in our model are
the classified images. “Parameters” are numerical weights used in the model to
combine the features (input values) to produce the output value. Parameters encode
the relevance of certain characteristics of the image, such as, in our example,
the brightness in a particular location or the strength of the boundary in a
particular direction in each pixel in the image.
“Hyperparameters” are settings of the model selected in advance by the user
to determine the mathematical relationships among the inputs into the model,
and the definition of the error that should be minimized by the model. In
essence, the user determines in advance the definition of the optimal outcome
being sought. The selection of hyperparameters can also determine the number
of parameters used by the model. In general, models with fewer parameters are
preferred to those with more parameters because models with more parameters
are more likely to be “overfit” to the training data, i.e., they will not generalize
well to unseen data. Models with too many parameters will find patterns in
data sets that are not actually relevant to predicting the target variable.
Initially, the data are divided into three parts – the “training set,” the
“validation set,” and the “test set.” The “training set” is the subset of data used
to fit the model – meaning to determine the optimal parameters or weights to
predict the correct target variable, i.e., label for the image. Once we have a
trained model, possibly a set of trained models with different hyperparameters,
we need to determine the best model to use and assess its performance on
unseen data. To select the best model, we run the “validation set” through each
of our trained models. From this process, a set of error metrics or scores is
obtained that is used to select the model that performed the best according to
those scores. Finally, the performance of the model must be assessed on
completely unseen data – the “test set.” The performance on the test set should
be representative of the model’s true performance going forwards. A test set
is necessary in light of the fact that the error metrics on the training data are
biased because the model was fit to the training data, and the validation errors
are biased because we selected the best performing model on the validation set.
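
The selection procedure described above can be sketched in a few lines. This is a toy illustration only, not the code used in this study: the two "trained models" are simple threshold rules on a single made-up feature, and the names `accuracy`, `select_by_validation`, and `candidates` are hypothetical.

```python
import numpy as np

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(predictions == labels))

def select_by_validation(candidates, x_val, y_val):
    """Score every candidate on the validation set and return the best one."""
    scores = {name: accuracy(predict(x_val), y_val)
              for name, predict in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy stand-ins for trained models with different hyperparameters
candidates = {
    "threshold_0.3": lambda x: (x > 0.3).astype(int),
    "threshold_0.5": lambda x: (x > 0.5).astype(int),
}

# Made-up validation and test data (one feature per example)
x_val, y_val = np.array([0.1, 0.4, 0.6, 0.9]), np.array([0, 0, 1, 1])
x_test, y_test = np.array([0.2, 0.45, 0.7]), np.array([0, 0, 1])

# Choose on the validation set, then report performance once on the test set
best_name, val_scores = select_by_validation(candidates, x_val, y_val)
test_accuracy = accuracy(candidates[best_name](x_test), y_test)
```

The test set is touched only after the choice has been made, which is what keeps the final score from inheriting the selection bias of the validation scores.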
Machine learning can be “supervised,” “unsupervised,” or “semi-supervised,”
depending upon how much information about the data set is provided to the
model. For example, in a “supervised” machine learning model, the user
provides the model with a training set that includes both the features (e.g., images
of stars and galaxies) and the target variable (classification of the images as stars
or galaxies). In an “unsupervised” model, the user provides the model with a
training set that includes only features (no target variable) and asks the model
to differentiate the objects in the data set based on the most relevant features.
In a “semi-supervised” machine learning classification model, the training set is
composed mostly of unlabeled data with a small amount of labeled data.

2 Related Works
Our research is related to several prior studies applying machine learning to
star-galaxy classification. In 1992, S. C. Odewahn et al. were the first to use
neural networks – a machine learning method – to classify astronomical images
as stars or galaxies [Oea92]. They achieved successful classification rates that
varied with the size of the images. Our research is also related to a 2016 study by
Edward J. Kim and Robert J. Brunner on star-galaxy image classification using
CNNs [KB16]. Kim and Brunner used CNNs to classify 48 x 48 pixelated images
of stars and galaxies captured by the Canada-France-Hawaii Telescope Lensing
Survey. Kim and Brunner demonstrated that CNN models can accurately
classify images of stars and galaxies without extra data extracted from each image
by experts. Kim and Brunner also compared the performance of CNN models
to that of Trees for Probabilistic Classifications (“TPC” models). To minimize
overfitting (discussed in Section 4), Kim and Brunner used data augmentation
and regularization techniques. To increase the amount of data in their training
set (data augmentation), they created rotations, reflections, translations, and
blurrings of original images. They used a regularization technique known as
“dropout.”
Another related study was conducted by Edward J. Kim in 2018 at the
University of Illinois [Kim18]. Kim lays out multiple machine learning techniques
that can be used to classify images of stars and galaxies. Kim discusses the
use of a Bayesian combination technique to improve the performance of any
single classification method. This study demonstrates further that CNNs
perform accurately as classification models. Finally, this study shows that a
semi-supervised machine learning classification model performs well when a relatively
small amount of labeled data is available. More recently, in 2020, Ryan Hausen
and Brant E. Robertson developed Morpheus, a machine learning model that
simultaneously detects and classifies objects in astronomical images [HR20].

3 Images, Dataset, and Preprocessing


3.1 Devasthal Fast Optical Telescope (DFOT)
The images in our dataset were captured by the 1.3-m Devasthal Fast Optical
Telescope (DFOT) in the observatory of the Aryabhatta Research Institute of
Observational Sciences (ARIES) in Nainital, India [ARItea,ARIteb,Sea11]. The
datasets used in this paper were derived from a project at ARIES [Agr21].
ARIES conducts research in astronomy and astrophysics and uses optical
telescopes on site to gather empirical data. At ARIES, research on star clusters,
stellar variability, and the nuclei of active galaxies and related phenomena is
carried out using the DFOT observational data.
The DFOT is located on a mountain peak in Nainital, India, known as
Devasthal (“Abode of God”). Devasthal offers particularly good
astronomical viewing conditions as a result of its high altitude (approximately 2450
meters) and distance from urban areas, which helps to minimize light
pollution. Astronomical “seeing” is determined by the amount of disruption to light
caused by fluctuations in the refractive index of the earth’s atmosphere. The
location of DFOT atop Devasthal allows for sub-arcsec seeing – the ability
of the DFOT to resolve details smaller than 1 arc-second (1/3600 of a degree).
Similarly, the high altitude of the site reduces “extinction,” the scattering or
absorption of light by particles between the source and the telescope, allowing
more light to reach the DFOT [oAfEtea].
The DFOT, which was installed in 2010, has two charge-coupled device
(“CCD”) cameras that capture images of the sky. A CCD is a very
small microchip that contains a grid of elements that sense light (each of which
is called a “pixel”) [Obste, Dea01]. One of the DFOT’s CCD cameras captures
images of 2048 x 2048 pixels, and the other captures images of 512 x 512 pixels.
When the light collected by the telescope is focused onto the CCD, each pixel
is assigned a number that corresponds to a shade of grey (color images are
obtained when several pictures are taken – through a red, green, and blue filter
– and are then overlaid). Figures 2 and 3 below show images from our dataset
of galaxies and stars, respectively, taken by the DFOT:

Figure 2: Images of galaxies from our dataset taken by the DFOT [Agr21].

Figure 3: Images of stars from our dataset taken by the DFOT [Agr21].

3.2 Dataset
We obtained our dataset on Kaggle, where it is publicly available – Kaggle
is an online platform through which users can find and publish datasets on a
wide range of topics. The images in our data set were taken by the DFOT’s
CCD camera with a 2048 x 2048 pixel grid. The original images – 2048 x 2048
pixels in size – were reduced to 64 x 64 cutouts, where each cutout showed a
single astronomical object. Image segmentation was then performed to isolate
the astronomical object (star or galaxy) in each image. Image segmentation
splits up an image into different regions based on characteristics of the image in
those regions. Image segmentation helps pinpoint objects or boundaries in the
image, making analysis more efficient; it also makes the process of finding the
coordinates of each isolated object easier [Labte].
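
To make the cutout step concrete, the sketch below extracts a 64 x 64 window from a 2048 x 2048 frame around a given pixel position. It is a minimal illustration under stated assumptions: the text does not say how the cutouts were positioned, so centering on the object and shifting the window to stay inside the frame are our own choices, and `extract_cutout` is a hypothetical helper rather than part of the dataset's pipeline.

```python
import numpy as np

def extract_cutout(frame, row, col, size=64):
    """Return a size x size window of `frame` centered near (row, col).

    The window is shifted where necessary so that it stays inside the frame
    (an assumption made for this sketch).
    """
    half = size // 2
    top = min(max(row - half, 0), frame.shape[0] - size)
    left = min(max(col - half, 0), frame.shape[1] - size)
    return frame[top:top + size, left:left + size]

# Blank stand-in for a CCD frame with a single bright "object"
frame = np.zeros((2048, 2048), dtype=np.uint8)
frame[1000, 1500] = 255

cutout = extract_cutout(frame, 1000, 1500)      # object away from the edges
edge_cutout = extract_cutout(frame, 5, 2040)    # object near a corner
```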
After image segmentation, the coordinates of each isolated object were then
identified and inputted into the Sloan Digital Sky Survey (SDSS) to find the
corresponding label of star or galaxy for each image. The SDSS is a publicly
available database that contains three-dimensional maps of the universe, with
catalogues that contain both photometric and spectroscopic data [Surtea,
Surteb, Surtec]. Photometric data are obtained when light passes through multiple
colored filters [Factea]. Spectroscopic data are obtained when light comes in
contact with a “dispersive element” (e.g., a prism) [Sowte]. Researchers can use
the SDSS database to identify astronomical objects by inputting the coordinates
of their objects.
Cutout images in our dataset were separated into two directories – a “star”
directory and a “galaxy” directory – based on SDSS classification after their
coordinates were inputted into the SDSS. Our dataset was composed only of
the 64 x 64 pixel cutout images.

3.3 One Hot Encoding


At this stage, we also applied a process called “one-hot encoding” to the labels
of our dataset. One-hot encoding converts categorical labels in a classification
problem into a numerical format that can be inputted into a model. Strictly
speaking, one-hot encoding represents each category as its own 0/1 indicator;
with only two categories, this reduces to a single binary label. In our
classification problem, we assigned 0’s to all images that belonged to the
“star” category and 1’s to all images that belonged to the “galaxy” category.
Figure 4 below shows the for loop that does this assignment.
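
As a rough illustration of this label assignment (and not a reproduction of the loop in Figure 4), the sketch below maps class names to the integers 0 and 1. It also shows the indicator-vector form that the term "one-hot" usually denotes. The names `CLASS_TO_LABEL`, `encode_labels`, and `one_hot` are hypothetical.

```python
import numpy as np

# Mapping used in the text: 0 for stars, 1 for galaxies
CLASS_TO_LABEL = {"star": 0, "galaxy": 1}

def encode_labels(class_names):
    """Convert a sequence of class names into an integer label array."""
    labels = []
    for name in class_names:
        labels.append(CLASS_TO_LABEL[name])
    return np.array(labels, dtype=np.int64)

def one_hot(labels, num_classes=2):
    """Expand integer labels into one 0/1 indicator column per class."""
    return np.eye(num_classes, dtype=np.int64)[labels]

example = encode_labels(["star", "galaxy", "star"])
indicators = one_hot(example)
```

For a two-class problem the second indicator column carries the same information as the single 0/1 label, which is why the binary label alone is sufficient here.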

Figure 4: The for loop that assigned 0’s to “star” images and 1’s to “galaxy”
images.

3.4 Data Summary Statistics

We split our data set randomly into three subsets: a training set (60 percent), a
validation set (20 percent), and a test set (20 percent). A 60-20-20 split is a
common choice for machine learning models. Our full data set is composed
of 3986 images: 3044 images of stars, and 942 images of galaxies. After splitting
our data into the three subsets, the distribution of images was the following (as
shown in Table 1 below): our training set had 2391 images (1808 stars and 583
galaxies), our validation set had 797 images (607 stars and 190 galaxies), and
our test set had 798 images (629 stars and 169 galaxies):

Table 1: Division of full data set into training set, validation set, and test set.
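
A random 60-20-20 split of this kind can be sketched as follows. This is a minimal example under our own assumptions (rounding the training and validation sizes down and giving the remainder to the test set, with a fixed seed); `split_indices` is a hypothetical helper, not the code used in the study. With this rounding the subset sizes for 3986 images come out to 2391, 797, and 798, as reported above, although the per-class counts in each subset depend on the random shuffle.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the indices 0..n_samples-1 and cut them into three disjoint parts."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_train = int(train_frac * n_samples)   # rounded down
    n_val = int(val_frac * n_samples)       # rounded down
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains
    return train, val, test

train_idx, val_idx, test_idx = split_indices(3986)
```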

3.5 Balancing the Data Set


As set out above in Section 3.4, we found that our training set consisted of
1808 images of stars and only 583 images of galaxies (this ratio of about 3:1
was expected, given that the make-up of our original dataset was also about
3:1 stars to galaxies). To prevent bias in our model towards images of stars,
we decided to “balance” the images in the training set so that the model would
equally learn the properties of stars and galaxies.
Balancing a data set is the process of ensuring that each class is represented
equally in the training data. Techniques for balancing
include duplicating images in the training data (over-sampling) until the training
set has equal numbers of each class, and assigning greater weight to the error
of less represented classes.
In our study, we balanced the training data by over-sampling images of
galaxies (selected at random from the training set) until the training set included
the same number of stars and galaxies (1808 of each). Table 2 below shows the
numbers of stars and galaxies in the balanced training set.

Table 2: Division of the full data set, showing numbers of stars and galaxies in
the balanced training set.
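
The over-sampling step can be sketched as below. This is an illustration under assumptions, not the study's code: `oversample_minority` is a hypothetical helper for the two-class case, it draws the duplicates with replacement, and the image array is a blank placeholder that only reproduces the class counts of the training set.

```python
import numpy as np

def oversample_minority(images, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until both classes match in size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    minority = classes[np.argmin(counts)]
    deficit = counts.max() - counts.min()
    minority_idx = np.flatnonzero(labels == minority)
    # Sampling with replacement, since the deficit can exceed the minority count
    extra = rng.choice(minority_idx, size=deficit, replace=True)
    keep = np.concatenate([np.arange(len(labels)), extra])
    rng.shuffle(keep)
    return images[keep], labels[keep]

# Placeholder data with the training-set class counts: 1808 stars (0), 583 galaxies (1)
labels = np.array([0] * 1808 + [1] * 583)
images = np.zeros((len(labels), 64, 64), dtype=np.uint8)

balanced_images, balanced_labels = oversample_minority(images, labels)
```

Only the training set is balanced in this way; the validation and test sets keep their original class proportions so that their scores reflect the data the model would actually encounter.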

3.6 Scaling the Data


The last step we took before training our model was scaling the data.
Scaling (“normalizing”) the data reduces numerical values to a small common
range and makes computation faster and easier. Because pixels contain values
only between 0 and 255 (each
represents a shade of gray at a point in an image), we normalized our data by
dividing each value by 255 to obtain values between 0 and 1 (see Figure 5 below
for the code):

Figure 5: Code for normalizing the data.
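
A minimal sketch of this scaling step, assuming the images are held in a NumPy array of 8-bit integers (the helper name `scale_pixels` is ours, and this is not necessarily the code shown in Figure 5):

```python
import numpy as np

def scale_pixels(images):
    """Map 8-bit pixel values from the range [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0

raw = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = scale_pixels(raw)
```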

4 Convolutional Neural Networks


4.1 Neural Networks
Neural network models make up a subset of machine learning and are the basis
for a branch of machine learning known as “Deep Learning.” Neural networks
were designed to simulate the human brain and the way the human brain learns
and processes information. Although there are many different types of neural
networks, all neural networks have a similar overall structure. In Sections 4.2
and 7, we discuss in detail two types of neural networks that we used – a
convolutional neural network (“CNN”), and a small neural network.
A neural network is a model that not only learns how to transform the
input data into an output prediction, but also learns how to apply successive
transformations to the input data to learn new relationships among the input
features that were not pre-specified. Each successive transformation of a
feature, whether that feature is a raw input feature, or a transformed version of
that feature, is known as a layer. Having these multiple layers allows the neural
network to find higher order patterns and structures in the data, because the
neural network can recombine transformations it has done in prior layers into
more complicated features. For example, a neural network could start by
identifying edges, and then slowly determine that a particular combination of edges
represents a star or a galaxy.
Within each layer, each input feature, and successive transformations of that
feature, is referred to as a “node” in a neural network – “nodes” can be thought
of as corresponding to neurons in the brain. The layer with the raw features,
the first layer, is known as the “input” layer. The intermediate layers, which are
the transformations of the raw features, are known as the “hidden layers.” The
final layer, which contains the predictions, is known as the “output layer.” The
output layer in a classification problem will have nodes that correspond to the
probability of each category. “Deep learning” algorithms are neural networks
that contain multiple hidden layers.
The number and types of layers used, as well as the number of nodes in
each layer, are all considered “hyperparameters,” and are determined by the
programmer before the model is trained. Once the hyperparameters have been
set and the model has been fit, each node represents a combination of the
features that went into it. Layers other than the input layer use an “activation
function” to compare the input data to a threshold value – the input layer does
not require an activation function simply because all input data must be passed
on to successive layers for a model to develop. If the data a node receives
are below a threshold value, the node will not pass the data on to the next
layer – essentially the neuron will not fire. Figure 6 below shows a physical
representation of a deep learning neural network with its different layers.

Figure 6: Physical representation of a deep learning neural network, showing
the different layers [IBMte].
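
The layer structure described in this section can be illustrated with a forward pass through a very small network. The sketch below is not one of the models trained in this study: the weights are random and untrained, the choice of 16 hidden nodes is arbitrary, and the use of a ReLU activation in the hidden layer and a sigmoid in the output layer is an assumption made for the example.

```python
import numpy as np

def relu(x):
    """Pass positive values through and replace the rest with zero."""
    return np.maximum(x, 0.0)

def sigmoid(x):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> one hidden layer (ReLU) -> output layer (sigmoid probability)."""
    hidden = relu(x @ w_hidden + b_hidden)     # each column is one hidden node
    return sigmoid(hidden @ w_out + b_out)     # one output node per image

rng = np.random.default_rng(0)
n_features, n_hidden = 64 * 64, 16             # 16 hidden nodes: arbitrary choice

x = rng.random((5, n_features))                # 5 fake flattened images in [0, 1)
w_hidden = rng.normal(scale=0.01, size=(n_features, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(scale=0.01, size=(n_hidden, 1))
b_out = np.zeros(1)

probs = forward(x, w_hidden, b_hidden, w_out, b_out)
```

Here the weight matrices and bias vectors are the parameters that training would adjust, while the number of hidden nodes and the choice of activation functions are hyperparameters fixed beforehand.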

4.2 Convolutional Neural Network (CNN)
For one of our models, we used a type of deep learning neural network known
as a convolutional neural network (CNN). CNNs, which are designed to process
pixel data from images, contain a specific architecture that is used for image
processing and recognition. CNNs contain a minimum of two hidden layers – a
convolutional layer and a pooling layer (discussed below in Sections 4.3 and 4.4)
– that are used to process and extract patterns in image datasets. In addition to
the convolutional and pooling layers, it is very common for CNNs also to have
one, if not several, “dense” fully connected layers (discussed below in Section
4.5).
In our study, we tested the performance of CNNs with and without a dense
layer, and also the performance of CNNs with different numbers of nodes in their
dense layer. To build our CNN model, we drew on existing libraries of code
publicly available online (shown in Figure 7 below along with other libraries
used in this project).

Figure 7: Code from publicly available libraries.

4.3 Convolutional Layer


A convolutional layer processes subsets of contiguous pixels throughout the
image. By combining processed versions of the image, the convolutional layer
eventually learns to recognize patterns in the image. The reason that a
convolutional layer is particularly useful in image classification is that the
network is structured such that it considers only the relationships among pixels
that are near one another – relating pixels at opposite ends of an image is
unlikely to reveal useful structure. The size of the contiguous region considered
is known as
the “kernel size.” In a CNN, “convolution” means applying a filter or a kernel
to an input or feature map [Bae23]. Figure 8 below shows a “convolution” being
computed over the entire image.
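As a minimal sketch of what applying a kernel to an image means, the following plain-Python function slides a small filter across a toy image with no padding and a stride of 1. The function name, the toy image, and the two edge filters are illustrative assumptions and are not taken from the code used in this study. Like most deep learning libraries, the sketch does not flip the kernel before applying it.

```python
def convolve2d(image, kernel):
    """Slide a k x k kernel over the image (no padding, stride 1)."""
    k = len(kernel)  # k is the "kernel size"
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Weighted sum of the k x k patch whose top-left corner is (r, c).
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4 x 4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2 x 2 filters that respond to vertical and horizontal edges.
vertical_edge = [[-1, 1], [-1, 1]]
horizontal_edge = [[-1, -1], [1, 1]]

vertical_map = convolve2d(image, vertical_edge)
horizontal_map = convolve2d(image, horizontal_edge)

print(vertical_map)    # responds only where dark meets bright
print(horizontal_map)  # all zeros: this image has no horizontal edge
```

Each filter produces its own feature map, which is why the same image can yield a strong response for one filter and none for another.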

Figure 8: A convolution being computed over an entire image [Bae23].

To analyze a single image, different convolutions can be done with different
filters to gather specific data. For instance, different filters can be used to detect
horizontal edges versus vertical edges in an image. Each image inputted into a
model will undergo a convolution with one or more filters to construct multiple
transformed images, known as “feature maps.” The kernel size of a convolutional
filter as well as the number of feature maps produced are hyperparameters in a
CNN. For our model, we set our convolutional layer to include 32 feature maps.
In Section 4.8 below, we discuss how different kernel sizes affect a CNN model’s
performance. Additionally, for the activation function of our convolutional layer,
we used the “ReLU” (“Rectified Linear Unit”) activation function. ReLU is a
piecewise linear function that returns zero for inputs less than or equal to zero,
and returns the input exactly if it is positive. Figure 9 below shows a graph of
the ReLU function:

Figure 9: The ReLU function [Bro20].
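The definition of the rectified linear unit given above can be written in one line. This is only an illustrative sketch; the function name and sample values are assumptions.

```python
def relu(x):
    """Return 0 for inputs <= 0, and the input itself when it is positive."""
    return max(0.0, x)


# Applying the activation to a few hypothetical convolution outputs.
raw_values = [-1.5, 0.0, 2.0]
activated = [relu(v) for v in raw_values]
print(activated)
```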

4.4 Max Pooling Layer
In CNN models, a pooling layer is a hidden layer that is placed after a convo-
lutional layer. Pooling layers are used to reduce the number of parameters in a
model, thereby decreasing a model’s runtime and likelihood that it experiences
overfitting. There are two types of pooling layers: “max” pooling layers and
“average” pooling layers. A max pooling layer is generally used in CNN models
that will be used for object recognition; it is useful for identifying distinctive
features in an image such as edges and corners. However, an average pooling
layer is generally used in CNN models that will be used for object detection
and image segmentation. Pooling layers reduce the number of parameters in a
CNN model by reducing the dimensions of feature maps generated by the con-
volutional layer. A max pooling layer, similar to a convolutional layer, scans its
input, a feature map, taking the maximum value from different regions of the
map. Figure 10 below illustrates the way a max pooling layer works.

Figure 10: Output of a max pooling layer, showing reduction of the dimensions
of the original feature map [Kho21].

A filter size of 2 x 2 indicates that the max pool will survey a 2 x 2 region
of the feature map and take only the maximum value. A “stride size” of 2 in
a given direction indicates that the max pool filter will move two units right or
two units down to survey another region. We built a max pooling layer and set
the stride and filter, both hyperparameters, to have a size of 2 x 2.
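The effect of a 2 x 2 filter with a stride of 2 can be sketched in a few lines of plain Python. The function name and the toy feature map are illustrative assumptions, not the code or data used in this study.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep only the maximum of each size x size region, moving by `stride`."""
    rows = len(feature_map)
    cols = len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 x 4 feature map.
feature_map = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]

pooled = max_pool2d(feature_map)
print(pooled)  # a 2 x 2 map: each dimension is halved
```

With these settings every value of the input is surveyed exactly once, and the output has one quarter as many entries as the input.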

4.5 Dense Layer and Flattening Layer


Unlike a convolutional or pooling layer, a “dense” layer is “fully connected” –
all input nodes are connected to all output nodes. In CNN models, as discussed
in section 4.2 above, it is very common for there to be a hidden dense layer
after a convolutional and pooling layer. Outputs from the convolutional and
pooling layers are used as inputs by the dense layer. Because the nodes in the
dense layer are connected to every node in the preceding layer, the dense layer
is able to find relationships between separate parts of the image. In our model,
we included a single hidden dense layer before our output layer.
Because dense layers can receive inputs only in the form of one-dimensional
arrays, before adding a hidden dense layer in our model, it was necessary to
include a “flattening layer.” A flattening layer will compress multi-dimensional
arrays that describe images into single dimensional arrays.
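A rough sketch of flattening followed by a fully connected layer is given below. The helper names, the pooled maps, and the weights are hypothetical values chosen only to show the mechanics.

```python
def flatten(feature_maps):
    """Collapse a list of 2-D feature maps into one flat list of values."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def dense(inputs, weights, biases):
    """Each output node is a weighted sum of every input, plus a bias."""
    return [sum(w * x for w, x in zip(node_weights, inputs)) + b
            for node_weights, b in zip(weights, biases)]


# Two hypothetical 2 x 2 pooled feature maps.
pooled_maps = [[[6, 5], [7, 9]], [[1, 0], [2, 3]]]
flat = flatten(pooled_maps)  # one-dimensional array with 8 entries

# A hypothetical dense layer with 2 nodes, so 2 x 8 weights and 2 biases.
weights = [
    [0.1] * 8,
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
]
biases = [0.0, -1.0]

outputs = dense(flat, weights, biases)
print(flat)
print(outputs)
```

Because every node has a weight for every flattened input, a dense layer with n nodes on m inputs carries m x n weights plus n biases, which is why dense layers tend to dominate the parameter count.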

4.6 Output Layer and Sigmoid Activation Function


The final layer, also known as the output layer, of a CNN must be a fully
connected or dense layer that is responsible for classifying the images input into
the model. The number of nodes contained in the output layer corresponds to
the number of categories in the classification problem: if the original dataset
has n categories of labels, and if n is greater than 2, the output layer will have
n nodes. An exception is that if n=2, the output layer will have only one
node. In other words, if the original problem is a binary classification problem,
the output layer will have one node; if the original problem is a multi-class
classification problem with n categories, the output layer will have n nodes. An
output layer of a CNN classifies an image by returning a vector of size n, where
n corresponds to the number of categories in the dataset. Each element of the
vector represents the probability of the input image belonging to that class. The
output probabilities are computed by applying an activation function known as
the softmax function to each of the n values of the output layer. In the case
of binary classification, the softmax function is equivalent to the “sigmoid” or
“logistic” function.
Since we were working with a binary classification problem (images can only
be stars or galaxies), we structured our model so that the output layer had only
one node that applied the sigmoid function to the final output. The sigmoid
function gives an output that is between 0 and 1 and thus can be used for binary
classification problems by assigning a decision boundary. Figure 11 below shows
a graph of the sigmoid function with a decision boundary of 0.5. A 0.5 decision
boundary indicates that a probability less than 0.5 would result in a prediction
that an image does not belong to a certain class while a probability greater than
0.5 would result in a prediction that it does.
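The relationship between the softmax and sigmoid functions, and the use of a 0.5 decision boundary, can be checked numerically with the short sketch below. The function names are illustrative. The treatment of a probability of exactly 0.5 is an arbitrary assumption made here, since the text does not specify it.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(values):
    """Turn a list of raw scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(z, boundary=0.5):
    """Assign class 1 when the sigmoid output exceeds the decision boundary.

    Assumption: a probability exactly equal to the boundary is assigned to class 0.
    """
    return 1 if sigmoid(z) > boundary else 0


# With two classes, softmax over the scores [z, 0] gives the same
# probability for the first class as sigmoid(z).
z = 1.2
two_class_probability = softmax([z, 0.0])[0]
print(two_class_probability, sigmoid(z))
print(predict_class(2.0), predict_class(-2.0))
```

This is why a single output node with a sigmoid activation is sufficient for a two-class problem: the probability of the second class is simply one minus the first.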

4.7 Neural Network Optimization


Training these models is an iterative process that starts with a collection of ran-
domly chosen weights and parameters. From those randomly chosen weights, we
produce an initial set of predictions, which are then scored to tell the model how
to improve the weights to improve future predictions. The scoring mechanism
is known as the “loss function.” We continue this process of making predictions,
scoring the predictions, and updating the weights until the parameters converge
– meaning that the change in parameters is under some prespecified threshold
– or until we have run a prespecified number of iterations.
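The loop described above (predict, score, update, stop on convergence or after a fixed number of iterations) can be sketched on a toy one-feature problem. This sketch uses plain gradient descent on a single sigmoid node, which is a deliberate simplification and not the optimizer or model used in this study. The data, learning rate, tolerance, and function names are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(xs, ys, learning_rate=0.5, tolerance=1e-6,
          max_iterations=10_000, seed=0):
    rng = random.Random(seed)
    # Start from randomly chosen parameters.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for iteration in range(1, max_iterations + 1):
        # 1. Make predictions with the current parameters.
        preds = [sigmoid(w * x + b) for x in xs]
        # 2. Score them: gradient of the mean binary cross-entropy loss.
        grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        grad_b = sum(p - y for p, y in zip(preds, ys)) / len(xs)
        # 3. Update the parameters in the direction that lowers the loss.
        step_w, step_b = learning_rate * grad_w, learning_rate * grad_b
        w, b = w - step_w, b - step_b
        # Converged: the change in parameters is under the threshold.
        if max(abs(step_w), abs(step_b)) < tolerance:
            break
    return w, b, iteration


# Hypothetical data in which the two classes overlap slightly.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

w, b, iterations_used = train(xs, ys)
print(w, b, iterations_used)
```

The two stopping rules correspond to the two exits of the loop: the `break` fires when the parameters stop changing, and the `range` bound caps the number of iterations otherwise.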

Figure 11: Graph of the sigmoid function for our model, showing a decision
boundary of 0.5 [Sah21].

Different types of loss functions are used in different contexts. For predicting
continuous variables or outcomes, we often use the mean squared error, while in
classification problems, “cross-entropy loss” is commonly used. In problems with
two outcomes, this function is called the “binary cross-entropy loss function.” If
there are multiple outcomes, the categorical cross-entropy loss function is used.
Because we have a binary problem (predicting stars or galaxies), we used the
binary cross-entropy loss function, shown in Figure 12 below.

Figure 12: Binary cross-entropy loss function equation [Kum23].

For binary classification problems, each data point (or image in our case)
exists in one of two possible classes (a class associated with 0 or with 1, as
discussed in Section 3.3 above). In the equation in Figure 12, y corresponds to
the target class label (0 or 1), and p corresponds to the predicted probability
of class 1. As demonstrated by this equation, the loss function is minimized
when the model has the highest possible probability associated with the correct
output.
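As an illustration, the sketch below computes the binary cross-entropy in its standard averaged form, loss = -(1/N) * sum of [y * log(p) + (1 - y) * log(1 - p)]. It is assumed here that the equation in Figure 12 takes this standard form. The function name, the clipping constant, and the sample predictions are hypothetical.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy; p_pred holds predicted probabilities of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]

# Hypothetical predictions of varying quality for the same labels.
confident_right = binary_cross_entropy(labels, [0.9, 0.1, 0.9, 0.1])
unsure = binary_cross_entropy(labels, [0.5, 0.5, 0.5, 0.5])
confident_wrong = binary_cross_entropy(labels, [0.1, 0.9, 0.1, 0.9])

print(confident_right, unsure, confident_wrong)
```

The loss is smallest when high probability is placed on the correct class, and it grows sharply when the model is confidently wrong.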
After we have chosen a loss function, we need to select a method for the
neural network to update itself once it knows the scores produced by the loss
function. To optimize our model, we used the “Adam” optimizer (“Adaptive
moment estimation”) – which applies a technique called stochastic gradient
descent to the weights to improve them.
After creating the baseline architecture for our CNN model, we tested the
performance of different CNN models with different combinations of hyperparameters.
We adjusted the kernel size and the number of nodes in our dense layer, and
also tested the performance of a model without any hidden dense layer at all.

4.8 Accuracy of CNNs with Different Hyperparameters


In our study, the first combination of hyperparameters in our CNN included
a kernel size of 4 and a dense layer with 128 nodes. After training this CNN,
we learned that a model with these hyperparameters performed well, with its
validation set having an accuracy of about 80 percent. However, despite its high
accuracy measure on its validation set, this CNN framework caused the model to
have around 3.6 million parameters. As discussed in Section 1.2 above, models
with very high numbers of parameters relative to a small amount of training
data are more likely to experience overfitting. We therefore tested if we were
able to reduce the number of parameters by reducing the number of nodes while
still achieving a high accuracy on our validation set. We tested the performance
of models with 32 and 64 nodes in their hidden dense layer. We found that the
performance of these models was not significantly different, as all models had an
accuracy of around 80 percent. This finding suggests that using fewer nodes is
favorable because it reduces the number of parameters in the model while still
keeping the accuracy high.
Additionally, we tested the performance of a CNN model with no hidden
dense layer, and found that it actually performs marginally better than CNN
models with a hidden layer. It achieves about an 81 percent accuracy on its
validation set. We only tested neural network models with a single or no hidden
dense layer because including more than one hidden dense layer would increase
the number of parameters greatly and likely lead to overfitting. Table 3 below
shows the relative performance of CNNs with varying numbers of nodes in their
hidden layer.
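To see where the parameters in such a model come from, the sketch below counts them layer by layer. It assumes 64 x 64 input images with 3 color channels, a convolution with no padding and a stride of 1, 32 feature maps, a 2 x 2 max pool with a stride of 2, and a single-node output layer. These settings are assumptions made for illustration, although under them a kernel size of 6 with 64 dense nodes gives 1,725,985 parameters, the figure reported in Section 8. The function name is hypothetical.

```python
def cnn_parameter_count(image_size=64, channels=3, kernel_size=4,
                        feature_maps=32, pool_size=2, dense_nodes=128):
    """Count trainable parameters for conv -> max pool -> (dense) -> 1 output."""
    # Each filter has kernel_size * kernel_size * channels weights plus a bias.
    conv_params = (kernel_size * kernel_size * channels + 1) * feature_maps
    conv_out = image_size - kernel_size + 1   # no padding, stride 1
    pooled = conv_out // pool_size            # pooling has no parameters
    flat = pooled * pooled * feature_maps     # length after flattening
    if dense_nodes:
        dense_params = (flat + 1) * dense_nodes
        output_params = dense_nodes + 1
    else:
        # No hidden dense layer: the output node reads the flattened maps.
        dense_params = 0
        output_params = flat + 1
    return conv_params + dense_params + output_params


for nodes in (128, 64, 32, 0):
    print(nodes, cnn_parameter_count(kernel_size=4, dense_nodes=nodes))

print(cnn_parameter_count(kernel_size=6, dense_nodes=64))
```

Under these assumptions almost all of the parameters sit in the hidden dense layer, so halving the number of dense nodes roughly halves the total, and the configuration with 128 nodes comes out at roughly 3.7 million, close to the figure quoted above.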

Table 3: Relative performance we obtained for CNNs with varying numbers of
nodes in their hidden layer.

Next, using a hidden dense layer with 64 nodes, we tested how adjusting the
hyperparameter “kernel size” would affect the performance of our model. We
tested the performance of models with a kernel size of 2, 3, 4, 5, and 6. We
found that a CNN model with a kernel size of 6 performs best and achieves an
accuracy on its validation set of 82.6 percent. Table 4 below shows the relative
performance of CNNs with different kernel sizes.

Table 4: Relative performance we obtained for CNNs with different kernel sizes.

5 Logistic Regression Model


A logistic regression model is one that is applicable only when the problem
involves binary classification. As a result, it uses a binary cross-entropy function,
the same loss function as our CNN models discussed above. A logistic regression
model is equivalent to a neural network without hidden layers and thus no
hyperparameters: it contains only an input layer and an output layer. Because
there are no hyperparameters to tune, we ran a single logistic regression model,
which we found had a slightly less successful performance relative to our CNN
models – it achieved an accuracy of about 73 percent on its test set. Overall,
the lower accuracy of our logistic regression model was not surprising because
a logistic regression is a simpler model than a neural network and contains
only 4096 parameters. Nonetheless, despite the smaller number of parameters
relative to the neural networks examined earlier, the logistic regression was
still significantly overfit to the training data. It had a 100 percent accuracy in
sample, but out of sample was accurate only 73 percent of the time. Figure 13
below shows the code used to build our logistic regression model, and Table 5
below shows its performance and number of parameters.

Figure 13: Code used to build our logistic regression model.

Table 5: Performance of our logistic regression model and number of parameters
in the model.

6 Random Forest Classifier (RFC)


A Random Forest Classifier (RFC) is another type of supervised learning model that
can be used to classify images. An RFC consists of multiple “decision trees.”
A detailed discussion of RFCs is beyond the scope of this paper. We focus
instead on the hyperparameters associated with RFCs that we adjusted, and
the performance of our RFC models on star-galaxy image classification.
An RFC, like a neural network, has many different hyperparameters. In
our study, the hyperparameters that we adjusted were the number of trees
and the minimum samples per leaf. We found that our model with 100 trees
and 8 minimum samples per leaf performed best with a validation and test set
accuracy of 79 percent and 136,000 parameters. Below, Tables 6 and 7 show the
performance of RFC models with different combinations of hyperparameters.
Table 6 shows performance with a fixed leaf size of 1 when the number of trees
is adjusted, and Table 7 shows performance with a fixed number of 100 trees,
when the minimum samples per leaf is adjusted.

Table 6: RFC performance with minimum samples per leaf fixed at 1, when the
number of trees is adjusted.

Table 7: RFC performance with a fixed number of 100 trees, when the minimum
samples per leaf is adjusted.

7 A Small Neural Network Model with Principal Component Analysis (PCA)
After creating the models discussed above, we analyzed the effect of decreasing
the number of input features and therefore decreasing the number of parameters
needed to train the model. To do so, we used a data preprocessing technique
known as Principal Component Analysis (PCA). In PCA, the number of input
features is reduced by finding the components that best summarize the data.
Specifically, these components make up a set that captures the greatest amount
of variance in the input data.
In other words, if each data point has k features, there will be k principal
components that fully represent that data point; however, by using PCA, a data
point can be represented by a subset of those k components that accounts for most of
the variance in the dataset. In fact, most components capture only minor variations
or are essentially noise, and only a small number of components are needed
to reconstruct each data point. As a result, with PCA, the most representative
components of the data are selected and used, rather than every input feature.
Using a smaller number of features enables models to be constructed and fit
with fewer parameters than would have been possible without PCA. Reducing
the number of input features is desirable because, as discussed in Section 1.2
above, models with fewer parameters are less likely to overfit to the training data.
After applying PCA to our data, we used the 50 most representative compo-
nents out of 4096 components (4096 because a 64 x 64 image has 4096 pixels),
which captured about 30 percent of the variance in our data. Figure 14 below
shows a graph representing the “cumulative variance” and “explained variance
by component.” The blue line, “Cumulative Variance,” describes the total variance
that is represented by the first k components (with k running up to 50 here), whereas
the orange line, “Explained Variance By Component,” describes the amount of variance
represented by the kth component alone.
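The two quantities plotted, explained variance by component and cumulative variance, can be illustrated on a toy dataset with only two features, where the principal components have a closed form. This is a simplified sketch rather than the procedure applied to the 4096-pixel images in this study; the function name and the data points are hypothetical.

```python
import math


def explained_variance_ratios(points):
    """Fraction of total variance captured by each principal component (2-D data)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2 x 2 matrix, largest first. Each eigenvalue
    # is the variance along one principal component.
    centre = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = [centre + radius, centre - radius]
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Hypothetical data: the two features are almost perfectly correlated.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]

ratios = explained_variance_ratios(points)
cumulative = [sum(ratios[:i + 1]) for i in range(len(ratios))]

print(ratios)      # explained variance by component
print(cumulative)  # cumulative variance
```

In this toy case the first component alone captures nearly all of the variance, so the second could be dropped with little loss; keeping all components always brings the cumulative variance to 1.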

Figure 14: Graph showing the “Cumulative Variance” and “Explained Variance
By Component” obtained after we applied PCA to our data.

We then trained a much smaller neural network model using a dataset that
had been preprocessed with PCA. Our neural network contained a single hidden
layer: a dense layer with four nodes. Using the first 50 principal components, our
small neural network model performed better than our other models described
above: our model achieved an 84 percent accuracy on its test set while having
only 209 parameters. In addition, this model was significantly less overfit than
the other models we tried. While the other models performed significantly
worse on the test data than on the training data (100 percent training accuracy,
compared to 80 percent test set accuracy), the training set accuracy (88 percent)
of this smaller model was much closer to the accuracy of the validation and test
sets (85 percent and 84 percent respectively). Table 8 below summarizes our
results.
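The parameter count of the small network can be reproduced with a short calculation, assuming each dense node has one weight per input plus one bias and that the output layer is a single node. The function name is hypothetical.

```python
def dense_network_parameters(layer_sizes):
    """Total weights and biases for a chain of fully connected layers."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# 50 principal components -> 4 hidden nodes -> 1 output node.
small_network = dense_network_parameters([50, 4, 1])

# For comparison: 4096 pixels connected directly to 1 output node,
# as in a logistic regression, counting the bias term as well.
pixels_to_output = dense_network_parameters([4096, 1])

print(small_network, pixels_to_output)
```

Under these assumptions the small network has 209 parameters. The direct pixel-to-output model has 4097 when the bias is counted, one more than the 4096 weights mentioned in Section 5.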

Table 8: Summary of the results of the neural network model applied to our data
after preprocessing with PCA, which gave the best results of the four models
we tested.

8 Conclusion and Future Work


A comparison of the models we tested showed that our neural network model
trained on a dataset preprocessed with PCA performed the best in classifying
images of stars and galaxies, having an accuracy of 84 percent for the test set.
Although our CNN model with a kernel size of 6 x 6 and 64 nodes in its hidden
dense layer performed well – with 82 percent accuracy – it had a large number
of parameters relative to the number of images in our dataset (this model had
1,725,985 parameters and was trained on only 2391 images). As discussed above,
reducing the number of parameters a model has decreases the model’s run time
and makes it less prone to overfitting.
To increase the number of images the model is trained on using our data
set, perhaps different data augmentation techniques could be tried – such as
SMOTE, blurring, or rotation. Future research could be directed to testing the
effectiveness of our neural network/PCA model in classifying other astronomical
image datasets. Additionally, future work could include analyzing images
that our models misclassified, to determine whether there is a pattern to the
misclassification of certain images by models of totally different structure. It
is possible that by analyzing images misclassified by our models, we might find
that certain images were initially misclassified by the SDSS database. Finally,
the model could be trained and tested on a data set where quasars are distin-
guished as a separate category from galaxies, to determine the ability of the
model to distinguish quasars from stars and galaxies. Such an analysis could
potentially also help address the question whether the misclassifications in our
model resulted, at least in part, from the inability to distinguish stars from
quasars.

References
[Agr21] D. Agrawal. Star-galaxy classification data. Kaggle,
https://www.kaggle.com/datasets/divyansh22/dummy-astronomy-
data, June 12, 2021.

[ALMte] ALMA. Star and planet formation.
https://www.almaobservatory.org/en/about-alma/how-alma-
works/capabilities/star-and-planet-formation, No date.

[ARItea] ARIES. 1.3-m devasthal fast optical telescope.
https://www.aries.res.in/facilities/astronomical-telescopes/130cm-
telescope, No date.

[ARIteb] ARIES. About us. https://www.aries.res.in/about-us, No date.

[Bae23] Baeldung. What is the purpose of a feature map in a convolu-
tional neural network? https://www.baeldung.com/cs/cnn-feature-
map, May 24, 2023.

[Bol23] D. Bolles. Stars. National Aeronautics and
Space Administration Science: Share the Science,
https://science.nasa.gov/astrophysics/focus-areas/how-do-stars-
form-and-evolve, August 28, 2023.

[Bro20] J. Brownlee. A gentle introduction to the rectified linear
unit (relu). https://machinelearningmastery.com/rectified-linear-
activation-function-for-deep-learning-neural-networks/, August 20,
2020.

[Cea20a] A.O. Clarke et al. Identifying galaxies, quasars, and stars with ma-
chine learning: A new catalogue of classifications for 111 million sdss
sources without spectra. Astronomy and Astrophysics, 639, A84,
2020.

[Cea20b] A.O. Clarke et al. Identifying galaxies, quasars, and stars
with machine learning: A new catalogue of classifications
for 111 million sdss sources without spectra. Astronomy
and Astrophysics, 639, A84, Appendix A, Figure A7,
https://www.aanda.org/articles/aa/full_html/2020/07/aa36770-19/F30.html,
2020.

[Chi01] C. Chiappini. The formation and evolution of the milky way.
American Scientist, 89(6), 505, 2001.

[Dea01] A.C. Davenhall et al. Overview of ccd detectors.
Starlink Project: Starlink Cookbook 5.3,
https://starlink.eao.hawaii.edu/docs/sc5.htx/sc5.htmltoc, 2001.

[ED23] K. Erickson and H. Doyle. How many solar systems are in our galaxy?
National Aeronautics and Space Administration Science Space
Place, https://spaceplace.nasa.gov/other-solar-systems/en/, August
20, 2023.

[ESAte] ESA/Hubble. Quasar. https://esahubble.org/wordbank/quasar/, No
date.

[Fac23] Australia Telescope National Facility. Main sequence stars.
https://www.atnf.csiro.au/outreach/education/senior/astrophysics/
stellarevolutionmainsequence.html, June 7, 2023.

[Factea] Australia Telescope National Facility. Photometry:
Measuring the brightness of stars.
https://www.atnf.csiro.au/outreach/education/senior/astrophysics/
photometrytop.html, No date.

[Facteb] Australia Telescope National Facility. Spec-
troscopy: Unlocking the secret in starlight.
https://www.atnf.csiro.au/outreach/education/senior/astrophysics/
spectroscopytop.html, No date.

[Fea12] R. Fadely et al. Star-galaxy classification in multi-band optical
imaging. The Astrophysical Journal, 760(1), 15, 2012.

[Geatea] C. Gohd et al. Galaxies. https://universe.nasa.gov/galaxies/types/,
No date.

[Geateb] C. Gohd et al. Stars. National Aeronautics and
Space Administration Science Universe Exploration,
https://universe.nasa.gov/stars/basics/, No date.

[Her89] W. Herschel. Catalogue of a second thousand of new nebulae and
clusters of stars, with a few introductory remarks on the construction
of the heavens (xx). Philosophical Transactions of the Royal Society
of London, (79), 212-255, 1789.

[HR20] R. Hausen and B.E. Robertson. Morpheus: A deep learning
framework for the pixel-level analysis of astronomical image data. The
Astrophysical Journal Supplement Series, 248(1), 20, 2020.

[Hubte] ESA Hubble. Active galactic nucleus.
https://esahubble.org/wordbank/active-galactic-nucleus/, No date.

[IBMte] IBM. What are neural networks?
https://www.ibm.com/topics/neural-networks, No date.

[Iea19] Z. Ivezic et al. Lsst: From science drivers to reference design and
anticipated data products. The Astrophysical Journal 873(2), 111,
March 2019.

[Imp23] C. Impey. Analysis: How ai is helping astronomers study the
universe. PBS, https://www.pbs.org/newshour/science/analysis-how-ai-
is-helping-astronomers-study-the-universe, May 8, 2023.

[KB16] E.J. Kim and R.J. Brunner. Star-galaxy classification using deep
convolutional neural networks. Monthly Notices of the Royal
Astronomical Society, stw2672, 2016.

[Kho21] S. Khosla. Cnn: Introduction to pooling layer. GeeksforGeeks.
https://www.geeksforgeeks.org/cnn-introduction-to-pooling-
layer/, April 21, 2021.

[Kim18] J. Kim. Machine learning approaches to star-galaxy classification.
Doctoral dissertation, University of Illinois at Urbana-Champaign,
2018.

[KS98] B.K. Kennedy and D.P. Schneider. Quasar discovered with
x-rays is long ago and far away. Pennsylvania State Uni-
versity. https://science.psu.edu/news/quasar-discovered-x-rays-long-
ago-and-far-away, March 26, 1998.

[Kum23] A. Kumar. Mean squared error vs cross entropy loss. Vitalflux.
https://vitalflux.com/mean-squared-error-vs-cross-entropy-loss-
function/, April 3, 2023.

[Labte] Stanford Artificial Intelligence Laboratory. Tutorial 3: Image
segmentation. https://ai.stanford.edu/~syyeung/cvweb/about.html, No
date.

[oAfEtea] International Astronomical Union Office of Astronomy
for Education. Glossary term: Extinction.
https://astro4edu.org/resources/glossary/term/107/, No date.

[oAfEteb] International Astronomical Union Office of Astronomy
for Education. Glossary term: Galaxy.
https://astro4edu.org/resources/glossary/term/119/, No date.

[oAfEtec] International Astronomical Union Office of Astronomy
for Education. Glossary term: Star.
https://astro4edu.org/resources/glossary/term/331/, No date.

[Obste] Las Cumbres Observatory. Astronomical cameras.
https://lco.global/spacebook/telescopes/cameras/, No date.

[Oea92] S.C. Odewahn et al. Automated star/galaxy discrimination with
neural networks. Digitized Optical Sky Surveys Proceedings of the
Conference on ’Digitised Optical Sky Surveys’, Held in Edinburgh,
Scotland, 18-21 July 1991 (pp. 215-224). Springer Netherlands, 1992.

[oVSOte] American Association of Variable Star Observers. Bl lacertae.
https://www.aavso.org/vsots_bllac, No date.

[Pogte] D. Pogosyan. Lecture 27: Quasars and active
galaxies (agn’s). University of Alberta.
https://sites.ualberta.ca/pogosyan/teaching/ASTRO122/
lect27/lecture27.html, No date.

[Sah21] S. Sahu. Decision boundary for classifiers: An introduction.
Analytics Vidhya. https://medium.com/analytics-vidhya/decision-
boundary-for-classifiers-an-introduction-cc67c6d3da0e, September
26, 2021.

[Sch63] M. Schmidt. 3c 273: A star-like object with large red-shift. Nature
197, 1040, 1963.

[Sea11] R. Sagar et al. The new 130-cm optical telescope
at devasthal, nainital. Current Science, 101, 1020.
https://www.jstor.org/stable/24079276, 2011.

[Sea15] M.T. Soumagnac et al. Star/galaxy separation at faint magnitudes:
Application to a simulated dark energy survey. Monthly Notices of
the Royal Astronomical Society, 450(1), 666, 2015.

[Sowte] A. Sowailem. Welcome: Vera c. rubin observatory project. Vera C.
Rubin Observatory Project. https://project.lsst.org/, No date.

[Surtea] Sloan Digital Sky Survey. The sloan digital sky survey.
https://classic.sdss.org/home.php, No date.

[Surteb] Sloan Digital Sky Survey. The sloan digital sky survey.
https://www.sdss4.org, No date.

[Surtec] Sloan Digital Sky Survey. The sloan digital sky survey-v: Pioneering
panoptic spectroscopy - sdss-v. https://www.sdss.org/, No date.

[Tel21] Webb Space Telescope. What are active galactic nuclei?
https://webbtelescope.org/contents/articles/what-are-active-galactic-
nuclei, March 17, 2021.

[VBea01] D.E. Vanden Berk et al. Composite quasar spectra from the sloan
digital sky survey. The Astronomical Journal 122(2), 549, 2001.

Life and Death: The Effect of Biases and
Heuristics on Medical Decision Making
Rajeev Krishnamurthy 1†

October 17, 2023

Abstract
Doctors’ decision making is affected by a variety of cognitive shortcuts and
biases. Five biases and heuristics extremely relevant to medical decision
making are the availability heuristic, anchoring, sunk cost bias, omission bias,
and status quo bias. By conducting literature reviews involving the analysis
and evaluation of largely quantitative data, this paper analyses these five
biases and the extent to which they affect doctors, as well as the roles they
play in medicine. Finally, this paper recommends a range of policies which aim
to alleviate the negative effects of these heuristics and biases on medical
decision making.

1 Introduction
The average doctor-patient consultation takes a mere 18 minutes [ea20a], so it is
perhaps not surprising that misdiagnosis is the largest single cause of adverse
medical events in the USA, accounting for 34% of the country’s total medicolegal
claims [LB11]. The current medical system incentivizes doctors to process the
highest number of patients possible, and, in order to accomplish this, they
unconsciously utilize a number of cognitive shortcuts, or heuristics. While this
does indeed speed up the medical processes, it also makes doctors vulnerable to a
number of errors and biases – heuristics offer the easiest path from a problem to
its solution for the brain, not the most methodical or careful one [ea08]. This can
cause anything from a doctor over-diagnosing epilepsy because he took a course
on it a week earlier, to one continuing an incorrect medical treatment because of
previous investment of time and money into it. Furthermore, the patients whom
doctors are treating may have biases too – which can influence doctors and which
they must compensate for.
This paper will provide an overview of heuristics and biases within the medical
establishment, using both hypothetical and real-world examples to illustrate their
causes and effects. By reviewing a wide range of previous literature, the
mechanisms behind these biases can be explored thoroughly, and various policies
intended to reduce the effects of biases on doctors’ decision-making, and to
increase its accuracy, will be evaluated. Finally, I will provide systemic
recommendations which aim to significantly alleviate the negative impact that
biases and heuristics have on the medical establishment. This paper will focus on
five cognitive drivers: the availability heuristic, anchoring, sunk cost bias, status
quo bias, and omission bias. It contains four main sections: the first focuses on
the availability heuristic, the second on anchoring, the third on the sunk cost
bias, and the fourth on the status quo and omission biases. Each section
consists of three subsections: the first defines the bias or heuristic covered,
the second is a literature review, and the third discusses its implications for
medical decision making and offers recommendations which could potentially
alleviate its harmful effects.

1 Advised by: Dr. Edoardo Gallo of the University of Cambridge
† The International School Bangalore

2 Availability
2.1 Defining the availability heuristic
Li et al. define the availability heuristic as “the tendency to overestimate the
likelihood of events when they readily come to mind”. It is an example of base rate
neglect – a phenomenon that occurs when people tend to ignore statistical
averages in favor of new information [KT73]. A real-world example of this is as
follows: students who were asked to retrieve 12 examples of them expressing
assertive behavior rated themselves as less assertive than students who were
asked to recite 12 examples of their unassertive behavior [ea91] - 12 examples of
the stated behavior were not easily available to the students, leading them to
underrate themselves. In medicine, the availability heuristic could present as a
physician who spent years specializing in tuberculosis being more likely than
generalist peers to misdiagnose similarly presenting disorders as tuberculosis
[ea20b].

2.2 Literature review


This section will analyze two at-scale studies which examined the role of the
availability bias in medicine in different contexts - namely the emergency
department and general consulting.
Ly’s study involved emergency departments in 104 Veterans Affairs hospitals
across the US over seven years from 2011 to 2018, where the rate of testing for
pulmonary embolisms was compared before and after diagnoses for pulmonary
embolisms were issued [Ly21]. Ly hypothesized that, as a result of the availability
heuristic, rates of pulmonary embolism testing would increase after a diagnosis of
pulmonary embolism [Ly21]. The study’s scope was limited to patients 21 years
or older presenting with shortness of breath. Multivariate regression was used to
compare testing rates between the 60 days before and after a diagnosis. Ly found
that rates of testing increased significantly by 1.4 percentage points in absolute
terms – a relative increase of 15 percent – in the 10 days following a diagnosis
[Ly21]. In the following 50 days, however, no statistically significant change was
found. Ly acknowledged that, given the study’s 95 percent confidence
intervals, an increase below the 5 percent level could not be ruled out [Ly21]. Ly
concluded that “These results are consistent with the availability heuristic
influencing physician decision making in relation to pulmonary embolism
diagnoses”.
Li et al. took a different approach: their study involved 46 internal medicine
residents, divided into an experimental group (EG) and a control group (CG)
[ea20b]. Prior to the experiment, the EG was asked to analyze an article on
dengue fever and then completed a test on it. The CG did not receive any of this
information and participated directly in the second stage of the study, which
occurred six hours later [ea20b]. Li and his colleagues mention that “great care
was taken to ensure that stage 2 appeared to be an unrelated study” [ea20b]. The
participants were presented with and asked to diagnose eight clinical cases: one
of which was dengue fever, three of which appeared similar to it but were actually
different conditions, and the remainder of which were unrelated to dengue fever.
Finally, in the third stage, participants received three experimental cases and one
filler that they had previously diagnosed and were encouraged to reflect on their
previous diagnoses and change them if they felt they were incorrect, in order to
test whether reflection would compensate for errors caused by the availability
heuristic [ea20b]. Participants were assigned a score of 1 for each correct
diagnosis and 0 for each incorrect one, and the mean scores of each group were
compared.
In the second stage of the study, the CG significantly outperformed the EG in
the experimental cases, 0.80 to 0.66, and slightly underperformed it in the filler
cases, 0.59 to 0.64 [ea20b]. The EG misdiagnosed significantly more cases as
dengue fever than the CG. Additionally, the participants did not show a statistically
significant difference in accuracy after performing reflective reasoning [ea20b]. Li
and his colleagues concluded that “the availability bias seemed to account for the
bulk of diagnostic errors and was not well repaired by reflective reasoning”
[ea20b].
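The scoring procedure described above amounts to computing each group's mean accuracy over binary scores. A minimal sketch follows; the score lists are invented placeholders rather than data from Li et al., and `mean_accuracy` is a hypothetical helper.

```python
from statistics import mean


def mean_accuracy(scores: list[int]) -> float:
    """Mean of binary scores, where 1 marks a correct and 0 an incorrect diagnosis."""
    if not scores:
        raise ValueError("scores must not be empty")
    if any(s not in (0, 1) for s in scores):
        raise ValueError("scores must be 0 or 1")
    return float(mean(scores))


# Hypothetical scores on the cases resembling dengue fever (NOT the study's data).
control_scores = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
experimental_scores = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]

gap = mean_accuracy(control_scores) - mean_accuracy(experimental_scores)
print(f"CG {mean_accuracy(control_scores):.2f}, "
      f"EG {mean_accuracy(experimental_scores):.2f}, gap {gap:.2f}")
```

A comparison of this kind only describes the difference between groups; judging whether it is statistically significant, as the study did, requires a separate test.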

2.3 Implications and recommendations


Both in the emergency room and in the context of consulting, doctors were shown
to be affected by the availability heuristic in a statistically significant manner. Its
effect persisted over a period of time: a 15% relative increase in testing rates was
sustained over 10 days [Ly21]. Additionally, it is not tied to doctor competence:
the EG in the second study outperformed the CG in non-affected diagnoses, but
significantly underperformed it when affected by the heuristic [ea20b]. Thus, it
can be concluded that the availability heuristic poses a real danger to the decision
making of doctors, both in the emergency room and during normal consulting
work. An example of this which occurred in 2022 was presented by Kyere
et al.: a man was incorrectly diagnosed with COVID-19 despite three negative
tests, resulting in him being given excessive doses of antibiotics and requiring
supplemental oxygen before being correctly diagnosed and eventually discharged
[ea22b]. They described the availability bias as a “significant contributor to poor
patient outcomes” and encouraged physicians to be aware of it in order to avoid
“inadvertently affecting patient outcomes” [ea22b].
Using reflection and taking additional time to diagnose is not an effective
method against this heuristic, resulting in no statistically significant improvement
in the accuracy of diagnosis [ea20b]. A possible workaround could be to consult
with another doctor who has not seen or diagnosed a recent case of the disease,
as only diagnoses of the exact disease cause availability bias, not ones similar to it
[Ly21]. However, this would likely not be cost-effective, and it might be difficult to
find an unbiased doctor in the case of a common condition, as a result of the
relatively long-lasting nature of the bias [Ly21]. Additionally, as the availability
bias is an example of base rate neglect [KT73], consulting base rates and ensuring
that statistical overdiagnosis is not taking place could be an effective tool for
doctors to mitigate the effects of the availability heuristic.
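Consulting base rates can be illustrated with Bayes' theorem, which combines a condition's prevalence with how strongly a finding points to it. The sketch below is illustrative only: the prevalence, sensitivity, and false-positive figures are hypothetical and do not come from any study cited here.

```python
def posterior_probability(base_rate: float, sensitivity: float,
                          false_positive_rate: float) -> float:
    """P(disease | finding) by Bayes' theorem.

    base_rate: P(disease) in the relevant patient population
    sensitivity: P(finding | disease)
    false_positive_rate: P(finding | no disease)
    """
    for p in (base_rate, sensitivity, false_positive_rate):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    true_pos = sensitivity * base_rate
    false_pos = false_positive_rate * (1.0 - base_rate)
    if true_pos + false_pos == 0:
        raise ValueError("the finding has zero probability under these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers: a rare condition (1% prevalence) with a finding that is
# present in 90% of cases but also in 10% of patients without the condition.
p = posterior_probability(base_rate=0.01, sensitivity=0.90, false_positive_rate=0.10)
print(f"{p:.3f}")
```

With these illustrative inputs the posterior probability stays below 10 percent: even a suggestive finding leaves a rare condition unlikely, which is the fact that base rate neglect overlooks.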
Finally, the current rise of artificial intelligence could provide the future
possibility of the consultation of neural networks to ensure that opportunities for
differential diagnosis are presented and base rate neglect is avoided [ea21].
Patient details and symptoms reported would be processed by the system, which
would present several diagnoses to the doctor, considering their rates of
occurrence in the general population as well as their likelihood based on patient
history and the symptoms presented. By presenting base rates to doctors, base
rate neglect would be mitigated, and, as a result of the system itself theoretically
not being subject to human heuristics and cognitive shortcuts, the “second opinion”
provided by it would provide an effective antidote to the doctor’s availability bias.
However, this would not be a silver bullet – as mentioned previously, taking
additional time to reflect after diagnosis did not mitigate the bias’s effects on
doctors, so the system would likely need to provide fairly forceful suggestions to
doctors and play a relatively large role in the decision making process in order to
have an impact. Moreover, at the end of the day, systems are merely an aid to
doctors, and a biased doctor will inevitably make biased decisions – while using
AI as an aid might improve the accuracy of diagnoses, systemic training in order
to make doctors less susceptible to the effects of the availability heuristic would
still be necessary.
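In its simplest form, the kind of “second opinion” described above could rank candidate diagnoses by combining base rates with symptom likelihoods. The following is a toy sketch under strong assumptions: the candidates are treated as mutually exclusive and exhaustive, the condition names and probabilities are hypothetical placeholders, and a real clinical system would be far more elaborate.

```python
def rank_diagnoses(base_rates: dict[str, float],
                   likelihoods: dict[str, float]) -> list[tuple[str, float]]:
    """Rank candidate conditions by posterior probability.

    base_rates: P(condition) for each candidate
    likelihoods: P(observed symptoms | condition) for each candidate
    Assumes the candidates are mutually exclusive and cover all possibilities.
    """
    unnormalized = {name: base_rates[name] * likelihoods[name] for name in base_rates}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no candidate is compatible with the symptoms")
    posteriors = {name: value / total for name, value in unnormalized.items()}
    return sorted(posteriors.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs (placeholders, not clinical data).
base_rates = {"condition_a": 0.70, "condition_b": 0.25, "condition_c": 0.05}
likelihoods = {"condition_a": 0.10, "condition_b": 0.40, "condition_c": 0.90}

ranked = rank_diagnoses(base_rates, likelihoods)
for name, prob in ranked:
    print(f"{name}: {prob:.2f}")
```

In this example the condition that best matches the symptoms (`condition_c`) still ranks last because it is rare, while the most common condition is overtaken by one that is both reasonably common and a reasonable match.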
Additionally, precautions would need to be taken while integrating AI into the
medical decision-making process. It has been demonstrated that, when trained on
biased datasets, AI-based systems can produce biased results [ea19]. However,
measures to alleviate these inherent biases do exist [ea19], and would need to be
integrated into a hypothetical medical system in order to provide relatively
unbiased advice to help mitigate the effects of the availability heuristic upon
doctors.

3 Anchoring
3.1 Defining Anchoring
Anchoring was originally described by Tversky and Kahneman as the tendency of
people to make estimates “by starting from an initial value that is adjusted to yield
the final answer”; this adjustment is “typically insufficient” [TK74]. An example of
this is as follows: participants who were anchored with the value “65” gave a
median estimate of the percentage of African countries in the United Nations that
was 20 percentage points higher than that of participants anchored with the value
“10” [TK74]. Dargahi et al. define anchoring in the context of
medicine as “the excessive weighting of initial information and the inability to
adjust the initial diagnostic hypothesis when further information becomes
available” [ea22a]. A hypothetical example of this could be a doctor misdiagnosing
a patient with depression because the patient seemed depressed to them upon first
impression, and the doctor did not make a sufficient adjustment away from that
first impression.

3.2 Literature review


This section will analyze two studies associated with anchoring in medicine, in
different contexts – namely the emergency room and general consulting.
A study by Dargahi et al. involved 77 faculty members and residents in
Emergency Medicine [ea22a]. The participants were presented with nine
commonly misdiagnosed written clinical cases and were asked to provide a
diagnosis for each case. Each case was scored on a 1-7 scale on difficulty of
diagnosis by an expert panel [ea22a]. Participants were given a three-page
document and asked to provide a diagnosis after each page – intended to simulate
the gradual acquisition of information in real-world contexts [ea22a]. The study
found that, while the faculty members made far fewer errors overall (34%, as
opposed to a total average of 57%), a much higher proportion of their errors were
anchoring errors (75%, as compared to an overall average of 38%) [ea22a].
Dargahi and her colleagues concluded that “The results show that the anchoring
error rate in the faculties is meaningfully higher than in the residents”, and that
while the faculty members were “better than the residents in focusing on the
relevant and related information and generating more links to relate critical cues”,
their diagnostic process was “dominated by heuristic thinking”. They
hypothesized that this could be caused by “their more clinical exposures to
diagnoses in emergency situations”, and that they were “not looking for ways to
strengthen and support their decisions” [ea22a].
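The distinction between the overall error rate and the share of errors attributable to anchoring can be clarified by multiplying the two figures. The calculation below is a rough sketch that assumes the first percentage is the fraction of diagnoses that were wrong and the second is the fraction of those errors classified as anchoring errors; the study itself may define these quantities differently, and the function name is hypothetical.

```python
def anchoring_errors_per_diagnosis(error_rate: float, anchoring_share: float) -> float:
    """Fraction of all diagnoses that are anchoring errors, given the overall
    error rate and the share of errors classified as anchoring errors."""
    return error_rate * anchoring_share


faculty = anchoring_errors_per_diagnosis(0.34, 0.75)  # figures reported for faculty
overall = anchoring_errors_per_diagnosis(0.57, 0.38)  # figures reported for all participants
print(f"faculty: {faculty:.3f}, overall: {overall:.3f}")
```

Under that assumption, anchoring errors would account for roughly a quarter of faculty diagnoses and roughly a fifth of diagnoses overall, so the faculty's lower total error rate does not translate into fewer anchoring errors.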
A study conducted by Voytovich et al. examined rates of anchoring in students,
residents, and faculty members at Connecticut State University [ea85]. Participants
were asked to generate “precise problem lists” for four cases; the problem lists
were then judged by independent raters, and mistakes were categorized [ea85].
The authors found that, while the other errors tracked decreased consistently
with increased experience, the frequency of anchoring-induced errors
“seemed... independent of training and level of ability”. They
recommended that “physicians should encourage independent review of their
conclusions and realize that knowledge provides no shield against premature
closure”, and that anchoring might be able to be avoided with “good interrater
ability” [ea85].

3.3 Implications and recommendations


In both situations analyzed, doctors and medical students were found to be
affected by anchoring in a statistically significant manner. Similarly to availability,
it also has the potential to cause poor patient outcomes, as evidenced by a 2021
case presented by Rehana and Huda: a patient with a brain tumor was assumed to
be on drugs by his family and doctors, leading to delayed medical intervention,
misdiagnosis, and ultimately his death [RH21].
What makes anchoring a uniquely dangerous heuristic is the fact that error
rates associated with it do not improve with training and experience: as the
experience of the doctors surveyed increased, one study showed an increase in the
proportion of errors due to anchoring [ea22a] and the other showed no change
[ea85]. Because other errors decrease with experience, inexperienced doctors
might regard more experienced individuals as universally less fallible than
themselves, which is not true of anchoring. Moreover, the fact that senior doctors’
decision-making processes were “dominated by heuristic thinking” [ea22a]
suggests that the medical establishment implicitly encourages the adoption of
anchoring, which would require systemic change to fix.
As with availability, consulting base rates could be an effective way to
alleviate the effects of anchoring and prevent misdiagnosis by preventing
overdiagnosis. Consulting a colleague not involved with the case, as recommended
by Voytovich and his colleagues [ea85], or a computerized, AI-based system would
likely also be effective, possible financial and technological
limitations aside. This hypothetical colleague or system, as a result of seeing the
case as a whole from the outset, would not have an “anchor” from which they
would have to adjust and therefore would largely be free from the effects of
anchoring. In the case of AI, the same limitations mentioned in Section 2.3 would
apply: it would function as an aid and would not be able to wholly counteract
biased doctors, and the precautions mentioned therein would have to be taken to
ensure an effective implementation which would help alleviate the effects of
anchoring on doctors. Finally, systemic training that asks senior doctors to
question their initial judgements and give more weight to new evidence
would likely lessen the effects of anchoring on them.

4 Sunk Cost Bias
4.1 Defining sunk cost bias
Bornstein et al. define sunk cost bias as occurring “when a decision maker
continues to invest resources into a previously selected action or plan even after
the plan has proven to be the suboptimal option” [ea99]. This, for example, could
take the form of sitting through a boring movie in order to “get more value” out of
your ticket. This may seem logical; however, by continuing to sit in the movie, you
are impacting your future enjoyment as well. Thus, despite the sunk cost, the best
option is always to switch immediately to the optimal course of action. In medicine,
sunk cost bias could take the form of a doctor continuing an ineffective course of
medication because their patient has already spent time at their office and money
on the medicine.

4.2 Literature review


This section will analyze two studies – one on the side of the patient, and one on
the side of the doctor. Sunk cost has a complex effect on medical decision making,
and analysis must be done from both perspectives in order to evaluate the issue
completely and issue sound recommendations.
A 2010 study by Coleman analyzed the sunk cost effect on university
undergraduates by having them run a computer program, which simulated
spending one of three things (money, time, or effort) in three different quantities
(under, on, or over budget) to book sessions with a chiropractor [Col10]. Then,
an option for a slightly more effective treatment for free was revealed, and the
students decided whether they would cut their losses or invest more in the hope
of the sessions starting to work [Col10]. When the students invested money,
ANOVA revealed a strong positive correlation overall, with students
who spent more money willing to invest more time into the sessions. Invested time
did not show an effect with ANOVA, although a power analysis indicated a 90%
probability of detecting a medium effect size had one existed. Finally, previously
invested effort showed the strongest correlation of all; future willingness to
invest increased steadily with past effort invested [Col10]. Coleman concluded
that money invested produced a sunk cost effect, while effort produced a similar
relationship but due to a different cognitive mechanism [Col10].
A 2012 study by Braverman and Blumenthal-Barby analyzed the effect of sunk
cost on doctors, and its implications for clinical decision making [BBB12]. The
study involved 389 healthcare providers, who were each given one of four
hypothetical clinical scenarios, and asked to give a 1-5 answer, where 1 was a
strong recommendation to discontinue treatment and 5 was a strong
recommendation to continue treatment. All scenarios involved an unsuccessful
medical treatment but varied in investment. The first scenario involved
investment of money, the second time, the third both, and the fourth neither
[BBB12]. The expected result consistent with the sunk cost effect would have been
for the doctors in the scenarios with the most investment to recommend
continuation. However, the opposite transpired: the doctors in the scenario
with no investment were the most likely to recommend continuing the treatment,
an “overcompensation” for the sunk cost effect [BBB12]. In spite of this, 11% of
those surveyed stated that they would recommend continuing the treatment –
which the authors describe as “unrealistic optimism”. The authors hypothesized
that “the participants’ response to the scenario given in the study may not be
reflective of their behaviour when faced with a similar situation in practice” due
to the “close-ended nature of the available responses” and concluded that “further
research is necessary” [BBB12].

4.3 Implications and recommendations


The sunk cost fallacy affects both patients and doctors, which makes it a
particularly tricky problem to solve. With invested money and effort, patients show
clear evidence of a sunk-cost or sunk-cost-like effect [Col10], but the evidence in
the case of doctors is much less conclusive [BBB12]. The vast majority of
doctors seem to overcompensate for the sunk cost effect in theoretical scenarios,
which is not necessarily a negative, as it might provide an effective counter for the
fallibility of patients [BBB12]. Indeed, given the extent of the effect of sunk cost on
patients, it might be the best course of action for doctors to overcompensate to a
greater extent against their own sunk cost effects in order to fight those of their
patients [Col10]. However, their behavior in practice remains unknown, due to a
lack of observational studies [BBB12]. Moreover, a significant portion of doctors
still display “unrealistic optimism” in pursuing clearly unsuccessful courses of
action [BBB12].
Thus, the only concrete recommendation that can be made regarding sunk cost
bias is for medical establishments to commission research in the area, as the
current evidence makes it impossible to know the extent to which doctors are
affected by it in the real world. However, there is no downside to conducting
campaigns aimed at both doctors and the general public which promote awareness
of the effect and of how to reduce its negative impact.

5 Omission and status quo biases


5.1 Defining the omission and status quo biases
Ritov and Baron define an omission bias as occurring when a decision-maker
prefers a harmful outcome resulting from inaction to a less harmful one involving
an action [RB92]; status quo bias is defined as a preference to maintain one’s state
as opposed to changing it in any way [SZ88]. These biases are closely related in the
field of medicine, and indeed elsewhere; inaction often leads to a worse outcome
than taking action [ea05]. An example of the status quo bias in medicine would be
a doctor choosing not to prescribe a patient a new, improved medication as the
patient had been on the previous medication for several years; one of omission
could involve not treating a patient who is having a heart attack as they are being
treated for pneumothorax already.

5.2 Literature review


This section will analyze two studies; one focused on omission bias and one on
status quo bias.
A 2005 study by Aberegg et al. focused on the impact of the omission bias on
medical decision making. The study surveyed 500 randomly selected
pulmonologists from the American College of Chest Physicians, of whom 125
responded [ea05]. The study involved the creation of two pairs of case vignettes,
which contained one option relating to keeping a status quo, and one with a course
of action involving either action or omission depending on the form [ea05]. In the
first case described, participants were almost twice as likely to pick the same
option when it was presented as an omission as opposed to an action [ea05]. The
second case also showed a trend nearly as strong, but the third did not – which the
paper hypothesizes may be due to the “perceived psychological burden of the
decision it involved”. The study concluded that pulmonologists “may be
susceptible to cognitive biases such as omission and status quo bias” and that the
“suboptimal decisions” made as a result of this could have “far-reaching
implications for patient outcomes, cost-effectiveness, clinical practice variability,
and medical errors.” [ea05].
A 2021 study by Camilleri and Sah focused on analyzing the effect of the
status quo bias on physicians and the general population in Australia. It involved
giving 302 physicians and 733 non-physicians three scenarios: one related to
medicine and two unrelated to it [CS21]. There were two versions of each question;
one with a status quo option and one without it, and participants were randomly
allocated one version. After completing the survey, participants were also asked to
state their confidence in their decision on a five-point scale [CS21]. The results of
the study showed that, in the medical scenario, physicians and the non-medical
population surveyed both exhibited more status quo bias than in the other
scenarios. The physicians, however, proved significantly more susceptible to this
bias in medical scenarios than the non-physicians; they showed a 35% absolute
increase in preference for an option presented as the status quo, while the non-
physicians only showed 18% [CS21]. The authors of the study suggest that this is
because “experts often review a decision made by a prior expert”, which is why it
was not present in “non-expert domains”. The paper cautions that this may lead to
“the treatment patients receive being suboptimal” and states that “it is important
that physicians not fall prey to the status quo bias just because their colleague has
reviewed the patient themselves” [CS21]. In order to reduce the impact of the
status quo bias on medical decision making, the paper suggests making physicians
“effectively ‘blind’ to prior treatment decisions”, making primary care
physicians “unaware that their first treatment decision will be reviewed by
another”, or, finally, asking physicians to “consider why the preferred option may
be wrong” [CS21].

5.3 Implications and recommendations


Both omission bias and status quo bias affect physicians to a very significant
extent and are therefore likely propagated by systemic factors within the medical
establishment. If initial treatment plans are correct, these biases do not
necessarily cause any problems – however, if an incorrect diagnosis or
prescription is made in the first place, omission and status quo biases threaten to
keep patients from getting the treatment they need [ea05]. These biases are
potentially changing lives for the worse every day: take the example of a woman
who would have received an unnecessarily disabling colostomy had she not sought
a second opinion from a doctor resistant to status quo bias, as presented by
Camilleri and Sah [CS21].
There are several courses of action which could have the capability to reduce
the impact of the omission and status quo biases on medical decision making. As
suggested by Camilleri and Sah, making previous decisions invisible to physicians
would likely remove the impact of these biases, as is the case with consulting an
uninvolved colleague [CS21]. This, however, has the disadvantage of likely adding
significant cost and hassle to the medical process.
Alternatively, integrating artificial intelligence to provide a constant “second
opinion” to doctors could possibly go a long way towards alleviating these biases,
assuming its evolution proceeds at current rates without being impeded by
currently unknown technological limitations [ea21]. This AI would analyze the
case without any weight being placed on previous investment into treatment plans,
thereby being free from the omission and status quo biases and being able to
alleviate the effects of these biases upon the doctor. However, the same limitations
mentioned in Section 2.3 would apply: the AI would not be able to completely
counter a biased doctor and would instead function as an aid, and precautions
would need to be taken during its implementation to ensure the minimization of
systemic bias.

6 Conclusion
While most doctors get the vast majority of their diagnoses and prescriptions right,
the consequences of failure are so severe that any rate of misdiagnosis and failure
to pursue optimal courses of action is too high. In order to make our medical
decision making process as sound as possible, the impact on that process of the
cognitive shortcuts and biases that doctors utilize, such as availability, anchoring,
omission bias, and status quo bias, must be minimized through personal and
systemic change. This change would be required on the parts of patients, doctors,
and the medical establishment, and would involve awareness campaigns, doctor
training, and additional measures intended to help doctors make more objective
judgements. However, the impacts of biases like the sunk cost effect remain
under-researched and largely unknown, and real-world observational studies
must be conducted in order to reveal their effects and develop recommendations.

References
[BBB12] Jennifer A. Braverman and J.S. Blumenthal-Barby. Assessment of the
sunk-cost effect in clinical decision-making. Social Science & Medicine,
2012.
[Col10] Martin D. Coleman. Sunk cost and commitment to medical treatment.
Current Psychology, 2010.
[CS21] Adrian R. Camilleri and Sunita Sah. Amplification of the status quo bias
among physicians making medical decisions. Applied Cognitive
Psychology, 2021.
[ea85] A. E. Voytovich et al. Premature conclusions in diagnostic reasoning.
Academic Medicine, 1985.
[ea91] Norbert Schwarz et al. Ease of retrieval as information: Another look at
the availability heuristic. Journal of Personality and Social Psychology,
1991.
[ea99] Brian H. Bornstein et al. Rationality in medical treatment decisions: Is
there a sunk-cost effect? Social Science & Medicine, 1999.
[ea05] Scott K. Aberegg et al. Omission bias and decision making in pulmonary
and critical care medicine. Chest, 2005.
[ea08] Omar Merlo et al. Heuristics revisited: Implications for marketing
research and practice. Marketing Theory, 2008.
[ea19] Drew Roselli et al. Managing bias in AI. Companion Proceedings of The
2019 World Wide Web Conference, 2019.
[ea20a] Hannah T. Neprash et al. Measuring primary care exam length using
electronic health record data. Medical Care, 2020.
[ea20b] Ping Li et al. Availability bias causes misdiagnoses by physicians: Direct
evidence from a randomized controlled trial. Internal Medicine, 2020.
[ea21] Silvana Secinaro et al. The role of artificial intelligence in healthcare: a
structured literature review. BMC Medical Informatics and Decision
Making 21, 2021.
[ea22a] Helen Dargahi et al. Anchoring errors in emergency medicine residents
and faculties. Medical Journal of the Islamic Republic of Iran, 2022.
[ea22b] Kwaku Kyere et al. Availability bias and the COVID-19 pandemic: A case
study of Legionella pneumonia. Cureus, 2022.
[KT73] Daniel Kahneman and Amos Tversky. On the psychology of prediction.
Psychological Review, 1973.
[LB11] David Levine and Alan Bleakley. Misdiagnosis: Analysis based on case
record review with proposals aimed to improve diagnostic processes.
Clinical Medicine, 2011.
[Ly21] Dan P. Ly. The influence of the availability heuristic on physicians in the
emergency department. Annals of Emergency Medicine, 2021.
[RB92] Ilana Ritov and Jonathan Baron. Status-quo and omission biases. Journal
of Risk and Uncertainty, 1992.
[RH21] Rita W. Rehana and Najia Huda. A common heuristic in medicine:
Anchoring. Annals of Medical and Health Sciences Research, 2021.
[SZ88] William Samuelson and Richard Zeckhauser. Status quo bias in decision
making. Journal of Risk and Uncertainty, 1988.
[TK74] Amos Tversky and Daniel Kahneman. Judgment under uncertainty:
Heuristics and biases. Science, 1974.


Mesenchymal Stem Cell-Derived Exosomes and
Their Therapeutic Potential on Parkinson’s
Disease
Dorsa Arbabha 1†

October 14, 2023

Abstract
Parkinson’s disease (PD) is characterized by the degeneration of
dopaminergic neurons in the substantia nigra, resulting in dopamine depletion
and a spectrum of motor and non-motor symptoms in patients. Mesenchymal
stem cells (MSCs) have garnered attention for their therapeutic potential across
various diseases. They can differentiate into various cell types, including
dopaminergic cells, and secrete neurotrophic and anti-inflammatory factors
with robust neuroprotective properties. In PD, midbrain dopaminergic neurons
express miR-133b, a crucial regulator of tyrosine hydroxylase and dopamine
transporter synthesis. MSCs facilitate interactions with brain parenchymal cells
by transferring miR-133b via exosomes, promoting neurite outgrowth and
functional recovery. Notably, studies have demonstrated elevated levels of
dopamine and its metabolites in the striatum of PD rats following treatment with
these exosomes. This review examines mesenchymal stem cell-derived
exosomes, their unique attributes, and their potential as a promising therapeutic
avenue for PD.

Key Terms: Parkinson’s disease, substantia nigra, dopamine, stem cells,
mesenchymal stem cells, exosomes, MSC-EXOs, nanovesicles, microRNAs

1 Introduction

Parkinson’s disease (PD) is a neurodegenerative disease known for its relentless
attack on a specific type of brain cell, the dopaminergic neurons of the
substantia nigra. This attack leads to a shortage of dopamine in the brain, causing
a wide range of motor and non-motor symptoms. PD puts a heavy burden on
thousands of patients every year, and no cure for the disease has yet been found
[Elb16]. There remains a demand for new and innovative treatments in the field.

1 Advised by Dr. Arij Daou, University of Chicago
† Cagaloglu Anatolian High School

Scientists in the field of regenerative medicine are exploring the therapeutic
potential of mesenchymal stem cells (MSCs). These cells are highly potent,
being able to differentiate into various types of cells, including the dopaminergic
neurons lost in PD patients. Furthermore, MSCs release substances that can
protect neurons and reduce the inflammation caused by a variety of neurological
disorders [Ven17]. This review closely examines the connection between PD and
MSCs, focusing on exosomes derived from mesenchymal stem cells (MSC-EXOs).

MSC-EXOs are nanovesicles released by MSCs, containing substances such as
microRNAs, growth factors, and anti-inflammatory molecules [Yu14]. These
substances make MSC-EXOs particularly well suited to protecting nerve cells.
In particular, the microRNA miR-133b plays a crucial role in controlling dopamine
production. In PD, MSCs have been shown to act as messengers by
passing on miR-133b via MSC-EXOs, which in turn encourages the growth of nerve
fibers and the recovery of neurological function [Xin12].

This review aims to summarize how MSC-EXOs can be used in treating PD,
explaining the science behind the existing successful studies and laying the
groundwork for future research and development in this field.

2 Parkinson’s disease and MSC-EXOs

2.1 Parkinson’s disease

Parkinson’s disease (PD) is the second most common progressive
neurodegenerative disease affecting elderly people, with an estimated prevalence
of 1 to 2 per 1000 people at any given time. The prevalence
of PD increases with age, reaching approximately 1 percent of the
population above 60 years [Tys17]. The disease is initiated by the loss or
degeneration of dopaminergic neurons located in the substantia nigra region of
the midbrain, alongside the formation of Lewy bodies. These Lewy bodies are
associated with abnormal deposits of the alpha-synuclein protein in the brain,
which damage the brain’s cognitive abilities and can trigger dementia [niand].
PD risk factors include aging, family history, pesticide exposure, and
environmental chemicals [Bei14]. PD displays both motor and non-motor
symptoms, ranging from tremors and rigidity to depression and anxiety. Patients
with PD also face an elevated risk of developing dementia. Although surgical
procedures like deep brain stimulation (DBS) and various pharmaceutical
therapies have become more common in recent decades, there remains a pressing
need for the development of effective and accessible disease-modifying
medications.

2.1.1 Clinical Presentation

Currently, there is no conclusive diagnostic test to detect PD, necessitating clinical
criteria for diagnosis. The four main PD symptoms are rest tremor, bradykinesia,
rigidity, and loss of postural reflexes [Jan08]. These presentations can be used to
distinguish PD from other motor and neurodegenerative diseases. The clinical
presentation of PD can be categorized into motor and non-motor symptoms.
Beyond the motor symptoms, individuals with PD may also manifest a spectrum
of non-motor symptoms, such as autonomic dysfunction, neuropsychiatric
abnormalities, sleep disturbances, sensory abnormalities, anosmia, and dementia
[Pfe16].

a. Motor Symptoms

Bradykinesia can be described as a notable slowness of movement in affected
patients. The factors involved with bradykinesia are muscle weakness,
rigidity, tremor, movement variability, and slowness of thought. Patients find it
difficult to control the accuracy and speed of their movements [Ber01]. Rigidity is
another defining characteristic of PD, defined by a significant,
velocity-independent increase in muscle activity. Patients experience stiffness and
involuntary tightness in their muscles [Bol20]. Tremor, one of the most prevalent
motor symptoms of PD, is characterized by uncontrolled shaking in the patient’s body.
In PD, tremors typically occur at rest, a condition known as ‘tremor-at-rest.’
Though tremor begins asymmetrically, both sides of the body
can become affected as the disease progresses [Abu22]. Gait dysfunctions also
emerge during the early stages of PD, including symptoms such as a slowing of gait,
lack of arm swing, shorter steps, postural instability, and reduced trunk
movements. Patients are at greater risk of falls due to this gait dysfunction, with
studies showing that 70 percent of PD patients fall at least once a year, while
39 percent suffer from recurring falls [Kim18].

b. Non-Motor Symptoms

Although PD is still primarily diagnosed based on motor symptoms,
neuropsychiatric signs and symptoms are increasingly recognized to have equal
relevance in many patients. As a result, PD can now be conceptualized as a
complex neuropsychiatric disorder. The indicators and symptoms fall into three
major categories: affect (such as anxiety and depression), perception and thought
(such as psychosis), and motivation (such as impulse control disorders and
apathy). [Wei22]. Another non-motor manifestation of PD is autonomic
dysfunction, which includes gastrointestinal dysfunction, cardiovascular
dysregulation, urinary dysfunction, sexual dysfunction, thermoregulatory aberrance,
and pupillo-motor and tear abnormalities. [Che20b]. The regulation of sleep and
wakefulness depends on the coordinated and highly complex operation of
numerous brain regions and neurotransmitters, many of which have been
demonstrated to be compromised in people with PD. Given this
pathophysiological context, it is not unexpected that sleep and wakefulness
problems are virtually always present in PD patients. [Ste20]. A survey study in
1988 revealed that 98 percent of PD patients had disabilities at night or upon
waking since the onset of their disease, and disturbed wakefulness regulation was
shown to be a prominent feature in up to 30 percent of PD patients. [lee88].
Additionally, sensory symptoms are prevalent in PD, which generally tend to affect
the side of the body that was first or more severely disrupted by the motor
fluctuations. These symptoms include musculoskeletal pain, dystonic pain,
akathisia, CNP, olfactory disturbance, and visual dysfunctions. [Zhu16]. Other
underappreciated non-motor aspects of PD are anosmia, the loss of smell, and
ageusia, the loss of the sense of taste. [Tar17]. Many PD patients also suffer from
cognitive deficits, with a systematic review revealing that 36 percent of newly
diagnosed patients suffer from cognitive impairment. [Aar05]. This condition,
known as Parkinson’s disease dementia (PD-D), is primarily associated with older
age at the disease onset or time of evaluation. [Han17]. In addition to cognitive
dysfunctions, the clinical features of PD-D include behavioral symptoms,
autonomic dysfunctions, sleep disorders, and parkinsonism. [Sez19].

2.1.2 Etiology and Epidemiology

PD stands as the most prevalent neurodegenerative disease after Alzheimer’s. It is
a multifactorial disease influenced by numerous risk and protective variables,
including genetic and environmental factors. While classified as a rare disease, the
incidence of PD is predicted to quadruple by the year 2030, primarily driven by
the aging demographic of the population. [Elb16].

It has been observed that men are more susceptible to PD compared to women.
The meta-analysis results of 7 door-to-door studies show that the ratio of male-
to-female PD cases is 1.49, with a 95 percent confidence interval between 1.24 and
1.95. [Woo04]. Several factors have been proposed as potential explanations for
this gender difference, including the protective effects of estrogens, the higher
frequency and intensity of occupational toxin exposure, more prevalent minor
head trauma in males, and recessive susceptibility genes of the X chromosome.
[Wir11a]. Studies also show that the risk of death in PD differs between men and
women, estimated at 2 percent for men and 1.3 percent for women. Overall, the
mortality risk was estimated at 1.7 percent after age 40, and the gender difference
decreased with the increase in age. [Elb02].

Figure 1: Prevalence and incidence of Parkinson’s disease (PD) in France in 2010.
The numbers of patients with prevalent and incident PD were estimated from
nationwide drug-claims databases based on social security systems. [Elb16].

The incidence of PD dramatically rises with age. Though it is uncommon before
the age of 50, both its incidence and prevalence exhibit an upward trajectory in
older age groups. According to a meta-analysis of prevalence studies, PD prevalence
increased from 107 cases per 100,000 individuals between the ages of 50 and 59
to 1087 cases per 100,000 individuals between the ages of 70 and 79. [Pri14]. It
has also been found that younger PD patients have a higher mortality risk than
older PD patients when compared to age-matched PD-free controls. [Pos11]. Heart
disease, pneumonia, and stroke have been recorded as the three primary causes
of death in PD. [Pen10].
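
To make the scale of this age effect concrete, the two age-band figures quoted
above can be compared directly. The following is a minimal Python sketch and is
not part of the cited meta-analysis; the function and variable names are
hypothetical, and the only inputs are the two values reported in the text.

```python
# Figures quoted in the text above [Pri14], in cases per 100,000 individuals.
CASES_PER_100K = {"50-59": 107, "70-79": 1087}


def fold_increase(younger: float, older: float) -> float:
    """Return how many times larger the older-group figure is."""
    if younger <= 0:
        raise ValueError("younger-group figure must be positive")
    return older / younger


def as_percent(cases_per_100k: float) -> float:
    """Express a 'cases per 100,000' figure as a percentage of the age group."""
    return cases_per_100k / 100_000 * 100


ratio = fold_increase(CASES_PER_100K["50-59"], CASES_PER_100K["70-79"])
print(f"Fold increase between age bands: {ratio:.1f}x")
for age_band, cases in CASES_PER_100K.items():
    print(f"Ages {age_band}: {as_percent(cases):.2f} percent of the age group")
```

On these figures, the older age band carries roughly ten times the burden of the
younger one, while still representing only about one percent of that age group.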

Curiously, a lower cancer risk in PD patients has also been observed in a
significant body of epidemiological data. This intriguing inverse association
between cancer and PD has been examined in a meta-analysis involving 29
studies. [Baj10]. One plausible theory to explain this phenomenon is that genetic
factors influencing cell-cycle control may simultaneously guard against cancer
and predispose to the development of PD. For example, the Parkin gene, one of
the genes responsible for recessive familial PD, is involved in the formation and
progression of cancer. [Xu14]. [Elb16].

Ten to fifteen percent of PD patients report that their first-degree relatives
have a history of the disease. [Tha08]. Various causative genes have been
identified, and their mutations are typically associated with an earlier age of onset.
However, only a small percentage of patients have a single mutation linked to
Mendelian disease transmission. [Ver15]. Twin studies have found low
concordance rates for the illness, except in cases where it manifests early in life.
Therefore, in a significant proportion of cases, particularly those without early
onset, PD cannot be explained by genetic factors alone.
In contrast, epidemiological and toxicological investigations yield significant
findings regarding the importance of environmental factors. [Tan99]. [Wir11b].

Figure 2: The emerging genetic architecture of Parkinson’s disease. [Cha21].

For over three decades, numerous studies have shown an inverse relationship
between smoking and PD. A meta-analysis of 44 case-control and four cohort
studies has revealed that ever-smokers had a 60 percent lower risk of PD than
never-smokers. [Her02]. Similarly, a study of individual data from eight
case-control studies and three cohort studies from the US corroborated this
relationship. [Rit07]. Remarkably, this inverse relationship has been observed
even among subjects exposed to pesticides. [Gal05]. [Bre16]. According to another
cohort study, the duration of smoking appeared to be more significant than the
intensity of smoking. This finding is important because it shows that several years
of smoking may be necessary before a decreased risk of PD becomes apparent.
[Che10a]. However, cohort studies of PD patients indicate that smoking has little
impact on the progression of the illness. [Alv04].

According to a meta-analysis of eight case-control studies and four cohort
studies, coffee drinkers had a 30 percent lower chance of developing PD than
non-drinkers. [Her02]. This association was independent of cigarette smoking,
and risk reduction increased with daily coffee consumption. Additionally, a recent
cohort study found that individuals who consumed more than three cups daily
exhibited a 40 percent lower risk of developing PD. [Sä08].

The idea that PD and pesticide exposure were related first surfaced when many
cases of Parkinsonism followed intravenous MPTP injections in the early 1980s. In
dopaminergic cells, MPTP is converted into 1-methyl-4-phenylpyridinium (MPP+), a
mitochondrial respiratory chain inhibitor with neurotoxic characteristics. The
molecule resembles the chemical structure of paraquat, a nonselective herbicide that
has been in use since the 1960s and remains extensively employed. Given these
observations, numerous research studies have delved into the connection between
farming, pesticide exposure, and PD. [Elb16]. According to a meta-analysis of 46
studies, people exposed to pesticides have an approximately 1.6 times increased
chance of developing PD. Among different types of pesticides, herbicides and
insecticides have a stronger connection with PD despite significant study
heterogeneity. While fungicides exhibited a weaker correlation, less research has
investigated their potential link to the disease. [vdM12].
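
The effect sizes reported in this section are stated in different forms: a
percentage reduction in risk for smoking and coffee, and a multiplicative increase
for pesticides. As a rough illustration only, the sketch below places them on a
common relative-risk scale. It assumes that "X percent lower risk" can be read as
a relative risk of 1 - X/100; the cited meta-analyses may report odds ratios or
other measures that only approximate relative risk. The function and variable
names are hypothetical and are not taken from the cited studies.

```python
def relative_risk_from_percent_change(percent_change: float) -> float:
    """Read 'X percent lower risk' (negative X) or 'X percent higher risk'
    (positive X) as a relative risk. Illustrative assumption only."""
    return 1.0 + percent_change / 100.0


def percent_change_from_relative_risk(relative_risk: float) -> float:
    """Inverse conversion: a relative risk of 1.6 is a 60 percent increase."""
    return (relative_risk - 1.0) * 100.0


# Values quoted in the text above; nothing here is new data.
REPORTED_EFFECTS = {
    "ever-smoking vs. never-smoking [Her02]": relative_risk_from_percent_change(-60),
    "coffee drinking vs. non-drinking [Her02]": relative_risk_from_percent_change(-30),
    "more than three cups of coffee daily": relative_risk_from_percent_change(-40),
    "pesticide exposure [vdM12]": 1.6,
}

for exposure, rr in REPORTED_EFFECTS.items():
    change = percent_change_from_relative_risk(rr)
    direction = "lower" if change < 0 else "higher"
    print(f"{exposure}: relative risk ~ {rr:.2f} "
          f"({abs(change):.0f} percent {direction})")
```

Read this way, values below 1 correspond to the inverse associations described
for smoking and coffee, and values above 1 to the increased risk described for
pesticide exposure.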

2.1.3 Pathology

The substantia nigra pars compacta (SNpc), an essential component of the basal
ganglia, is the primary brain region damaged by PD. This region is predominantly
made up of neurons that secrete dopamine, a brain monoamine that primarily
serves as an inhibitory neurotransmitter. In a healthy brain,
dopamine controls the excitability of striatal neurons, which are essential in
regulating the balance of bodily movement. However, in PD, dopamine levels
decrease, and SNpc dopamine neurons deteriorate. [Ger89]. Low levels of
dopamine result in reduced inhibition of striatal neurons’ activity, allowing them
to fire excessively. This underlying mechanism elucidates why individuals with PD
are unable to control their movements, experiencing tremors, stiffness, and
bradykinesia, which are the hallmarks of PD-related motor symptoms.

Figure 3: Neuronal circuits and neurotransmission mechanisms of control in the
brains of normal individuals and those with Parkinson’s disease. a: Neuronal
circuit in basal ganglia in normal brain. b: Degeneration of substantia nigra pars
compacta (SNpc) impairs the cortico-striatal circuit in PD brain. [Mai17].

Serotonin (5-HT), in addition to dopamine, is a critical substance in the
development of PD. This neurotransmitter is particularly implicated in several
motor and non-motor symptoms, such as tremors, cognitive deterioration,
depression, and psychosis, as well as L-DOPA-induced dyskinesia. [Huo17].

Another neurotransmitter, acetylcholine (ACh), essential for cognitive function,
experiences dysregulation in a number of neurological disorders like PD and
Alzheimer’s disease (AD).
The nucleus basalis of Meynert (nbM), a wide band of cell clusters that are
primarily cholinergic in nature, is located in the basal forebrain subventricular
region. The nbM of individuals with PD, AD, or other kinds of dementia has shown
various patterns of neuronal death, which strongly supports the concept that the
cholinergic system is involved in PD. [Liu15].

Gamma-aminobutyric acid (GABA) is an inhibitory neurotransmitter that,
directly through GABAergic receptors and indirectly through the astrocyte
network, regulates calcium (Ca++) influx. This Ca++/GABA mechanism stabilizes
neuronal activity at both the cellular and systemic levels, whereas Ca++
excitotoxicity leads to neuronal death in the SNpc. [Hur13]. The Ca++-buffering
system is, in turn, regulated by GABA activity.
Approximately 80 percent of patients recently diagnosed with PD exhibit impaired
olfaction caused by dopamine neuron loss in the olfactory bulbs. [Ste12].
Glial cell-derived neurotrophic factor (GDNF), likewise controlled by the Ca++/GABA
system, regulates the activity of dopamine neurons in the midbrain and the
olfactory system. Additionally, GDNF serves as a potent chemo-attractant for
dopaminergic axons and GABAergic cells. When GDNF was delivered to GABAergic
neurons in the striatum rather than the SNpc, neuroprotective benefits in PD
animal models were seen, indicating that collapse of the GABA/Ca++ pathway is
involved in dopamine-neuronal death in PD. [Iba17].

The intracellular buildup of Lewy bodies in dopamine neurons of the SNpc,
which contain misfolded aggregates of alpha-synuclein (SNCA) and other related
proteins, is one of the defining pathological features of PD. [Car14]. Several
molecular, genetic, and biochemical studies have shown that post-mortem brains
from patients neuropathologically diagnosed with mixed dementia with Lewy
bodies (DLB) and PD with dementia (PDD) frequently contain a variety of
misfolded protein aggregates, including p-tau, A-beta, and SNCA.
(Stefanis, 2012). According to research, amyloid deposition in some PD patients’
brains has been associated with cognitive decline without dementia, indicating
that amyloid contributes to cognitive but not motor decline over time. [Iba17]. It
has also been discovered that the load and extent of A-beta pathology influence
cognitive deficits in PDD and DLB. [Bla16]. Alpha-synuclein or other misfolded
amyloid proteins can kill neurons by creating a pore in the membrane and
inducing neuroinflammation, excitotoxicity, oxidative stress, and energy failure.
[Mar12]. Oxidative stress has been associated with various general mitochondrial
abnormalities, including variations in the dynamics and shape of the mitochondria,
mutations in the mitochondrial DNA, and abnormalities in calcium homeostasis.
Dysfunctional mitochondria can result in decreased energy production, the
creation of reactive oxygen species, and the activation of stress-induced apoptosis.
[Sub13].

Figure 4: Schematic diagram showing the steps that cause an accumulation of
SNCA. Natural SNCA becomes misfolded under stress and is deposited as
oligomers, small aggregates, or fibrils, which play a significant role in DA-neuronal
loss in PD. [Mai17].

Neurofibrillary tangles, a hallmark of the pathology of numerous
neurodegenerative diseases, including Alzheimer’s disease, frontotemporal
dementia with parkinsonism (FTDP), and progressive supranuclear palsy (PSP),
can be produced by hyperphosphorylated tau (p-tau) protein. [SM08]. FTDP is
linked to chromosome 17 (FTDP-17), and the cortex and SNpc region exhibit
p-tau accumulation. The development of sporadic PD is frequently
linked to the colocalization of p-tau with Lewy bodies. Similar to FTDP, an
increase in the accumulation of p-tau is brought on by a mutation in the gene
encoding the microtubule-associated protein tau (MAPT). [Ari99]. This
accumulation leads to the formation of neurofibrillary tangles. Although these
tangles are most closely associated with Alzheimer’s, they can co-localize with
alpha-synuclein in Lewy bodies and play a significant role in the disruption of
dopamine-neuronal architecture, ultimately resulting in rapid degeneration and
death of dopaminergic neurons. [Hep16].

2.1.4 Existing Treatments

Medication is the most common treatment for PD patients, with patients
receiving different doses of several generic drugs. Although there is still no
definitive cure for PD, some drugs can help slow the course of the
disease or alleviate some of the symptoms. PD drugs are generally categorized as
dopaminergic and non-dopaminergic drugs.

Dopaminergic drugs are typically prescribed by doctors to PD patients in an
effort to raise dopamine levels. Levodopa, or L-DOPA, is a popular dopamine
precursor, used because dopamine itself cannot penetrate the blood-brain barrier.
Levodopa is particularly effective in lowering “resting tremors” and other main
symptoms, but it cannot restore or replace dopaminergic neurons that have
deteriorated or halt the progression of PD. Common side effects of levodopa
include nausea, vomiting, hypotension, restlessness, drowsiness, or a sudden
onset of sleep. [Bar69]. [Mai17]. Other popular and well-tolerated
dopaminergic drugs include selegiline and rasagiline, also known as MAO-B
inhibitors. The catalytic enzyme monoamine oxidase-B (MAO-B), whose level is
elevated in the brain of PD patients, may be the cause of the decreased dopamine
levels in PD. When used with levodopa, MAO-B inhibitors have been observed to
prolong levodopa’s effects for a year or more. [Rie04]. Similarly, COMT inhibitors,
which inhibit the catechol-O-methyl transferase (COMT) enzyme that is indirectly
responsible for the breakdown of dopamine, can also prolong levodopa’s
effectiveness; entacapone and tolcapone are widely used examples. [Ant08].
Another type of dopaminergic drug is
dopamine agonists. These medications are most helpful in the early stages of PD
and can raise dopamine levels in the brain. They can also be used in the late stages
of PD to extend levodopa’s effectiveness. Pramipexole and ropinirole are two of
the most frequently used dopamine agonists for PD patients. However, they are
typically less effective than levodopa in decreasing symptoms and may have a long
list of side effects. [Bro00].

Non-dopaminergic drugs are typically antidepressants used for managing
non-motor symptoms in PD, including depression and anxiety. One of the most popular
medications for treating anxiety in PD patients is benzodiazepine, although it has
some associated side effects. [Che14]. Similarly, clozapine is given to treat
dyskinesia in PD; however, this drug also has many side effects, including
agranulocytosis. [Dur04].

The majority of the drug treatments mentioned have significant side effects and
provide only momentary relief, particularly for certain patient types. They are
also powerless to halt additional dopaminergic neuron loss. Therefore, some
clinicians turn to surgical procedures to lessen the motor symptoms when
medication is deemed ineffective, especially in the late stages of the disease.

In PD, a number of basal ganglia nuclei become dormant or dysfunctional. Deep
brain stimulation (DBS), the surgical placement of very small electrodes in these
regions, can be employed to maintain their functional activity. Target areas for DBS
include the thalamus, globus pallidus interna, or subthalamic nucleus, where the
electrodes are placed in one or both hemispheres. The electrical pulses are
produced by implanted batteries, which can be programmed according to the
particular requirements of the PD patient and can be checked, changed, or
recharged as necessary every three to five years. Many of the
primary motor symptoms of PD can be alleviated by DBS, which reduces reliance
on levodopa and helps to manage dyskinesias. However, it is important to note
that DBS must be surgically implanted, which carries risks of infection, speech or
balance issues, stroke or bleeding, and other potential complications. DBS is also
ineffective for addressing psychological, cognitive, or other non-motor problems.
[Her16].

Figure 5: Electrode implantation for Deep Brain Stimulation. [Oku12].

Although most PD cases are sporadic in origin, some have been shown to be
caused by genes associated with the disease, so researchers have
been investigating gene therapy strategies as a potentially viable treatment option.
[Cou12]. Despite these efforts, further research and trials are needed in this field
to show the viability of this type of treatment.

A promising method for treating PD involves the transplantation of neural
stem cells into the patient’s brain. Researchers have developed a technique to
produce dopaminergic neurons from mouse embryonic stem cells and transplant
them into the striata of animals with PD. This method involves modulating several
growth factors. Intriguingly, these transplanted neurons in the animal model of PD
survive, integrate into the existing brain circuitry, and reverse the behavioral
abnormalities. [Kim11]. Researchers can produce even more dopaminergic
neurons to transplant into the brains of mice with PD by overexpressing Nurr1, a
transcription factor for developing dopaminergic neurons, in embryonic stem
cells. [Roy04]. A rise in dopamine levels has also been seen after the grafting of
human fetal-derived dopaminergic tissues into the striatum of PD patients,
indicating that the implanted stem cells can survive and develop into
dopaminergic neurons. [Lin11]. Similarly, allogeneic human fetal ventral
mesencephalic (FVM) tissue transplantation in PD patients has shown remarkable
therapeutic advantages. [Hau99].

Mesenchymal stem cell (MSC) transplantation has also been shown in
studies to ameliorate PD-related motor dysfunctions. In studies conducted
with PD rat models, the systemic infusion of human MSCs resulted in a significant
reduction in uncoordinated limb movement as observed in behavioral tests. This
improvement was associated with increased dopaminergic neurons and raised
dopamine levels in the striatum of MSC recipients, pointing to a restorative
function of MSCs. [Bou08].
Similarly, direct striatal injection of MSCs led to increased locomotor activity,
boosted neurogenesis, and stimulated neuroblast migration in mouse models of
PD. [Off07]. Furthermore, alpha-synuclein transmission has also been shown to be
inhibited by MSC therapy in a PD model. [Oh16].

Before stem cell therapy can be authorized as a viable treatment for people
with PD, additional research must assess its safety and effectiveness. One concern
lies in the self-replicating ability of stem cells, which carries the risk of tumor
formation after clinical transplantation. [Zha23]. In this regard, stem cell
exosomes could be used as an alternative option, as explained in the upcoming
sections of this review.

2.2 Mesenchymal Stem Cell-Derived Exosomes

2.2.1 Stem cells

Stem cells are a class of cells that have long-term self-renewal abilities
and can differentiate into other, more specialized cell types. Throughout this
differentiation process, these cells maintain their DNA
structure while exhibiting distinct gene expression patterns in their specialized
roles. [Kol13]. Stem cells are distributed throughout nearly every adult organ,
where they are responsible for replacing the cells lost within these organs and
responding to any injury or disease in the tissue. In their differentiation pathways,
there are intermediate or progenitor states. These progenitor cell states can
influence the behavior of the cells surrounding them. Additionally, stem cells can
be engineered and modified in vitro to be differentiated into desired cells.
Leveraging these unique properties, stem cells have been widely researched in
tissue engineering and cell therapy fields. [Bac18].

The potential of stem cells differentiating into specialized cell types is known
as stem cell potency. Potency defines the ability of stem cells to adopt a different
phenotype. Stem cells can be categorized by their potencies as totipotent,
pluripotent, multipotent, and unipotent. [Kol13]. Totipotent stem cells are relatively rare and
initially present in low amounts in the zygote. These stem cells can differentiate
into every cell type in the body and the placenta. Pluripotent stem cells are found
in the blastocyst and can differentiate into all body cell types other than the
placenta. Multipotent stem cells are more specialized and are found in three germ
layers: the ectoderm, endoderm, and mesoderm. They differentiate into different
cell types according to the germ layer that they originate from. In contrast,
unipotent stem cells exhibit long-term self-renewal and can reproduce in large
amounts. However, these cells are committed to differentiating into one specific
cell type. [Arb23].

Mesenchymal Stem Cells

MSCs are stromal cells that exhibit multilineage differentiation and have the
ability of self-renewal, akin to other types of stem cells. MSCs can be extracted
from various tissues, including adipose tissue, bone marrow, menstrual blood,
endometrial polyps, and the umbilical cord. [Din07]. These sources are the most
useful for experimental and potential clinical applications because of their
ease of extraction and yield. Owing to this ease of harvest, MSCs also raise fewer
ethical issues than other stem cells, such as induced pluripotent stem cells and
embryonic stem cells. [Din11].

Under particular in vitro circumstances, MSCs can develop into diverse lineages
of mesodermal, ectodermal, and endodermal cells, including bone, fat,
cartilage, muscle, neuronal, islet, and liver cells. [Ois09]. Additionally,
genetic processes involving transcription factors control differentiation:
regulatory genes that commit progenitor cells to a particular lineage can direct
differentiation along a specific phenotypic route. [Bac18]. A
microenvironment created with biomaterial scaffolds can offer MSCs the ideal
circumstances for proliferation and differentiation in addition to growth factors
and induction chemicals. [Ser04]. Research has also found that adult human
MSCs can easily and directly be developed into dopaminergic neurons. [Kha19].
[Ven17].

2.2.2 Exosomes

Exosomes, tiny organelles surrounded by a single membrane, harbor a distinct
array of proteins, lipids, nucleic acids, and glycoconjugates. In the cell, they are
available in the nucleus and cytoplasm and take part in RNA processing. Exosomes
can be secreted by B and T cells, dendritic cells, mesenchymal stem cells, epithelial
and endothelial cells, and cancer cells. [Kal20]. Exosomes are capable of
remodeling the extracellular matrix and transmitting signals and molecules
between cells once they are released from the host cell.

Figure 6: Virtually all cells release exosomes, most commonly identified by the
tetraspanins CD9, CD81, and CD63 on their surface. Exosomes carry molecules
such as proteins, RNA, or DNA and mediate cell-to-cell communication. [Neund].

Due to their extremely small size, exosomes can easily cross compartments
and membranes. Their mediation of cell-to-cell interaction plays a
significant role in human metabolism and health, including the development of
immunity and the maintenance of homeostasis, the onset of malignancy, and the
development of numerous diseases. Viruses and other evading particles can use
these vesicle pathways to spread their infections. One remarkable attribute of
exosomes lies in their ability to be harnessed for targeted interventions and drug
delivery. Furthermore, their capability to traverse the blood-brain barrier
positions them as an excellent drug delivery pathway. [Zho23].

Exosomes also play a crucial role in paracrine signaling and are the primary
determinant of stem cell efficacy. Cell-free exosome therapy can overcome
numerous drawbacks of stem cells, such as limited stability and inconvenient
storage. Exosomes exhibit high biocompatibility, eliminating the risk of host
rejection and enabling precise dose control. [Gur21]. One significant feature of
exosomes lies in
their ability to transfer RNA to recipient cells and affect their proteome, functions,
and RNA expression. These processes are essential for controlling immunological
responses or various other pathological reactions through intercellular
communication. [Har13]. These RNA molecules include messenger RNA (mRNA)
and microRNA (miRNA), which affect the protein synthesis of the recipient cells in
the process of cell-to-cell communication. [Xin12].

MSC-EXOs

MSCs have been found to be able to differentiate into neural cells and
secrete several neurotrophic and anti-inflammatory substances after
transplantation, showing strong neuroprotective capabilities for diseases such as
amyotrophic lateral sclerosis, multiple sclerosis, PD, and glaucoma. [Joh10].

It is currently widely accepted that MSCs primarily exert their therapeutic
benefits through secreted trophic factors. Exosomes are thought by many
researchers to be the paracrine effectors of MSCs, given their involvement in
cell-to-cell communication. They have been tested in various disease models, and
the results have shown that they perform similar tasks to MSCs, including reducing
the size of myocardial infarctions, enabling kidney injury repair, modifying
immunological responses, and encouraging tumor growth. [Yu14].

MSC-EXOs were initially studied in a mouse model of cardiac ischemia injury
in 2010 [Che10b] and have subsequently been examined in a number of
disease models. MSCs have been shown to produce more exosomes than other cell
lines. [Reo13]. Exosomes generated from MSCs and other sources are identical in
terms of morphological characteristics, isolation, and storage conditions.
MSC-EXOs can be identified by several adhesion molecules, such as CD29, CD44, and
CD73, which are expressed on the membrane of MSCs, in addition to the general
exosome surface markers CD9 and CD81. [Lai15].

MSC-EXOs have also been investigated with regard to their miRNAs. It
has been discovered that most of the miRNAs included in MSC-EXOs are in their
precursor form. [Che10b]. MSCs influence other cells biologically by secreting
miRNAs through these exosomes. Exosomes from MSCs administered to neurons
and astrocytes cause target cells to produce miR-133b, which aids the functional
recovery process in spinal cord injury and PD. This discovery indicates that MSCs
control neurite outgrowth by delivering miR-133b to neurons and astrocytes via
exosome release. [Xin12].

3 MSC-EXOs as a treatment for Parkinson’s disease

As explained in previous sections, MSCs have the potential to become a valuable
therapeutic tool in the treatment of neurological disorders. To aid functional
recovery, they interact with brain parenchymal cells. It has been proposed that
MSCs and parenchymal cells communicate via miRNA found in exosomes. [Yu14].
MiRNAs are evolutionarily conserved, non-protein-coding transcripts of 18-25
nucleotides that post-transcriptionally regulate gene expression by inhibiting
translation and degrading mRNA. MiRNAs are a significant regulatory gene family
in eukaryotic cells. [Fio08]. They act as critical factors in various regulatory
systems, including host-pathogen interactions, developmental timing, stem cell
differentiation, proliferation, apoptosis, and tumorigenesis in animals. [Lim10].
MicroRNA 133b (miR-133b) is expressed in midbrain dopaminergic neurons. It
controls the synthesis of tyrosine hydroxylase and the dopamine transporter in
people with PD. [Dre10]. Furthermore, a study used morpholino antisense
oligonucleotides to inhibit the expression of miR-133b following spinal cord injury
in adult zebrafish and found that locomotor recovery was significantly
hampered and that the decrease in miR-133b expression inhibited the
regeneration of axons from neurons. [Yu11]. It has thus been concluded that
miR-133b aids functional recovery in cases of PD and spinal cord injury,
although its efficacy in cases of cerebral ischemia has yet to be investigated.
[Xin12]. [Li21].

A study has found that MSC treatment dramatically increased the levels of
miR-133b in the ipsilateral hemisphere of rats that had undergone middle cerebral
artery occlusion (MCAo). In vitro, exosomes from MSCs that had been exposed to
ipsilateral ischemic tissue extracts from rats that had undergone MCAo showed a
substantial increase in miR-133b levels, and miR-133b levels also rose
significantly in primary cultured neurons and astrocytes treated with these
exosomes. However, treatment of the astrocytes with exosome-enriched fractions
from MSCs transfected with a miR-133b inhibitor dramatically reduced miR-133b
levels. This study stands as the first evidence that
MSCs interact with brain parenchymal cells via exosome-mediated miR-133b
transfer, controlling the expression of particular genes in order to promote neurite
outgrowth and functional recovery. [Xin12]. The research team later showed that
intravenous injection of MSC-EXOs can increase axonal density and
synaptophysin-positive areas along the ischemic boundary zone of the cortex and
striatum and hasten functional recovery in the same model, confirming
that MSC-EXOs could significantly improve neurologic outcome and contribute to
neurovascular remodeling. [Xin13].

Clinical research reports claim that 98 percent of medications that could be
used to treat illnesses of the central nervous system failed in clinical trials because
of their inability to cross the blood-brain barrier (BBB). [Par12]. Exosomes
typically have a diameter between 30 and 150 nm, which is small enough to
allow them to pass across the BBB and reach the central nervous
system. [Kal20]. This is another beneficial characteristic of exosomes: their small
diameters, minimal immunogenicity, and extended circulation half-lives allow
them to be used as therapeutic signals or drug delivery vehicles. [Kal14].
A study supported this, as MSC-EXOs successfully penetrated the BBB and reached
dopaminergic neurons in the substantia nigra in an experiment on rats. MSC-EXOs
reduced the apoptotic cell death of dopaminergic neurons and the asymmetric
rotation caused by apomorphine. Additionally, in the substantia nigra of MSC-
EXOs-treated rats, degenerative and necrotic alterations in the form of profoundly
eosinophilic cytoplasms, along with pyknosis and karyolysis, were not seen. The
existence of multipolar neurons with nucleoli and basophilic granular cytoplasms
in brain tissue samples of MSC-EXOs-treated animals indicated considerable
improvement during histological inspection. Importantly, MSC-EXOs elevated
dopamine and its metabolites in the striatum, further indicating that MSC-EXO-
based therapy enhances dopaminergic neurons’ functionality in animals with PD.
[Che20a]. Exosomes’ ability to interact with cells under various standard and
pathological circumstances further suggests their crucial potential in the
treatment of PD. [Sma07]. Exosomes also play a role in synaptic plasticity, nerve
regeneration, and neuronal development. Exosomes deliver control elements to
nerve damage sites, promoting the production of new tissue and proteins. [dRV16].
According to a study, the autophagy triggered by exosomes improves motor
symptoms and increases dopamine neurons in the substantia nigra striatum
of PD mice after exosome treatment. [Che20a]. [Liu22].

4 Discussion

Despite being preliminary in terms of clinical application, stem cells have shown
significant therapeutic potential in a variety of diseases. The biggest shortcoming
of stem cells is their high instability. Because of their high potency, they carry
the risk of tumor formation in clinical applications. Due to this characteristic, there
has been a shift of focus to utilizing their exosomes in regenerative medicine.
Exosomes derived from stem cells have been reported to carry therapeutic abilities
on par with those of stem cells. As they cannot multiply on their own and are
highly adaptable, surviving in a variety of environments, they are a much more
stable option than stem cells for clinical application and have a high potential
to be optimized for drug use. These nanoparticles can easily pass membranes
throughout tissues thanks to their small size. As mentioned, exosomes play an
essential role in cell-to-cell communication, and many studies attribute the
therapeutic effects of stem cells to exosomes, which enable their miRNA
transmission. For these reasons, there has been a surge of recent research on
these microvesicles as potential treatments for many diseases. Similarly, many
scientists have been researching the effect of exosomes on neurodegenerative
diseases, particularly AD and PD.

Compared to other exosomes, there has been a focus on MSC-EXOs in papers
regarding Parkinson’s disease. MSCs are a particularly fitting choice of stem cells
due to their easy harvest and high potential to act as a cell-based therapeutic
agent for tissue regeneration. They do not carry many of the ethical sourcing
issues that other stem cells, such as iPSCs, carry. Another factor is the
therapeutic effect that MSCs have on PD. Specifically, their ability to
differentiate into dopaminergic neurons and secrete neurotrophic substances has
made them an exciting topic in PD research.

The combination of the relatively newfound focus on exosomes and the
advantages that MSCs have clearly shown over other stem cells has caused an
increase in research on the therapeutic effects of MSC-EXOs on PD. Even though
studies on this topic have shown the positive effects that MSC-EXOs have on PD,
there remains a need for more research and attention. MSC-EXOs have been
reported to be highly effective and accessible for treating PD in several rat
models. However, follow-up studies that include humans and large animal models
are lacking and needed. Furthermore, randomized controlled studies need to be
carried out to verify the therapeutic benefits of exosomes in this regard.
Some shortcomings of exosome therapy should be discussed as well.

The practical use of exosome-based therapy is currently difficult and
constrained by a number of problems. First, the length of time needed to create a
large batch of exosomes restricts their effectiveness and clinical use potential.
Second, because exosomes include diverse bioactive components, the target tissue
may experience unexpected side effects. Third, depending on the characteristics
of the donor cells, these components may carry a risk of tumor
growth and immunogenic effects. Finally, the therapeutic implications
of exosome formation under intervention during disease are uncertain because of
their intricate structure. Stem cell-derived exosomes are currently the subject of
preliminary research; however, their pharmaceutical use is hampered by the
varied composition and functional activity of spontaneously produced exosomes.
It has been revealed that exosomes derived under different conditions carry
different functional factors. Exosomes’ precise function is likewise largely
unknown. [Yua18]. There remains a critical need for research on the precise role
and components of exosomes for progress in studies regarding therapeutic
delivery for different diseases.

5 Conclusion

Due to their ease of harvest and high potential in tissue regeneration and
remodeling, MSCs have become highly anticipated stem cells and are being studied
in a variety of cell therapies. They carry the potential to differentiate into
several different cell lines, including dopaminergic cells, whose deterioration is a
prominent issue in PD patients. MSCs have also been reported to secrete
neurotrophic and anti-inflammatory substances after transplantation, giving them
strong neuroprotective abilities.

Midbrain dopaminergic neurons express miR-133b, which regulates the
production of tyrosine hydroxylase and the dopamine transporter in patients with
PD. A study has shown that MSCs interact with brain parenchymal cells via
exosome-mediated miR-133b transfer in order to promote neurite outgrowth and
functional treatment. MSC-EXOs are also able to successfully cross the BBB, a
barrier that many drugs targeting PD fail to cross, and reach dopaminergic
neurons in the substantia nigra. MSC-EXO therapy in rats with PD has increased
dopamine and its metabolites in their striatum.

References
[Aar05] D. Aarsland. A systematic review of prevalence studies of dementia in
parkinson’s disease. Movement Disorders : Official Journal of the
Movement Disorder Society, 20, 1255–1263, 2005.
[Abu22] A. H. Abusrair. Tremor in parkinson’s disease: From pathophysiology to
advanced therapies. Tremor and Other Hyperkinetic Movements (New
York, N.Y.), 12, 29. https://doi.org/10.5334/tohm.712, 2022.

[Alv04] G. Alves. Cigarette smoking in parkinson’s disease: Influence on disease
progression. Movement Disorders : Official Journal of the Movement
Disorder Society, 19(9), 1087–1092.
https://doi.org/10.1002/mds.20117, 2004.
[Ant08] A. Antonini. Comt inhibition with tolcapone in the treatment algorithm
of patients with parkinson’s disease (pd): Relevance for motor and non-
motor features. Neuropsychiatric Disease and Treatment, 4(1), 1–9.
https://doi.org/10.2147/ndt.s2404, 2008.
[Arb23] D. Arbabha. Application of stem cells and adipose-derived stem cell
exosomes on dermal wound healing. CellR4, 11(e3402).
https://doi.org/10.32113/cellr4202373402, 2023.
[Ari99] K. Arima. Cellular co-localization of phosphorylated tau- and
nacp/alpha-synuclein-epitopes in lewy bodies in sporadic parkinson’s
disease and in dementia with lewy bodies. Brain Research, 843(1–2),
53–61. https://doi.org/10.1016/s0006-8993(99)01848-x, 1999.
[Bac18] L. Bacakova. Stem cells: Their source, potency and use in regenerative
therapies with focus on adipose-derived stem cells—a review.
Biotechnology Advances, 36(4), 1111–1126.
https://doi.org/10.1016/j.biotechadv.2018.03.011, 2018.
[Baj10] A. Bajaj. Parkinson’s disease and cancer risk: A systematic review and
meta-analysis. Cancer Causes & Control : CCC, 21(5), 697–707.
https://doi.org/10.1007/s10552-009-9497-6, 2010.
[Bar69] A. Barbeau. L-dopa therapy in parkinson’s disease: A critical review of
nine years’ experience. Canadian Medical Association Journal, 101(13),
59–68, 1969.
[Bei14] J. M. Beitz. Parkinson’s disease: A review. Frontiers in Bioscience (Scholar
Edition), 6(1), 65–74. https://doi.org/10.2741/s415, 2014.
[Ber01] A. Berardelli. Pathophysiology of bradykinesia in parkinson’s disease.
Brain : A Journal of Neurology, 124(Pt 11), 2131–2146.
https://doi.org/10.1093/brain/124.11.2131, 2001.
[Bla16] J. W. Blaszczyk. Parkinson’s disease and neurodegeneration: Gaba-
collapse hypothesis. Frontiers in Neuroscience, 10, 269.
https://doi.org/10.3389/fnins.2016.00269, 2016.
[Bol20] M. Bologna. Pathophysiology of rigidity in parkinson’s disease: Another
step forward. Clinical Neurophysiology : Official Journal of the
International Federation of Clinical Neurophysiology, 131(8), 1971–
1972. https://doi.org/10.1016/j.clinph.2020.05.013, 2020.
[Bou08] G. Bouchez. Partial recovery of dopaminergic pathway after graft of
adult mesenchymal stem cells in a rat model of parkinson’s disease.
Neurochemistry International, 52(7), 1332–1342.
https://doi.org/10.1016/j.neuint.2008.02.003, 2008.
[Bre16] C. B. Breckenridge. Association between parkinson’s disease and
cigarette smoking, rural living, well-water consumption, farming and
pesticide use: Systematic review and meta-analysis. PloS One, 11(4),
e0151841. https://doi.org/10.1371/journal.pone.0151841, 2016.
[Bro00] D. J. Brooks. Dopamine agonists: Their role in the treatment of
parkinson’s disease. Journal of Neurology, Neurosurgery, and Psychiatry,
68(6), 685–689. https://doi.org/10.1136/jnnp.68.6.685, 2000.
[Car14] A. Cardinale. Protein misfolding and neurodegenerative diseases.
International Journal of Cell Biology, 2014, 217371.
https://doi.org/10.1155/2014/217371, 2014.
[Cha21] R. J. Chandler. Modelling the functional genomics of parkinson’s disease
in caenorhabditis elegans: Lrrk2 and beyond. Bioscience Reports,
41(9). https://doi.org/10.1042/BSR20203672, 2021.

[Che10a] H. Chen. Smoking duration, intensity, and risk of parkinson disease.
Neurology, 74(11), 878–884.
https://doi.org/10.1212/WNL.0b013e3181d55f38, 2010.
[Che10b] T. S. Chen. Mesenchymal stem cell secretes microparticles enriched in
pre-micrornas. Nucleic Acids Research, 38(1), 215–224.
https://doi.org/10.1093/nar/gkp857, 2010.
[Che14] J. J. Chen. Anxiety in parkinson’s disease: Identification and management.
Therapeutic Advances in Neurological Disorders, 7(1), 52–59.
https://doi.org/10.1177/1756285613495723, 2014.
[Che20a] H. X. Chen. Exosomes derived from mesenchymal stem cells repair a
parkinson’s disease model by inducing autophagy. Cell Death & Disease,
11(4), 288. https://doi.org/10.1038/s41419-020-2473-5, 2020.
[Che20b] Z. Chen. Autonomic dysfunction in parkinson’s disease: Implications for
pathophysiology, diagnosis, and treatment. Neurobiology of Disease, 134,
104700. https://doi.org/10.1016/j.nbd.2019.104700, 2020.
[Cou12] P. G. Coune. Parkinson’s disease: Gene therapies. Cold Spring Harbor
Perspectives in Medicine, 2(4), a009431.
https://doi.org/10.1101/cshperspect.a009431, 2012.
[Din07] D. C. Ding. The role of endothelial progenitor cells in ischemic cerebral
and heart diseases. Cell Transplantation, 16(3), 273–284.
https://doi.org/10.3727/000000007783464777, 2007.
[Din11] D. C. Ding. Mesenchymal stem cells. Cell Transplantation, 20(1), 5–14.
https://doi.org/10.3727/096368910X, 2011.
[Dre10] J. L. Dreyer. New insights into the roles of micrornas in drug
addiction and neuroplasticity. Genome Medicine, 2(12), 92.
https://doi.org/10.1186/gm213, 2010.
[dRV16] J. P. de Rivero Vaccari. Exosome-mediated inflammasome signaling after
central nervous system injury. Journal of Neurochemistry, 136 Suppl 1(0
1), 39–48. https://doi.org/10.1111/jnc.13036, 2016.
[Dur04] F. Durif. Clozapine improves dyskinesias in parkinson disease: A
double-blind, placebo-controlled study. Neurology, 62(3), 381–388.
https://doi.org/10.1212/01.wnl.0000110317.52453.6c, 2004.
[Elb02] A. Elbaz. Risk tables for parkinsonism and parkinson’s disease. Journal
of Clinical Epidemiology, 55(1), 25–31. https://doi.org/10.1016/s0895-
4356(01)00425-5, 2002.
[Elb16] A. Elbaz. Epidemiology of parkinson’s disease. Revue Neurologique,
172(1), 14–26. https://doi.org/10.1016/j.neurol.2015.09.012, 2016.
[Fio08] R. Fiore. Microrna function in neuronal development, plasticity and
disease. Biochimica et Biophysica Acta, 1779(8), 471–478.
https://doi.org/10.1016/j.bbagrm.2007.12.006, 2008.
[Gal05] J. P. Galanaud. Cigarette smoking and parkinson’s disease: A case-
control study in a population characterized by a high prevalence of
pesticide exposure. Movement Disorders : Official Journal of the
Movement Disorder Society, 20(2), 181–189.
https://doi.org/10.1002/mds.20307, 2005.
[Ger89] D. C. German. Midbrain dopaminergic cell loss in parkinson’s disease:
Computer visualization. Annals of Neurology, 26(4), 507–514.
https://doi.org/10.1002/ana.410260403, 1989.
[Gur21] S. Gurunathan. A comprehensive review on factors influences
biogenesis, functions, therapeutic and clinical implications of
exosomes. International Journal of Nanomedicine, 16, 1281–1312.
https://doi.org/10.2147/IJN.S291956, 2021.
[Han17] H. A. Hanagasi. Dementia in parkinson’s disease. Journal of the
Neurological Sciences, 374, 26–31.
https://doi.org/10.1016/j.jns.2017.01.012, 2017.
[Har13] C. V. Harding. Exosomes: Looking back three decades and into the
future. The Journal of Cell Biology, 200(4), 367–371.
https://doi.org/10.1083/jcb.201212113, 2013.
[Hau99] R. A. Hauser. Long-term evaluation of bilateral fetal nigral
transplantation in parkinson disease. Archives of Neurology, 56(2), 179–
187. https://doi.org/10.1001/archneur.56.2.179, 1999.
[Hep16] D. H. Hepp. Distribution and load of amyloid-β pathology in parkinson
disease and dementia with lewy bodies. Journal of Neuropathology and
Experimental Neurology, 75(10), 936–945.
https://doi.org/10.1093/jnen/nlw070, 2016.
[Her02] M. A. Hernán. A meta-analysis of coffee drinking, cigarette smoking,
and the risk of parkinson’s disease. Annals of Neurology, 52(3), 276–284.
https://doi.org/10.1002/ana.10277, 2002.
[Her16] T. M. Herrington. Mechanisms of deep brain stimulation. Journal of
Neurophysiology, 115(1), 19–38.
https://doi.org/10.1152/jn.00281.2015, 2016.
[Huo17] P. Huot. Serotonergic approaches in parkinson’s disease: Translational
perspectives, an update. ACS Chemical Neuroscience, 8(5), 973–986.
https://doi.org/10.1021/acschemneuro.6b00440, 2017.
[Hur13] M. J. Hurley. Parkinson’s disease is associated with altered expression
of cav1 channels and calcium-binding proteins. Brain : A Journal of
Neurology, 136(Pt 7), 2077–2097.
https://doi.org/10.1093/brain/awt134, 2013.
[Iba17] C. F. Ibanez. Biology of gdnf and its receptors—relevance for disorders
of the central nervous system. Neurobiology of Disease, 97(Pt B), 80–89.
https://doi.org/10.1016/j.nbd.2016.01.021, 2017.
[Jan08] J. Jankovic. Clinical features and diagnosis. Journal of Neurology,
Neurosurgery, and Psychiatry, 79(4), 368–376.
https://doi.org/10.1136/jnnp.2007.131045, 2008.
[Joh10] T. V. Johnson. Neuroprotective effects of intravitreal mesenchymal stem
cell transplantation in experimental glaucoma. Investigative
Ophthalmology & Visual Science, 51(4), 2051–2059.
https://doi.org/10.1167/iovs.09-4509, 2010.
[Kal14] A. Kalani. Exosomes: Mediators of neurodegeneration, neuroprotection
and therapeutics. Molecular Neurobiology, 49(1), 590–600.
https://doi.org/10.1007/s12035-013-8544-1, 2014.
[Kal20] R. Kalluri. The biology, function, and biomedical applications of
exosomes. Science (New York, N.Y.), 367(6478).
https://doi.org/10.1126/science.aau6977, 2020.
[Kha19] M. Khademizadeh. Differentiation of adult human mesenchymal stem
cells into dopaminergic neurons. Research in Pharmaceutical Sciences,
14(3), 209–215. https://doi.org/10.4103/1735-5362.258487, 2019.
[Kim11] H. J. Kim. Stem cell potential in parkinson’s disease and molecular
factors for the generation of dopamine neurons. Biochimica et
Biophysica Acta, 1812(1), 1–11.
https://doi.org/10.1016/j.bbadis.2010.08.006, 2011.
[Kim18] S. M. Kim. Gait patterns in parkinson’s disease with or without
cognitive impairment. Dementia and Neurocognitive Disorders, 17(2),
57–65. https://doi.org/10.12779/dnd.2018.17.2.57, 2018.
[Kol13] G. Kolios. Introduction to stem cells and regenerative medicine.
Respiration; International Review of Thoracic Diseases, 85(1), 3–10.
https://doi.org/10.1159/000345615, 2013.
[Lai15] R. C. Lai. Mesenchymal stem cell exosomes. Seminars in
Cell & Developmental Biology, 40, 82–88.
https://doi.org/10.1016/j.semcdb.2015.03.001, 2015.
[lee88] A. J. Lees. The nighttime problems of parkinson’s disease. Clinical
Neuropharmacology, 11(6), 512–519.
https://doi.org/10.1097/00002826-198812000-00004, 1988.
[Li21] Q. Li. Exosomes derived from mir-188-3p-modified adipose-derived
mesenchymal stem cells protect parkinson’s disease. Molecular
Therapy. Nucleic Acids, 23, 1334–1344.
https://doi.org/10.1016/j.omtn.2021.01.022, 2021.
[Lim10] P. K. Lim. Neurogenesis: Role for micrornas and mesenchymal stem cells
in pathological states. Current Medicinal Chemistry, 17(20), 2159–2167.
https://doi.org/10.2174/092986710791299894, 2010.
[Lin11] O. Lindvall. Cell therapeutics in parkinson’s disease. Neurotherapeutics :
The Journal of the American Society for Experimental
NeuroTherapeutics, 8(4), 539–548. https://doi.org/10.1007/s13311-
011-0069-6, 2011.
[Liu15] A. K. L. Liu. Nucleus basalis of meynert revisited: Anatomy, history and
differential involvement in alzheimer’s and parkinson’s disease. Acta
Neuropathologica, 129(4), 527–540. https://doi.org/10.1007/s00401-
015-1392-5, 2015.
[Liu22] S. F. Liu. Update on the application of mesenchymal stem cell-derived
exosomes in the treatment of parkinson’s disease: A systematic review.
Frontiers in Neurology, 13, 950715.
https://doi.org/10.3389/fneur.2022.950715, 2022.
[Mai17] P. Maiti. Current understanding of the molecular mechanisms in
parkinson’s disease: Targets for potential treatments. Translational
Neurodegeneration, 6, 28. https://doi.org/10.1186/s40035-017-0099-z,
2017.
[Mar12] O. Marques. Alpha-synuclein: From secretion to dysfunction and death.
Cell Death & Disease, 3(7), e350. https://doi.org/10.1038/cddis.2012.94,
2012.
[Neund] Neuromics. Exosome isolation methods.
https://www.neuromics.com/exosome-isolation-methods, n.d.

[niand] nia.nih.gov. What is lewy body dementia? causes, symptoms, and
treatments. https://www.nia.nih.gov/health/what-lewy-body-dementia-
causes-symptoms-and-treatments, n.d.
[Off07] D. Offen. Intrastriatal transplantation of mouse bone marrow-derived
stem cells improves motor behavior in a mouse model of parkinson’s
disease. Journal of Neural Transmission. Supplementum, 72, 133–143.
https://doi.org/10.1007/978-3-211-73574-9_16, 2007.
[Oh16] S. H. Oh. Mesenchymal stem cells inhibit transmission of α-synuclein by
modulating clathrin-mediated endocytosis in a parkinsonian model.
Cell Reports, 14(4), 835–849.
https://doi.org/10.1016/j.celrep.2015.12.075, 2016.
[Ois09] K. Oishi. Differential ability of somatic stem cells. Cell Transplantation,
18(5), 581–589. https://doi.org/10.1177/096368970901805614, 2009.

[Oku12] M. S. Okun. Deep-brain stimulation for parkinson’s disease. The New
England Journal of Medicine, 367(16), 1529–1538.
https://doi.org/10.1056/NEJMct1208070, 2012.
[Par12] W. M. Pardridge. Drug transport across the blood-brain barrier. Journal
of Cerebral Blood Flow and Metabolism : Official Journal of the
International Society of Cerebral Blood Flow and Metabolism, 32(11),
1959–1972. https://doi.org/10.1038/jcbfm.2012.126, 2012.
[Pen10] S. Pennington. The cause of death in idiopathic parkinson’s disease.
Parkinsonism & Related Disorders, 16(7), 434–437.
https://doi.org/10.1016/j.parkreldis.2010.04.010, 2010.
[Pfe16] R. F. Pfeiffer. Non-motor symptoms in parkinson’s disease. Parkinsonism
& Related Disorders, 22 Suppl 1, S119-122.
https://doi.org/10.1016/j.parkreldis.2015.09.004, 2016.
[Pos11] I. J. Posada. Mortality from parkinson’s disease: A population-based
prospective study (nedices). Movement Disorders : Official Journal of the
Movement Disorder Society, 26(14), 2522–2529.
https://doi.org/10.1002/mds.23921, 2011.
[Pri14] T. Pringsheim. The prevalence of parkinson’s disease: A systematic
review and meta-analysis. Movement Disorders : Official Journal of the
Movement Disorder Society, 29(13), 1583–1590.
https://doi.org/10.1002/mds.25945, 2014.
[Reo13] R. W. Y. Reo. Mesenchymal stem cell: An efficient mass producer of
exosomes for drug delivery. Advanced Drug Delivery Reviews, 65(3),
336–341. https://doi.org/10.1016/j.addr.2012.07.001, 2013.

[Rie04] P. Riederer. Clinical applications of mao-inhibitors. Current Medicinal
Chemistry, 11(15), 2033–2043.
https://doi.org/10.2174/0929867043364775, 2004.
[Rit07] B. Ritz. Pooled analysis of tobacco use and risk of parkinson
disease. Archives of Neurology, 64(7), 990–997.
https://doi.org/10.1001/archneur.64.7.990, 2007.
[Roy04] L. Roybon. Stem cell therapy for parkinson’s disease: Where do we
stand? Cell and Tissue Research, 318(1), 261–273.
https://doi.org/10.1007/s00441-004-0946-y, 2004.
[Ser04] M. Seruya. Clonal population of adult stem cells: Life span and
differentiation potential. Cell Transplantation, 13(2), 93–101.
https://doi.org/10.3727/000000004773301762, 2004.
[Sez19] M. Sezgin. Parkinson’s disease dementia and lewy body disease.
Seminars in Neurology, 39(2), 274–282. https://doi.org/10.1055/s-
0039-1678579, 2019.
[SM08] S. Schraen-Maschke. Tau as a biomarker of neurodegenerative diseases.
Biomarkers in Medicine, 2(4), 363–384.
https://doi.org/10.2217/17520363.2.4.363, 2008.
[Sma07] N. R. Smalheiser. Exosomal transfer of proteins and rnas at synapses in
the nervous system. Biology Direct, 2, 35. https://doi.org/10.1186/1745-
6150-2-35, 2007.
[Ste12] L. Stefanis. α-synuclein in parkinson’s disease. Cold Spring
Harbor Perspectives in Medicine, 2(2), a009399.
https://doi.org/10.1101/cshperspect.a009399, 2012.
[Ste20] A. Stefani. Sleep in parkinson’s disease. Neuropsychopharmacology :
Official Publication of the American College of
Neuropsychopharmacology, 45(1), 121–128.
https://doi.org/10.1038/s41386-019-0448-y, 2020.
[Sub13] S. R. Subramaniam. Mitochondrial dysfunction and oxidative stress in
parkinson’s disease. Progress in Neurobiology, 106–107, 17–32.
https://doi.org/10.1016/j.pneurobio.2013.04.004, 2013.
[Sä08] K. Sääksjärvi. Prospective study of coffee consumption and risk of
parkinson’s disease. European Journal of Clinical Nutrition, 62(7), 908–
915. https://doi.org/10.1038/sj.ejcn.1602788, 2008.
[Tan99] C. M. Tanner. Parkinson disease in twins: An etiologic study. JAMA,
281(4), 341–346. https://doi.org/10.1001/jama.281.4.341, 1999.

[Tar17] A. Tarakad. Anosmia and ageusia in parkinson’s disease.
International Review of Neurobiology, 133, 541–556.
https://doi.org/10.1016/bs.irn.2017.05.028, 2017.
[Tha08] E. L. Thacker. Familial aggregation of parkinson’s disease: A
meta-analysis. Movement Disorders : Official Journal of the Movement
Disorder Society, 23(8), 1174–1183.
https://doi.org/10.1002/mds.22067, 2008.
[Tys17] O. B. Tysnes. Epidemiology of parkinson’s disease. Journal of Neural
Transmission (Vienna, Austria : 1996), 124(8), 901–905.
https://doi.org/10.1007/s00702-017-1686-y, 2017.
[vdM12] M. van der Mark. Is pesticide use related to parkinson disease? some
clues to heterogeneity in study results. Environmental Health
Perspectives, 120(3), 340–347. https://doi.org/10.1289/ehp.1103881,
2012.
[Ven17] K. Venkatesh. Mesenchymal stem cells as a source of dopaminergic
neurons: A potential cell based therapy for parkinson’s disease. Current
Stem Cell Research & Therapy, 12(4), 326–347.
https://doi.org/10.2174/1574888X12666161114122059, 2017.
[Ver15] A. Verstraeten. Progress in unraveling the genetic etiology of parkinson
disease in a genomic era. Trends in Genetics : TIG, 31(3),
140–149. https://doi.org/10.1016/j.tig.2015.01.004, 2015.
[Wei22] D. Weintraub. The neuropsychiatry of parkinson’s disease: Advances
and challenges. The Lancet. Neurology, 21(1), 89–102.
https://doi.org/10.1016/S1474-4422(21)00330-6, 2022.

[Wir11a] K. Wirdefeldt. Epidemiology and etiology of parkinson’s disease: A
review of the evidence. European Journal of Epidemiology, 26 Suppl 1, S1-
58. https://doi.org/10.1007/s10654-011-9581-6, 2011.
[Wir11b] K. Wirdefeldt. Heritability of parkinson disease in swedish twins: A
longitudinal study. Neurobiology of Aging, 32(10), 1923.e1-8.
https://doi.org/10.1016/j.neurobiolaging.2011.02.017, 2011.
[Woo04] G. F. Wooten. Are men at greater risk for parkinson’s disease than women?
Journal of Neurology, Neurosurgery, and Psychiatry, 75(4), 637–639.
https://doi.org/10.1136/jnnp.2003.020982, 2004.
[Xin12] H. Xin. Exosome-mediated transfer of mir-133b from multipotent
mesenchymal stromal cells to neural cells contributes to neurite
outgrowth. Stem Cells (Dayton, Ohio), 30(7), 1556–1564.
https://doi.org/10.1002/stem.1129, 2012.
[Xin13] H. Xin. Systemic administration of exosomes released from
mesenchymal stromal cells promote functional recovery and
neurovascular plasticity after stroke in rats. Journal of Cerebral Blood
Flow and Metabolism : Official Journal of the International Society of
Cerebral Blood Flow and Metabolism, 33(11), 1711–1715.
https://doi.org/10.1038/jcbfm.2013.152, 2013.
[Xu14] L. Xu. An emerging role of park2 in cancer. Journal of Molecular Medicine
(Berlin, Germany), 92(1), 31–42. https://doi.org/10.1007/s00109-013-
1107-0, 2014.
[Yu11] Y. M. Yu. Microrna mir-133b is essential for functional recovery after
spinal cord injury in adult zebrafish. The European Journal of
Neuroscience, 33(9), 1587–1597.
https://doi.org/10.1111/j.1460-9568.2011.07643.x, 2011.
[Yu14] B. Yu. Exosomes derived from mesenchymal stem cells. International
Journal of Molecular Sciences, 15(3), 4142–4157.
https://doi.org/10.3390/ijms15034142, 2014.
[Yua18] Y. Yuan. Stem cell-derived exosome in cardiovascular diseases: Macro
roles of micro particles. Frontiers in Pharmacology, 9, 547.
https://doi.org/10.3389/fphar.2018.00547, 2018.
[Zha23] K. Zhang. Stem cell-derived exosome versus stem cell therapy. Nature
Reviews Bioengineering, 1–2. https://doi.org/10.1038/s44222-023-
00064-2, 2023.
[Zho23] Z. Zhou. Implications of crosstalk between exosome-mediated
ferroptosis and diseases for pathogenesis and treatment. Cells, 12(2).
https://doi.org/10.3390/cells12020311, 2023.
[Zhu16] M. Zhu. Sensory symptoms in parkinson’s disease: Clinical features,
pathophysiology, and treatment. Journal of Neuroscience Research, 94(8),
685–692. https://doi.org/10.1002/jnr.23729, 2016.

Life Challenges Faced By Chinese Workers In Africa
Hengyi Chen∗
October 13, 2023

Abstract
The presence of Chinese workers in Africa has grown significantly in
recent years, driven by large-scale infrastructure projects and economic
initiatives. While attention has been devoted to the broader landscape
of Chinese foreign enterprises operating abroad, there is a notable lack
of comprehensive research into the specific challenges faced by Chinese
workers in Africa. This essay delves into the multifaceted challenges
experienced by Chinese workers in Africa, focusing on personal safety, health
concerns, and cultural conflicts. This essay proposes several potential
solutions to address these challenges. Companies can incorporate crucial
provisions in employment contracts to ensure the welfare and quality of
life of their workers. Educational sessions for Chinese workers in Africa
can help bridge cultural gaps and enhance safety. Additionally, business
organizations such as chambers of commerce can play a collaborative role
in addressing these challenges and mediating conflicts.

1 Introduction
Chinese foreign enterprises operating abroad have certainly garnered substantial
attention from researchers and the media. This reporting typically focuses on
huge, state-owned companies and their strategic impact.1
Chinese state-owned businesses often engage in large-scale infrastructure
projects, resource extraction, and other ventures that have the potential to
shape the economic landscape of host countries. For example, China has
invested in Ethiopia’s railway system. The Chinese Export-Import Bank provided
85% of the funding for the $475 million Addis Ababa Light Rail, which serves
4 million of the city’s residents.2
∗ Advised by: Dr. James Sundquist of Yale University
1 For example, see Haydn Shaughnessy, “Chinese Companies Are Transforming Business—and the West Is Struggling To Keep Up”
2 Mariama Sow, “Figures of the week: Chinese investment in Africa”
In contrast to the focus on investment, it is surprising that researchers have
published relatively few essays about the circumstances of Chinese workers in
Africa. Given the increasing presence of Chinese labor in various African
countries under the influence of transnational projects, such as the Belt and Road
Initiative, Chinese workers have become a huge labor sector abroad. As of 2019,
there were officially about one million Chinese workers employed overseas, with
many additional Chinese citizens working overseas on tourist visas or in other
unofficial capacities.3
When attention is paid to labor issues in Africa, it usually frames Chinese
labor as competing with Africans for jobs or focuses on African workers under
Chinese managers. For example, U.S. politicians, from Hillary Clinton to Rex
Tillerson, have criticized China for not hiring enough African workers.4
Although these angles are worth exploring, so are the challenges faced by Chinese
workers. The U.S. perspective still portrays China as a single entity, without
sympathy for the difficult situations many Chinese workers find themselves in.
One of the rare instances in which Chinese individuals are placed at center
stage is Howard French’s work, China’s Second Continent. French introduces
the pattern of the migration waves of Chinese workers. The book captures these
individuals’ motivations, actions, and experiences in Africa.5 French provides
numerous interesting anecdotes about Chinese entrepreneurs and the
opportunities they see in Africa. The Asian arrivals have also faced substantial
hurdles due to language barriers in communicating with local administrations.
French’s book, however, is full of examples of persistent Chinese migrants who
have successfully explored possibilities in this new frontier.6 French highlights
the spirit of these individuals and their ability to identify and capitalize on
opportunities in Africa’s rapidly evolving economic landscape. He discusses the
sectors they invest in, such as infrastructure, manufacturing, and retail, and
how their activities impact local economies.
French’s account, while illuminating, contains three important gaps, and my
view diverges from his on each. First, French’s book primarily
focuses on entrepreneurs and does not explore common workers in the
same level of detail. Chinese workers in Africa face a wide range of challenges
that differ significantly from those of entrepreneurs. These challenges may in-
clude issues related to labor rights, working conditions, cultural adaptation,
discrimination, and more. Neglecting to explore these aspects leaves a signifi-
cant gap in the narrative, as the experiences of workers are a vital part of the
broader story of Chinese engagement in Africa. Second, his use of anecdotes
shows the diversity of experiences but makes it challenging to understand the
common themes of Chinese experiences in Africa. Anecdotes by their nature
are selective and specific. They highlight individual stories or instances, but
these may not be representative of the broader population. If the book relies
heavily on anecdotes, it can give readers a skewed view of the overall experi-
3 Jennifer Hillman and Alex Tippett, “Who Built That? Labor and the Belt and Road
Initiative”
4 Jenni Marsh, “Employed By China”
5 IPI, “Africa: China’s Second Continent”
6 Chris Hartman, “‘China’s Second Continent’ tells the fascinating yet alarming story of China’s economic colonization of Africa”

ences of Chinese workers. It may focus on exceptional cases or outliers, making
it difficult to discern the typical challenges faced by the majority. Finally,
nearly a decade has passed since French’s book was published, and the
landscape of China-Africa relations has inevitably evolved. Trends in labor
migration, working conditions, and interactions between Chinese workers
and the local populace may have changed, which calls for a refreshed
assessment of the circumstances.
In this essay, I will introduce the core challenges faced by Chinese
workers in Africa: personal safety, staying healthy, and cultural conflict. I show
that these three factors are the ones that most concern Chinese
workers. African food systems often suffer from inefficiencies, poor infrastructure,
and post-harvest losses. These issues can result in irregular and insufficient
food supplies, leading to food insecurity for Chinese workers. Inadequate food
safety measures can expose Chinese workers to health risks, since contaminated or
unsafe food can cause illnesses and undermine their well-being. Chinese
workers may also face substandard housing conditions, including overcrowding and
lack of basic amenities. I then propose potential solutions to these living problems,
such as improved employment contracts, education sessions for
Chinese workers, and regular communication with local governments to help
both sides resolve problems.

2 Personal Safety
What is the primary challenge that Chinese workers first encounter in
Africa? Although research examines many dimensions, safety emerges
as one of the most central concerns. The Chinese
Academy of Social Sciences notes that 84% of China’s Belt and Road invest-
ments are in medium to high-risk countries. Three hundred and fifty serious
security incidents involving Chinese firms occurred between 2015 and 2017,
from kidnappings and terror attacks to anti-Chinese violence, according to
China’s Ministry of State Security.7 Ongoing conflicts, political uncertainties,
and economic trials in certain African nations have cultivated an atmosphere
characterized by escalated and unforeseeable risks. Kate Bartlett, a journalist
based in Africa, reported that the killing of nine Chinese gold mine workers in
the conflict-ridden Central African Republic in March 2023 highlighted the risks
some projects face in volatile areas.8 Against this complex backdrop, Chinese
workers, frequently engaged in critical endeavors such as infrastructure devel-
opment and resource extraction, have encountered a multitude of challenges
directly tied to safety apprehensions, labor disagreements, and interruptions
to their undertakings stemming from local upheavals. A report last year by
the U.K.-based Business and Human Rights Resource Center found 181 human
rights allegations connected to Chinese investments in Africa between 2013 and
7 Paul Nantulya, “Chinese Security Firms Spread along the African Belt and Road”
8 Kate Bartlett, “How Chinese Private Security Companies in Africa Differ From Russia’s”

2020, with the highest number of incidents in Uganda, Kenya, Zimbabwe, and
the Democratic Republic of Congo.9
These challenges stem from a confluence of factors. The precarious nature of
the political climate in some regions not only jeopardizes the stability of existing
projects but also undermines the overall security of Chinese workers. Moreover,
the intricate web of economic hardships prevailing in certain African countries
can exacerbate the challenges faced by Chinese workers. Financial constraints
and resource limitations in these regions can impede the timely execution of
projects, thereby increasing the exposure of workers to potential hazards and
uncertainties. In addition, most Chinese companies meet only the basic standards
of local laws, unlike some Western transnational enterprises and
mature local companies.10
Military takeovers are a further source of instability and danger. Accord-
ing to Cobus van Staden, senior researcher at the South African Institute of
International Affairs, “One of the contributing factors in all of this is the per-
ception you see in African countries that Chinese people keep lots of cash on
hand,” making Chinese workers favored targets for kidnappers.11 Chad, Mali,
Guinea, Sudan, and more recently Burkina Faso have all witnessed successful
military takeovers. The aftermath of these political shifts has heightened concerns
about the safety and security of Chinese workers operating within these
regions.12 These abrupt leadership changes, often accompanied by civil unrest
and power struggles, introduce an added layer of complexity to an already intri-
cate situation. After military takeovers, institutions and governance structures
can falter, leading to instability. This can result in higher lawlessness, weakened
law enforcement, and increased criminal activities, all of which jeopardize
the safety of foreign workers, including the Chinese workforce. According to
the United Nations, xenophobic violence in South Africa in 2008 resulted in the
deaths of over 60 people and contributed to the displacement of at least 100,000.13
Although an estimated 500,000 or more Chinese citizens live in South Africa,
violent crime is increasingly common, which casts a shadow over their lives.14
Wang Wei, who has been employed at a Chinese company in Johannesburg
for half a decade, emphasized the heightened caution he has exercised when
venturing outside, particularly in the aftermath of the tragic murder of Zhong
Zhiwei. Zhong Zhiwei, the former president of the Township Association of
Shandong Province in South Africa, and his wife were tragically gunned down
in broad daylight in Johannesburg on August 13. Wang Wei is among the par-
ticipants in a collective effort led by a local Chinese association, urging the
South African government to swiftly apprehend and bring to justice the perpe-
trators responsible for the couple’s killing. “This is the least I can contribute,”
9 Kate Bartlett, “Are Rights Abuses Tarnishing China’s Image in Africa?”
10 Wenjie Zhao, “Research on the rights and interests protection of local workers in Chinese-
funded enterprises in Africa – taking Zimbabwe as an example”
11 Kate Bartlett, “Chinese Working in Africa Face Threat of Kidnapping”
12 Reuters, “Recent coups in West and Central Africa”
13 United Nations, “South Africa: UN experts condemn xenophobic violence and racial discrimination against foreign nationals”
14 Ufrieda Ho, “Chinese in South Africa learn to live with violence”

he remarked.15
Moreover, the prevalence of civil conflicts between local governments and var-
ious military factions further amplifies concerns. This prompts many enterprises
to enlist security guards to safeguard both their assets and their workforce. According
to the Private Security Industry Regulatory Authority (PSIRA), there
were 9,539 registered security companies in South Africa in 2022. There are
also almost 2.5 million registered security officers.16 The need for security guards
to protect both properties and employees becomes a strategic necessity in an
environment characterized by uncertainty and potential risks. Phoenix International,
a Chinese think tank with strong state-owned enterprise (SOE) ties,
reports that no more than twenty Chinese security firms operate
overseas to protect SOEs and other Chinese interests. By 2013, these firms employed
around 3,200 personnel, according to the Germany-based Mercator Institute
for China Studies. That is more than the number of United Nations (UN) peacekeepers
China furnishes, a figure that stood at 2,534 troops and police as of June 2020.17
In the pursuit of profit maximization, certain enterprises may opt to cut
costs, particularly in areas such as safety measures and security provisions,
inadvertently putting the lives and well-being of their workers at risk. Instances
of accidents, injuries, or even loss of life can have profound effects on workforce
morale, productivity, and reputation. The negative impact on the enterprise’s
image, both locally and internationally, can overshadow any initial cost savings,
leading to the potential loss of business opportunities and investor confidence.

3 Staying Healthy
Chinese laborers are now more prevalent than ever in Africa’s dynamic panorama
of economic development and infrastructure projects. Beyond the difficulties
posed by safety concerns and operational complexities, a crucial but sometimes
under-appreciated worry looms large: the health risks that these workers encounter
while engaged in their profession. These individuals are exposed to a
variety of health risks, such as infectious diseases and environmental contaminants,
as they work on various projects across the continent. The multifaceted
health risks that Chinese workers in Africa face are examined in this section, along
with the wider implications for project continuity, labor force sustainability, and
the pressing need for coordinated efforts to protect their health and safety.
Chinese immigrants frequently experience a difference in food culture when
arriving in African nations. This difference can be found in taste, ingredients,
and cooking techniques. Their typical eating habits may greatly differ from the
local food, which could cause discomfort and even intestinal problems. Cross-
cultural interaction includes learning new tastes and nutritional practices, but
if handled carelessly, it might pose acute health hazards. Finding familiar meals
15 GlobalTimes, “Chinese in S. Africa fear for safety amid rising murder cases”
16 Wise Move, “Top Security Companies in South Africa — Complete List 2023”
17 Paul Nantulya, “Chinese Security Contractors in Africa”

and maintaining dietary habits can be difficult, leading to physical and mental
exhaustion and impacting employee happiness and performance.
Beyond individual preferences, Africa’s larger food systems have significant
flaws and food safety issues. The ability of these systems to meet the
rising food demand of expanding populations has been strained for a
number of reasons, including extreme weather events, climate change, frequent
outbreaks of pests and diseases, and limited adoption of modern agricultural
technologies.18 This raises concerns about the availability, quality, and
safety of food for Chinese workers as well as the surrounding community.
For example, Africa’s food security challenges are compounded by the war in
Ukraine, by supply chain shortages, conflict, and drought. This has caused many
staple food prices in Africa to increase by an average of almost 25% between
2020 and 2022.19
The interplay of these factors can potentially lead to compromised food
safety. Inadequate food storage, lack of access to clean water for cooking and
washing produce, and challenges in sourcing ingredients that meet both dietary
and safety criteria can all contribute to health risks. Illnesses arising from food-
borne pathogens can disrupt work schedules, impede productivity, and even re-
sult in long-term health issues for workers. More importantly, the shortcomings
in the healthcare infrastructure and the vulnerability of the healthcare systems
are obvious issues that have a serious influence on the safety and well-being of
Chinese employees. When employees become unwell, the lack of hospitals and
medical services across different locations presents a difficult obstacle, empha-
sizing the vulnerabilities they face in their quest for professional prospects.
For those seeking medical care, the lack of hospitals and other medical facil-
ities—which is sometimes exacerbated by a shortage of medical personnel and
resources—creates a catastrophic scenario. Like locals, Chinese laborers en-
counter a dearth of easily accessible healthcare services. The travel time to the
closest medical institution can be rather long, and in rural regions, this might
result in delays in receiving life-saving medical care. The inadequate healthcare
systems in many African nations further exacerbate the situation. The effective-
ness of healthcare supply is compromised by a lack of resources, poor staffing,
and restricted access to necessary drugs and treatments. Chinese workers may
see a sharp contrast when navigating Africa’s healthcare system because they
are used to more developed healthcare systems. When they become ill, the chal-
lenges of communication gaps, new medical procedures, and disparate standards
of care may make an already difficult situation even worse.
In addition to physical illnesses, mental health conditions pose significant
challenges in Africa. It is worth noting that in certain African countries, men-
tal illnesses are sometimes attributed to spiritual causes, resulting in limited
attention and resources being allocated to address them. For these workers,
grappling with illness while navigating a foreign healthcare system can be emo-
tionally and physically taxing. The fear of misdiagnosis, inadequate treatment,
18 WHO, Food Safety
19 Danielle Resnick and Aloysius Uche Ordu, “Africa’s food security challenge”

or lack of proper medical attention can intensify their anxieties. A study from the
Chinese Center for Disease Control and Prevention, using mental health questionnaires
and binary logistic regression models, found that 48.70% of 154 surveyed employees
of Chinese-funded enterprises in Ethiopia had mental health problems.20
Homesickness is a common emotional response
among Chinese workers in Africa. Being separated from their homeland,
families, and the familiar cultural and social context can be emotionally taxing.
They may yearn for the comforts of home, miss important family milestones,
and feel disconnected from their roots. This homesickness can lead to feelings
of sadness, anxiety, and a sense of isolation. Chinese workers often find them-
selves in environments where the local culture and language are different from
their own. Language barriers can hinder effective communication and limit their
ability to form meaningful relationships with locals. This can contribute to a
pervasive sense of loneliness and social isolation, making it difficult for them to
integrate into the local society and establish a support network. The demands
of their work, which often involve long hours and high-pressure responsibilities,
can further exacerbate feelings of loneliness and homesickness. The absence of a
strong social support system can make it challenging to cope with the stressors
inherent to their professional roles.
The issue of an excessive workload coupled with limited leisure time has
emerged as a recurring concern, evident in numerous research papers and survey
questionnaires. Chinese workers find that their rights are not adequately
safeguarded by either the People’s Republic of China’s labor laws or the local
regulations of the host African countries. This deficiency in legal protection can
result in a situation where workers face challenges in asserting their rights and
achieving a work-life balance that promotes their well-being.
Finally, the health of Chinese workers in Africa is at risk from potential
exposure to various diseases, notably Ebola and malaria, which pose significant
health risks within certain regions. Ebola, a highly contagious and frequently
lethal disease, has caused alarm in several regions of Africa. Quarantines, re-
strictions on movement, and increased fear among workers can result from out-
breaks. Employers and employees must bear higher costs as a result of the
necessity to establish stringent health standards and safety measures to stop
the spread of Ebola. Another major threat in many African nations is malaria,
a disease transmitted by mosquitoes. Chinese employees are
particularly vulnerable to infection because many are unfamiliar with how the disease
is transmitted and how to protect against it. Malaria can have a major negative impact
on health and productivity by increasing absenteeism. The prevalence of the
illness highlights the significance of preventative measures including insecticide-
treated bed nets, anti-malarial drugs, and appropriate sanitation to safeguard
employees’ health. In one cross-sectional study of employees of Chinese construction
companies in West Africa, ninety-six participants (37.5%)
contracted malaria more than once within a year.21
20 Shuo Chen, Mingfan Pang, Xiaopeng Qi, Lili Wang, Xiaochun Wang, “Analysis of mental
health status and influencing factors of employees in Chinese-funded enterprises in Ethiopia”
21 Li Zou, Ke Ning, Wenyu Deng, Xufei Zhang, Mohamamad Shahir Sharifi, Junfei Luo, Yin
Bai, Xiner Wang, Wenjuan Zhou, “Study on the use and effectiveness of malaria preventive
measures reported by employees of Chinese construction companies in Western Africa in 2021”

4 Cultural Conflicts
Chinese workers in Africa often encounter significant cultural barriers, especially
when it comes to their working habits with colleagues. These challenges can
have wide-ranging implications for both the Chinese workers themselves and
their interactions with local African colleagues.
In countries such as Ghana, local governments and communities often ex-
ert pressure on Chinese-funded projects to ensure that a significant portion of
the workforce is composed of local hires. This requirement reflects a desire to
maximize job opportunities for the local population and distribute the economic
benefits of Chinese investments. For example, in the construction of the Bui
Dam, the agreement between Sinohydro, the Chinese state-owned behemoth
contracted to complete the project, and the Ghanaian government stipulated
that a certain proportion of the workforce would be local.22 Chinese workers
may find it challenging to adapt to working alongside local colleagues who have
different cultural backgrounds, work practices, and expectations. Chinese work-
ers in Africa may struggle with language barriers when working with local col-
leagues who may speak different languages or dialects. Miscommunication can
lead to misunderstandings, errors, and strained working relationships. Accord-
ing to Business Insider, only 130 million of the approximately 1 billion people
in Africa speak English (13%).23 Moreover, many Chinese employees are themselves not
fluent in English, which compounds the communication problems they face while
living and working in Africa.
Chinese and African cultures also often have different work ethics and expec-
tations regarding punctuality, productivity, and work hours. Chinese workers
may be accustomed to a more rigorous work schedule and faster pace, while
local colleagues may have a different approach. These differences in work habits
can lead to friction and misunderstandings. Among Chinese employees in Tanzania,
adaptation to the material environment is the most successful, followed
by adaptation to life culture, while adaptation to work culture is poor.
More specifically, the Chinese employees surveyed believe that the overall environment
in African countries is kind and polite, but they find street scenes such as
begging, and what they perceive as a lazy working style, unacceptable.24 Chinese
workers’ adaptation to the local work culture may be poor, as suggested by the
example from Tanzania. Local work cultures in African countries can be signif-
icantly different from what Chinese workers are used to in terms of hierarchy,
decision-making processes, and the pace of work. This lack of adaptation can
hinder collaboration and productivity.
In summary, Chinese workers in Africa face various cultural barriers when it
comes to their working habits with local colleagues. These challenges encompass
22 Pippa Morgan, Andrea Ghiselli, “Chinese workers on Africa’s infrastructure projects: The link with host political regimes”
23 George Feng, Xianzhong Mu, “Cultural challenges to Chinese oil companies in Africa and their strategies”
24 Qingmin Li, “A study on cross-cultural adaptation of Chinese employees in Chinese-funded enterprises in Tanzania”

issues related to local employment requirements, language barriers, differences
in work ethic, adaptation to local work culture, and cultural sensitivity. Over-
coming these barriers requires cultural awareness, effective communication, and
a willingness to adapt and collaborate across cultural boundaries, which are es-
sential for the success of Chinese-funded projects in Africa and the harmonious
coexistence of a diverse workforce.

5 Potential Solutions
As China’s presence in Africa continues to grow, finding effective solutions to ad-
dress these challenges becomes increasingly crucial. This exploration delves into
innovative approaches and practical solutions aimed at improving the lives and
well-being of Chinese workers in Africa, while also promoting sustainable devel-
opment and fostering positive relations between the two regions. By addressing
these challenges comprehensively, we can pave the way for a more prosperous
and harmonious coexistence between Chinese workers and their African host
communities.
First, to effectively address the challenges faced by Chinese workers in Africa,
companies can play a pivotal role by implementing certain crucial provisions in
their employment contracts. To begin, it is imperative to establish compre-
hensive insurance coverage for employees, encompassing both health and life
insurance. Adequate coverage not only ensures access to healthcare in remote
areas but also offers financial security to workers and their families. Stipulat-
ing tax-free status for income earned in Africa can significantly enhance the
economic well-being of employees. Furthermore, companies should commit to
providing reliable transportation options and suitable living conditions to alle-
viate the challenges of remote work environments. Setting age limits for specific
job positions can ensure that employees are adequately equipped to handle the
physical demands of their roles, while also promoting safety and well-being. By
incorporating these conditions into employment contracts, Chinese companies
can proactively address the welfare and quality of life for their workers in Africa,
fostering a more conducive and harmonious working environment.
Second, implementing education sessions for Chinese workers in Africa is not
only feasible but also highly advisable, serving as a valuable recommendation
for companies operating on the continent. These sessions can be conducted by
local tutors or even experienced workers who have previously been employed in
the region. Such educational initiatives can cover a range of topics, including
language proficiency, cultural awareness, understanding forbidden zones, and
crucial safety instructions. By harnessing the expertise of local instructors or
experienced colleagues, companies can empower their workforce with the essen-
tial knowledge and skills needed to navigate the unique challenges of working in
African contexts. This proactive approach not only enhances the effectiveness
and efficiency of operations but also fosters a stronger sense of cultural integra-
tion and safety among Chinese workers, ultimately benefiting both employees
and the host communities.

From a business perspective, implementing educational sessions for Chinese
workers in Africa is a strategic investment that aligns with a company’s in-
terests on multiple fronts. These educational initiatives need not be costly,
making them a cost-effective way to equip workers with essential skills and
knowledge. By enhancing the attention span and engagement of employees,
companies can address the difficulty of lower productivity and efficiency often
associated with inadequate training and cultural adaptation. Lower turnover
rates among employees who have undergone such training are immensely ben-
eficial to businesses. The cost savings associated with reduced hiring and on-
boarding expenses, along with the preservation of valuable time and energy,
provide tangible benefits. Retaining experienced workers contributes to the
cultivation of expertise, ultimately leading to a more skilled and competent
workforce. Thus, these education sessions emerge as a win-win solution that
not only enhances workers’ capabilities but also proves highly advantageous for
a company’s long-term sustainability and profitability in the African context.
Third, the involvement of business organizations, such as chambers of com-
merce, in addressing the challenges faced by Chinese workers in Africa is a
valuable and collaborative solution. Chambers of commerce serve as associa-
tions or networks of business people dedicated to safeguarding and advancing
the interests of their members. These organizations often comprise business
owners sharing geographical or sectoral interests, and they can also have an
international scope. Companies operating in Africa can forge close partnerships
with chambers of commerce to navigate the complexities of the local business
environment effectively. These chambers often engage in regular dialogues with
local governments, facilitating the resolution of mutual challenges.
When conflicts or issues arise between African and Chinese workers, busi-
nesses can turn to chambers like the China-Africa Business Council (CABC)
for assistance. CABC can serve as a mediator and advisor, helping to find
amicable solutions to address societal conflicts within the workforce. One prac-
tical approach could involve conducting interviews or surveys among Chinese
employees to identify the most significant challenges they face in the business
context. This data can then be compiled into a comprehensive report, high-
lighting key concerns and areas that need attention. Subsequently, the CABC
can engage in discussions with national governments to propose further investment
in projects that result in a “win-win” situation for both Chinese workers
and local communities. For example, in regions where safety is a concern, col-
laborative efforts can be undertaken to improve the security of society. This
concern can be raised with the national government to initiate measures aimed
at enhancing safety and social well-being.

6 Conclusion
The life challenges faced by Chinese workers abroad are inadequately documented
and often overlooked by mainstream reporting, yet they are a significant issue that
deserves our attention. When these challenges are not thoroughly explored or

properly documented, they can have far-reaching consequences that affect indi-
viduals, communities, and even entire societies.
Whether it be hazardous working conditions, inadequate access to healthcare
services, or the need to adapt to a culturally diverse work environment, Chinese
workers confront many challenges that make it difficult for them to succeed. As
China continues to invest in its relationships with African countries, it should
also invest in the personal relationships between Africans and Chinese employ-
ees. Strengthening insurance requirements, providing additional training, and
improving communication through chambers of commerce will hopefully facili-
tate a mutually beneficial collaboration, addressing these challenges and creat-
ing an environment where both Chinese workers and African communities can
thrive and succeed together.
The recommendations above are even more urgently needed because of the
nearly 90% drop in Chinese workers in Africa due to COVID-19.25 Those who
remain are even more isolated than before and in even more need of assistance
from their employers. If Chinese workers are to return and continue building
China-Africa ties, they will need robust support from their home country.

References
[Afr14] Africa: China’s second continent. International Peace Institute, 2014.
[Bar23] Kate Bartlett. How Chinese private security companies in Africa differ
from Russia’s. VOA, 2023.
[Bus18] Stephanie Busari. Employed by China. CNN, 2018.
[Ho15] Ufrieda Ho. Chinese in South Africa learn to live with violence. South
China Morning Post, 2015.
[Hol23] Hereward Holland. Factbox: Recent coups in West and Central Africa.
Reuters, 2023.
[Kem18] Laurent Kemoe. How Africa can escape chronic food insecurity amid
climate change. IMF, 2018.
[Mor23] Pippa Morgan. Chinese workers on Africa’s infrastructure projects:
The link with host political regimes. Phys.org, 2023.
[Nan20] Paul Nantulya. Chinese security contractors in Africa. Carnegie
Endowment for International Peace, 2020.
[Nan21] Paul Nantulya. Chinese security firms spread along the African Belt
and Road. Africa Center for Strategic Studies, 2021.
[rep20] GT reporters. Chinese in S. Africa fear for safety amid rising murder
cases. Global Times, 2020.
25 China Africa Research Initiative, “DATA:CHINESE WORKERS IN AFRICA”

[Rig22] United Nations Human Rights. South Africa: UN experts condemn
xenophobic violence and racial discrimination against foreign nationals.
OHCHR, 2022.
[Sha18] Haydn Shaughnessy. Chinese companies are transforming business—and
the West is struggling to keep up. BRINK – Conversations
and Insights on Global Business, 2018.
[Sow18] Mariama Sow. Figures of the week: Chinese investment in Africa.
Brookings, 2018.
[WHO17] WHO. Food safety. Regional Office for Africa, 2017.
[Zou23] Li Zou. Study on the use and effectiveness of malaria preventive
measures reported by employees of Chinese construction companies
in Western Africa in 2021. BMC Public Health, 2023.

Unlocking Quercetin’s Therapeutic Potential:
The Use of Innovative Drug Delivery Strategies
to Remedy Neurodegenerative Disorders
Andrew Lee∗†
October 10, 2023

Abstract
Neurodegenerative disorders are a class of diseases characterized by
the degeneration of certain parts of the central and peripheral nervous
system, affecting millions of people worldwide. The use of quercetin for
the potential treatment of neurodegenerative disorders has been heavily
researched due to the flavonoid’s antioxidant, anti-inflammatory, metal-
ion chelating, and neuroprotective properties. However, free quercetin
struggles to make a significant clinical impact due to low aqueous sol-
ubility, chemical instability, and an unfavorable absorption profile, ulti-
mately leading to underwhelming levels of bioavailability. To ameliorate
these issues, drug delivery systems have been employed in modern re-
search, including polymer-based nanoparticles, lipid-based nanoparticles,
and metallic nanoparticles. This review aims to discuss modern research
on quercetin’s potential in neurodegenerative disorder treatment, partic-
ularly with these drug carriers, and identify the most promising configu-
rations for future investigation. Most notably, studies showed that drug
carriers for quercetin’s delivery increased the flavonoid’s bioavailability,
likely due to protective mechanisms against bodily chemical degrada-
tion. Additionally, many studies also found that drug carriers signifi-
cantly extended the duration of quercetin’s release within the body, al-
lowing for less frequent administration of the flavonoid during treatment
periods. Finally, drug delivery systems were shown to facilitate quercetin’s
crossing of the blood-brain barrier, an essential step in treating
neurodegenerative disorders. Though the use of quercetin-loaded drug
carriers for neurodegenerative disorder treatment is still a relatively new
topic of study, certain configurations have shown tremendous potential.
Most notably, liposomal delivery systems are especially promising can-
didates, and future studies should investigate their use in tandem with
PEGylation for quercetin’s neurodegenerative applications.
∗ Student at Tenafly High School in Tenafly, New Jersey
† Advised by: Dr. Paul Gehret of the University of Pennsylvania

1 Introduction
Quercetin (3,3’,4’,5,7-pentahydroxyflavone) is a widely abundant dietary polyphe-
nol and flavonoid found in a wide array of fruits, vegetables, and their respec-
tive derivatives. Quercetin is most profusely found in berries, leafy greens,
citrus fruits, onions, apples, red wine, and green tea [BL22] [SA07] [ADAP16]
[CCE+ 03] [SNS+ 13]. Throughout the past few years, quercetin has been heavily
researched and implemented in both food products and pharmaceuticals alike
due to the wide variety of promising health benefits exhibited [LW22] [PVK+ 22].
One of quercetin’s most prominent properties is its free radical scavenging
ability, which allows it to neutralize unpaired electrons that are notorious
for inducing inflammation and oxidative stress [ADB+ 89]. These stressors
gradually wear down the body, eventually to the point where disorders and
diseases can either be caused directly by the damage or arise more readily in
the body’s weakened state. The most common free radical-induced maladies
include viral infections [Aka01], neurodegenerative disorders [LBBD17], diabetes
[MSWI03], cardiovascular complications [ML97], and various cancers [VRM+ 06]
[DJ96] [RBS+ 00]. As a result, among other factors, quercetin has been found to
most significantly display anti-inflammatory [KVM+ 11], antioxidant [HCS+ 18],
anticancer [HDFY+ 17], antidiabetic [SVKP18], antimicrobial [WYZ+ 18], an-
tiviral [SLL+ 21], hepatoprotective [EAPA+ 17], and neuroprotective effects in
vivo [BGP+ 20].
In the past decade, quercetin has seen a rise in preclinical and clinical re-
search for potential treatments of neurodegenerative disorders, including Alzheimer’s
Disease [MPSS+ 17], Parkinson’s Disease [SWM+ 12], Huntington’s Disease [CSD+ 13],
Amyotrophic Lateral Sclerosis (ALS) [BMSD20], and Multiple Sclerosis (MS)
[AEG+ 23]. As a result of quercetin’s free radical scavenging and neuroprotec-
tive properties, quercetin has exhibited the ability not necessarily to reverse
the effects of neurodegenerative disorders but rather to hinder and mitigate
their progression within the body. As neurodegenerative disorders are most
often characterized by neurological degradation and are closely linked to old
age, slowing down their progression can play a tremendous role in increasing
the life expectancy of those affected. However, despite the myriad of
pharmaceutical benefits that quercetin possesses, the flavonoid alone has
struggled to make a significant clinical impact due to low aqueous solubility,
chemical instability, and an unfavorable absorption profile, ultimately leading
to an underwhelming bioavailability [MMD+ 97]. Quercetin’s hydrophobic
properties, for one, make absorption into the bloodstream extremely difficult.
Additionally, the highly reactive and pH-sensitive nature of quercetin makes it
prone to chemical alterations such as deprotonation when passing through the
acidic environment of the gastrointestinal tract, which furthers the
flavonoid’s lack of solubility and bioavailability [ZZM21]. In attempts to
ameliorate these issues, recent studies and developments in quercetin
applications encompass a wide range of drug delivery systems for the flavonoid,
including lipid-based nanoparticles, polymer-based nanoparticles, and metallic
nanoparticles [VM19]. This review discusses how the numerous pharmacological
properties of quercetin can be harnessed through the use of novel drug delivery
systems for the potential treatment of neurodegenerative disorders.

2 Chemical Structure and Properties of Quercetin


Flavonoids are a class of polyphenolic compounds structurally characterized by
three rings: two aromatic rings joined by a third ring that is usually an
oxygen-containing heterocycle [SJC+ 21]. Among them are many subclasses with
varying functional groups, including flavones, flavonols, isoflavones,
flavanones, anthocyanins, and flavan-3-ols. Quercetin,
a flavonol, distinguishes itself from the other flavonoids through its unique chem-
ical structure, boasting five hydroxyl groups (circled in red) in addition to a
ketone group (circled in green) (Fig. 1) [MS19]. This unique structure provides
quercetin with an expansive set of chemical properties fit to combat neurode-
generative disorders.

Figure 1: Chemical structure of quercetin (C₁₅H₁₀O₇). Hydroxyl groups (-OH)
circled in red. Ketone group (C=O) circled in green.

2.1 Antioxidant Activity


The antioxidative properties of quercetin stem directly from the flavonoid’s free
radical scavenging ability [MCM+ 98]. Free radicals, most commonly found in
the form of reactive oxygen species (ROS) and reactive nitrogen species (RNS),
are molecules containing one or more unpaired electrons [PKVP10]. As highly
unstable species that seek stability, free radicals readily attack cells and tis-
sues in search of additional electrons to pair with their lone electron. As a
result, new free radicals are formed, setting off a chain reaction that creates
a system of oxidative stress [Aru98]. Over time, these unpaired electrons can
induce detrimental cellular and tissue damage, leading to a variety of different
complications [Ril94]. However, this accumulation of oxidative stress within
the body can be mitigated and even prevented through the effects of
antioxidants [Aru98]. Quercetin, with its five hydroxyl groups, targets these
free radicals, neutralizing them by donating hydrogen atoms [PHH99].
Additionally, the creation of new free radicals is limited, as the double bonds
and ketone group present in quercetin’s chemical structure delocalize the
unpaired electron left on the flavonoid, distributing it throughout the
molecule [WSRE04]. As a result, oxidative stress is unable to amass within
the body, lowering the risk of free radical-induced maladies.

2.2 Anti-inflammatory Activity


Quercetin’s anti-inflammatory strength can be attributed to a combination of
the flavonoid’s free radical scavenging abilities and inhibitory capacities. In-
flammation within the body is a necessary defense mechanism against harm-
ful stimuli, including pathogens, toxins, damaged cells, and irradiation. How-
ever, inflammation that continues over an extended period of time can be-
come detrimental. Paradoxically, prolonged inflammation can lead to severe
cell and tissue damage, ultimately aiding in the development of various mal-
adies [AUID+ 07] [MKMH15]. However, such states of inflammation can be
combated through the use of anti-inflammatory agents such as quercetin. As
free radicals play a significant role in the induction of damage to cells and
tissues through oxidative stress, quercetin’s free radical scavenging ability can
play a tremendous role in the flavonoid’s anti-inflammatory prowess [LBS+ 18].
Additionally, quercetin possesses the ability to inhibit several inflammatory en-
zymes and mediators, as well as block inflammatory receptor sites, furthering
its anti-inflammatory strength [GBS+ 15] [Chi10]. In order for inflammation
to form, a series of events must occur. To begin, upon detection of harmful
stimuli, immune cell receptors initiate intracellular signaling pathways to
activate transcription factors such as nuclear factor-kappa B (NF-κB)
[LZJS17]. These transcription factors enter the nucleus of immune cells and
bind to specific DNA sequences, switching on pro-inflammatory genes that encode
specific enzymes, including cyclooxygenases (COX) and lipoxygenases (LOX).
Upon synthesis, these enzymes are released from the immune cell and produce
inflammatory mediators, including prostaglandins and leukotrienes [BBW80].
These inflammatory mediators are responsible for recruiting new immune cells.
By binding to surrounding immune cells, the mediators can restart the process,
making the production of inflammation cyclic [AAA+ 18]. However, quercetin
can inhibit the signaling pathway of this cycle through two main methods: the
downregulation of a specific gene’s expression and the direct binding to receptor
sites. By directly binding to transcription factors such as nuclear factor-kappa
B (NF-κB), quercetin can downregulate pro-inflammatory gene expression and
minimize enzyme levels in inflamed tissues, decreasing the immune cell count
and the overall inflammation in a given area [VHS+ 11]. Furthermore, by bind-
ing directly to specific receptor sites, quercetin can completely block certain
events from occurring, including the binding of inflammatory mediators to im-
mune cell receptors, which is the step responsible for restarting the process of
inflammation generation [KG99]. It is through these activities that quercetin
acts as an anti-inflammatory agent.

2.3 Metal Ion Chelating Activity
Quercetin exhibits a strong metal ion chelating ability due to its unique chemical
structure. Metal ions, such as potassium and iron, have shown dietary benefits
in trace amounts and are essential for human life. However, an oversupply of
these metallic ions and notably even trace amounts of heavy metal ions, such
as lead and mercury, can be highly toxic and, in some cases, lethal [KYSO07].
Similar to free radicals, these metallic ions are highly unstable and can catalyze
the creation of reactive oxygen species (ROS), generating a system of oxidative
stress [Sta90]. To combat the detrimental effects of these metallic ions, metal
ion chelators such as quercetin can be employed. Quercetin possesses the ability
to act as a ligand and form coordinate covalent bonds with metal ions, creating
quercetin-metal complexes [RRD14]. By donating hydrogen ions from its hydroxyl
groups, quercetin is able to neutralize the charge of the metal ion (Fig. 2).
Multiple quercetin molecules often contribute to this effort, sequestering
the metal ion within the complex, which keeps it from participating in the
redox reactions that generate reactive oxygen species (ROS) (Fig. 2). Through
these chelating mechanisms, quercetin is able to limit the detrimental effects
of metal ions.

Figure 2: Quercetin-Aluminum Ion Chelation. Three quercetin molecules create
a coordination complex with the aluminum ion, which has a charge of +3.
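
The charge balance behind Fig. 2 can be written out as a short worked equation. This is a simplified sketch that assumes each of the three quercetin ligands (written here as QH, a shorthand not used elsewhere in this review) gives up exactly one hydroxyl proton on binding. The actual protonation state of the complex depends on pH and is not specified in the text.

```latex
\[
\mathrm{Al^{3+}} + 3\,\mathrm{QH} \longrightarrow \mathrm{Al(Q)_3} + 3\,\mathrm{H^{+}}
\]
\[
\text{net charge of the complex} = (+3) + 3 \times (-1) = 0
\]
```

Under this assumption, the +3 charge of the aluminum ion is offset by three singly deprotonated quercetin ligands, which is one way to read the charge neutralization described in this section.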

2.4 Other Neuroprotective Activities


In addition to quercetin’s antioxidative, anti-inflammatory, and metal ion chelat-
ing abilities, the flavonoid possesses a few other neuroprotective properties that
make it fit to combat neurodegenerative disorders. For one, quercetin acts as a
mitochondrial protective agent, protecting cellular mitochondria from damage.
As mitochondria are the energy-producing organelles in cells, including neurons,
mitochondrial dysfunction can be detrimental to neuronal health [FBS+ 07].
By helping to maintain mitochondrial membrane potential through the
upregulation of genes that induce mitochondrial biogenesis, including
peroxisome proliferator-activated receptor-gamma coactivator 1-alpha (PGC-1α),
and by limiting the buildup of oxidative stress and damage, quercetin
can drastically reduce neuronal apoptosis rates, showcasing its neuroprotective
prowess [XWG+ 16]. Another way quercetin can promote neuronal health is by
enhancing autophagy, which prevents the buildup of neurotoxic proteins and
other harmful cellular components [WLL+ 11]. The amassing of misfolded pro-
teins within the brain often characterizes neurodegenerative disorders, and by
binding to and modulating specific proteins such as AMP-activated protein ki-
nase (AMPK) and Beclin-1, quercetin can stimulate autophagy within neurons,
further promoting their health [WLL+ 11] [KAAS16]. Ultimately, a combination
of all of quercetin’s neuroprotective properties aids the flavonoid in combating
neurodegenerative disorders.

2.5 Bioavailability Struggles


Despite the multitude of pharmaceutically beneficial properties quercetin pos-
sesses, the flavonoid has struggled to make a significant impact in the realm of
pharmaceutics and dietary supplements for three main reasons: low aqueous sol-
ubility, chemical instability, and poor absorption profile [CFD+ 13]. Quercetin is
a hydrophobic compound sparingly soluble in aqueous environments, including
the human body. When administered orally, as is most common, quercetin often
fails to dissolve and instead aggregates into crystals too large to pass
through the intestinal epithelium and be absorbed into the bloodstream.
However, even reaching the intestinal epithelium would be a feat for quercetin, as the
flavonoid also struggles to cross through the acidic environment of the gastroin-
testinal tract. Due to the surplus of hydrogen ions that attack quercetin’s hy-
droxyl groups en route to absorption, the flavonoid structurally degrades, further
reducing its bioavailability [KTMC22]. Finally, quercetin is transported to the
liver upon absorption, where it often undergoes rapid metabolism and further
breakdown, preparing it for bodily excretion rather than circulation [HZL+ 20].
In attempts to ameliorate quercetin’s struggles, modern research has turned to
drug delivery encapsulation systems, investigating their potential use in improv-
ing quercetin’s bioavailability without compromising therapeutic potential.

3 Drug Carriers for the Delivery of Quercetin


Modern drug delivery applications encompass a wide variety of methodologies,
but in regard to quercetin, research has focused on oral delivery through the
use of nanoparticles and nanoformulations. As quercetin’s primary struggles
are aqueous insolubility and a lack of chemical stability, research has aimed
to find carriers that remedy these issues while simultaneously preserving and
enhancing the flavonoid’s health benefits [KTMC22]. The most successful de-
livery systems concerning these efforts include polymer-based, lipid-based, and
metallic nanoparticles [VM19].

3.1 Polymer-based Nanoparticles


3.1.1 PLGA & PLA
Polymer-based nanoparticles are among the most widely researched and im-
plemented nanoparticle variations in drug delivery due to the chemical and
structural manipulability of polymers in addition to their cell targeting abil-
ities [IZS+ 20]. The most popular polymers used in nanomedicine, including
the delivery of quercetin, are polylactic acid (PLA) and poly(lactic-co-glycolic
acid) (PLGA) due to their extreme versatility as drug carriers. Both deriva-
tives of lactic acid, PLA and PLGA share a copious number of properties and
abilities, such as their biodegradable and biocompatible natures [LS13]. No-
tably, PLGA, a copolymer composed of both lactyl and glycolyl groups, can be
engineered to improve the polymer’s solubility by increasing the ratio of gly-
colyl to lactyl groups [PWD+ 04]. However, while PLA and PLGA can differ
in solubility profiles, both polymers offer quercetin protection from chemical
degradation within the body [LS13]. As the polymers completely encapsulate
quercetin as a nanoparticle, the acidic environment of the gastrointestinal tract
and other reactive species are no longer a problem for the flavonoid [PQF+ 12].
Resultantly, the bioavailability of quercetin can be improved. Pool et al. used
quercetin-encapsulated PLGA nanoparticles, approximately 400 nm in diame-
ter, prepared by solvent displacement to illustrate improvements in quercetin’s
antioxidative properties when loaded in PLGA compared to its free counterpart.
PLGA-quercetin nanoparticles were found to have greater inhibition of nitroblue
tetrazolium (NBT) reduction in vitro (Fig. 3), which could potentially
translate to higher levels of quercetin bioavailability in vivo [PQF+ 12]. In
that same study, quercetin-loaded PLGA nanoparticles also showed tremendous
improvements in Fe²⁺ ion-chelating activity, albeit taking more time to reach
such effects (Fig. 4). While free quercetin’s chelating activity diminished
quickly after administration, the chelating activity of PLGA-quercetin
nanoparticles exhibited a gradual increase in intensity over a 32-hour period
(Fig. 4), indicative of the controlled release potential of quercetin-loaded
PLGA nanoparticles [PQF+ 12]. Similarly, Di Cristo et al. used PLA-quercetin
nanofibers fabricated through electrospinning to depict the controlled release
properties that PLA can offer in the delivery of quercetin. These PLA-quercetin
nanofibers were found to exhibit free radical 2,2-diphenyl-1-picrylhydrazyl
(DPPH) scavenging activity up to 48 hours after in vitro administration of the
drug [DCVDL+ 22]. Overall, PLA and PLGA carriers have shown tremendous
preclinical promise in their delivery of quercetin.

Figure 3: Percentage inhibition of nitroblue tetrazolium (NBT) reduction of free
catechin (CAT), free quercetin (QC), catechin-loaded PLGA nanoparticles, and
quercetin-loaded PLGA nanoparticles. Measurements taken at three different
concentrations: 7 µM, 21 µM, and 35 µM. Free PLGA nanoparticles were used
as control [PQF+ 12].

Figure 4: Fe²⁺ ion chelating activity of free quercetin and PLGA-encapsulated
quercetin at 20 µM and 100 µM. Free PLGA nanoparticles were used as a
control. Chelating activity was measured after 0.25 hours, 4 hours, 12 hours,
24 hours, and 32 hours [PQF+ 12].
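
The contrast in Fig. 4 between a fast-fading free dose and a longer-lasting encapsulated dose can be illustrated with a toy kinetic model. The sketch below is not taken from [PQF+ 12]: it assumes first-order release from the carrier and first-order loss of active quercetin, and the rate constants `K_RELEASE` and `K_LOSS` are hypothetical values chosen only for illustration. The time points are those listed in the Fig. 4 caption.

```python
import math

# Hypothetical first-order rate constants (per hour), chosen for illustration only.
K_RELEASE = 0.05  # rate at which the carrier releases quercetin
K_LOSS = 0.5      # rate at which released (free) quercetin loses activity


def free_active_fraction(t_hours, k_loss=K_LOSS):
    """Fraction of a free dose still active at time t (first-order decay)."""
    return math.exp(-k_loss * t_hours)


def encapsulated_active_fraction(t_hours, k_release=K_RELEASE, k_loss=K_LOSS):
    """Fraction of an encapsulated dose that is released and still active at time t.

    Closed-form solution of
        dA/dt = k_release * exp(-k_release * t) - k_loss * A,  A(0) = 0,
    which is only valid when k_release != k_loss.
    """
    if math.isclose(k_release, k_loss):
        raise ValueError("k_release and k_loss must differ in this closed form")
    scale = k_release / (k_loss - k_release)
    return scale * (math.exp(-k_release * t_hours) - math.exp(-k_loss * t_hours))


# Time points (hours) taken from the Fig. 4 caption.
TIME_POINTS = [0.25, 4, 12, 24, 32]

if __name__ == "__main__":
    for t in TIME_POINTS:
        print(f"t = {t:>5} h  free = {free_active_fraction(t):.3f}  "
              f"encapsulated = {encapsulated_active_fraction(t):.3f}")
```

With these assumed constants, the free dose is the more active of the two shortly after administration, while the encapsulated dose is the more active at 12 hours and beyond. This captures the qualitative idea behind controlled release; it does not reproduce the measured values or the continued rise over 32 hours reported in the study.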

3.1.2 Chitosans & Polyethylene Glycols (PEGs)
In addition to PLA and PLGA, a variety of polymer-based nanoparticle de-
livery systems have shown potential for quercetin delivery, including chitosan
nanoparticles and polyethylene glycol conjugations (PEGs) [WSM+ 16]. Chi-
tosans are natural polysaccharides that share many of the same properties
that PLA and PLGA exhibit, such as drug protective and bioavailability en-
hancing abilities, making the polymer a strong candidate for drug delivery
applications [DJ18]. Additionally, due to their chemical structure, chitosans
are entirely hydrophilic, resulting in high aqueous solubility through hydrogen
bonding from their hydroxyl and amine groups. However, what sets chitosans
apart from other polymers as a drug delivery system is their mucoadhesive
properties, which can tremendously boost quercetin’s bioavailability [SWK08].
Because the lone electron pair on each amine group readily accepts a proton,
chitosans carry a positive charge, awarding the polymer an affinity for
negatively charged mucosal surfaces (e.g., the lining of the gastrointestinal
tract). Through electrostatic attractions, the chitosan nanoparticle becomes
tightly bound to the intestinal
epithelium, allowing for effective quercetin absorption into the bloodstream,
thus increasing the flavonoid’s bioavailability [SWK08]. Baksi et al. demon-
strated this increase, measuring significantly lower IC50 levels in A549 and
MDA MB 468 tumor cell lines for quercetin-chitosan nanoparticles, which were
prepared by ionic gelation, compared to free quercetin [BSB+ 18]. Additionally,
quercetin-loaded chitosan nanoparticles showed more significant reductions in
tumor volume and weight in vivo [BSB+ 18]. In another study, Mukhopadhyay
et al. showed tremendous drops in blood glucose levels in HT29 cell lines in
vitro using quercetin-succinylated chitosan-alginate core-shell-corona nanopar-
ticles [MMM+ 18]. While various particle size groups were tested, the smallest
group, approximately 91.58 nanometers in size, was found to be most efficient for
quercetin’s oral delivery [MMM+ 18]. On the other hand, polyethylene glycols
(PEGs), while sharing many of the same chemical properties as the other poly-
mers, play a unique role in drug delivery applications. PEGs, like chitosans,
are hydrophilic polymers that have high water solubility and biocompatibil-
ity, making them excellent candidates for drug delivery applications. However,
PEGs are non-biodegradable polymers, limiting their use alone as a drug car-
rier. To remedy this issue, researchers have begun to use novel PEG conjuga-
tions where PEG is used in concert with other nanoparticles to improve the
polymer’s biodegradability while being able to harness PEG’s unique chemical
properties for drug delivery applications. By attaching PEG chains to the sur-
face of other nanoparticles in a process titled PEGylation, the polymer is further
able to protect and stabilize the delivery system, creating a stealth effect for the
nanoparticle [LJS+ 21]. These long PEG chains sterically shield the
nanoparticle due to their large yet flexible nature, physically obstructing
other molecules from binding to the carrier’s surface. Consistent with this,
Li et al. found that longer attached PEG chains correlate with fewer
interactions between the nanocarrier and other cells within the body [LJS+ 21].
Additionally, due to their hydrophilic nature, these PEG chains create a
film-like layer of water that encapsulates the carrier, protecting it from
protein adsorption and phagocytosis, which ultimately increases the duration of
quercetin’s bioavailability. Qureshi et al. used PEGylated PLGA-quercetin
nanoparticles, which were prepared through double emulsion encapsulation, to
show reduced viability of the MDA-MB-231 cell line in vitro [QZW+ 16].
Additionally, in vitro tumor targeting and growth inhibition
were shown with tremendous success when doxorubicin was co-delivered
with quercetin, illustrating the nanoparticle’s potential in targeted drug deliv-
ery [QZW+ 16]. Ultimately, despite their lack of biodegradability, PEGs have
shown tremendous promise in the field of targeted and controlled-release drug
delivery.
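
The IC50 comparison reported by Baksi et al. earlier in this subsection rests on reading a half-maximal inhibitory concentration off a dose-response curve. As a minimal sketch of one common way to do this, the function below interpolates on a log-concentration scale between the two doses that bracket 50% viability. The function name `estimate_ic50` and both data sets are hypothetical and are not values from [BSB+ 18]; published analyses often fit a full sigmoidal curve instead.

```python
import math


def estimate_ic50(concentrations, viabilities):
    """Estimate IC50 by log-linear interpolation between bracketing doses.

    concentrations: positive doses (any consistent unit, e.g. micromolar).
    viabilities: percent cell viability measured at each dose.
    Returns None if viability never crosses 50% within the tested range.
    """
    points = sorted(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            # Position of the 50% crossing between the two doses (0 to 1).
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response data, for illustration only.
CONCENTRATIONS_UM = [5, 10, 20, 40, 80]
FREE_VIABILITY = [95, 85, 70, 55, 35]   # free quercetin (assumed values)
NANO_VIABILITY = [80, 60, 40, 25, 15]   # nanoparticle formulation (assumed values)

if __name__ == "__main__":
    print("free IC50 (uM):", estimate_ic50(CONCENTRATIONS_UM, FREE_VIABILITY))
    print("nano IC50 (uM):", estimate_ic50(CONCENTRATIONS_UM, NANO_VIABILITY))
```

A lower IC50 means that less compound is needed to halve cell viability, which is why a drop in IC50 for the nanoparticle formulation is read as an increase in effective potency.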

3.2 Lipid-based Nanoparticles


Lipid-based nanoparticles are another widely researched type of nanoparticle
for quercetin delivery due to their biocompatibility and versatility as drug
carriers, with liposomes and micelles among the most popular. These
lipid-based nanoparticles differ structurally, exhibiting unique chemical
properties that give them different potentials in drug delivery applications.
Liposomes are
a variation of lipid-based nanoparticles consisting of a lipid bilayer, that is,
two layers of lipid molecules [ARSD+ 13]. As lipid molecules are amphiphilic,
their heads hydrophilic and tails hydrophobic, the outer layer contains lipid
molecules with their heads facing outwards, creating a hydrophilic surface for
increased nanoparticle dispersion in water, while the inner layer contains inward-
facing lipid molecules, creating an aqueous core for encapsulating hydrophilic
substances (Fig. 5). Additionally, due to the hydrophobic nature of the lipid
molecules’ tails, a hydrophobic section is created between the two layers for
the potential encapsulation of hydrophobic substances (Fig. 5) [ARSD+ 13].
Due to this unique bilayer structure, liposomes are able to increase the solubility
and stability of quercetin. Additionally, the lipid bilayer also awards liposomes
high encapsulation efficiency, which can lead to better controlled and sustained
release of quercetin over a long period of time [AABM+ 19]. Patel et al. used
liposome-quercetin nanoparticles prepared by thin-film hydration to illustrate
this potential for sustained release [PTK+ 20]. While these liposomal quercetin
nanoparticles showed significant improvements in breast cancer tumor reduction
potentials compared to their free counterpart, they, more importantly, exhibited
this behavior consistently over a 30-day period in vitro despite only three admin-
istrations of the drug throughout the duration [PTK+ 20]. Priprem et al. used
a similar method of encapsulation to create liposome-quercetin nanoparticles
and experimented with their cognitive-enhancing properties in vivo [PWS+ 08].
Researchers administered these liposomal nanoparticles to rats and had them
traverse through a water maze, training them to memorize the location of hidden
platforms within the labyrinth repeatedly over a 28-day period. By measuring
the time it took the rats to find the platform during each trial, researchers were
able to gather data on the cognitive enhancing abilities of quercetin. The rats
who had been orally administered quercetin-loaded liposomes exhibited similar
acquisition times to other groups, but curiously enough, those who had been
administered the drug intranasally saw tremendous improvements in their platform
acquisition time [PWS+ 08]. This could potentially be indicative of a blood-brain
barrier crossing struggle that the oral administration of liposomes faces, which
intranasal delivery can remedy. In addition to liposomes, micelles are another
lipid-based nanoparticle that has been heavily researched for quercetin delivery.
Micelles are assemblies of amphiphilic molecules that structurally contain
only one layer of lipids. The heads of these lipid molecules face outwards,
creating a hydrophilic surface like that of liposomes but a hydrophobic core
more favorable for quercetin delivery (Fig. 5) [Men79]. Even so, micelles
struggle in vivo compared to liposomes due
to their thinner outer shell, which leads to decreased drug protective proper-
ties and stability as a nanoparticle. As such, micelles are sensitive to stimuli,
spontaneously dissociating when faced with rapid temperature and pH changes,
contributing to their unpredictable drug-release properties [WCZZ09] [GLL13].
However, micelles have shown promise in quercetin delivery when coupled with
polyethylene glycols (PEGs) through PEGylation. Lv et al. employed thin-film
hydration to create PEGylated quercetin-loaded micelles, administered them to
rats, and found that the blood plasma quercetin concentrations of those given
the PEGylated micelle conjugation were significantly greater. Even more
notable was that the PEGylated micelle rats also maintained quercetin within
their bloodstream for almost 50 hours, quadrupling the duration of those
administered free quercetin [LLL+ 17]. Using a similar formulation technique, Qi
et al. created and used PEGylated quercetin-loaded micelles and found signif-
icant H22 cell line tumor growth inhibition as well as reduction potentials for
existing tumors in vivo [QGY+ 22]. These effects lasted up to 15 days after treat-
ment, also illustrating the controlled release potential of PEGylated micelles in
quercetin delivery [QGY+ 22]. Though often outclassed by polymeric nanopar-
ticles in quercetin delivery applications, lipid-based nanoparticles certainly have
their place within the field.

Figure 5: (i) Diagram of a Liposome. Lipid bilayer creates a hydrophilic surface
and core in addition to a hydrophobic inner-bilayer area. (ii) Diagram of a
Micelle. Lipid monolayer creates a hydrophilic surface and a hydrophobic core.

3.3 Metallic Nanoparticles
Metallic nanoparticles are another form of drug carrier that has been
researched for its potential in quercetin delivery. While most metals,
particularly heavy metals such as lead and mercury, have exhibited
non-biodegradable and cytotoxic properties within the body [KYSO07], certain
metals, such as iron and silver, have proven to be more biocompatible, making
them viable candidates for drug delivery [BMTC22] [JRM+ 08]. In addition to
increasing the solubility of quercetin through encapsulation, these biocompati-
ble metallic nanoparticles have also exhibited protective effects, increasing the
flavonoid’s stability [KMZM03]. Unfortunately, due to the high reactivity of
metals within the body, these metallic nanoparticles are susceptible to break-
down due to temperature and pH variations [LCK16]. However, this reactivity
can work in favor of metallic nanoparticles as the property makes them easy to
manipulate and engineer chemically. Most notably, functional groups and other
biomolecules, such as antibodies and peptides, can be attached to the nanopar-
ticle’s surface for targeted delivery to specific cells and molecules. Additionally,
other molecules such as polymers, surfactants, and ligands can be incorporated
into the surface of metallic nanoparticles, partially remedying their extreme
reactivity and enhancing the overall stability of the nanoparticle [ZLA+ 11].
Najafabadi et al. used a novel iron drug carrier system, quercetin-conjugated
iron oxide nanoparticles (QT-SPION), for the delivery of quercetin in vivo and,
using high-performance liquid chromatography (HPLC), showed tremendous
increases in quercetin concentrations within the brain tissue of rats
[NKE+ 18]. Additionally, negligible effects on iron concentrations within the
brain and blood plasma were shown. Ultimately, through the use of the
quercetin-iron oxide nanoparticle, the crossing of the blood-brain barrier was
improved, which was shown by increased quercetin concentrations in the brain
[NKE+ 18]. Metallic nanoparticles have their benefits and drawbacks in
quercetin delivery, but they have certainly shown promise in targeting
neurodegenerative disorders.

4 Applications of Quercetin Delivery for the Treatment of Neurodegenerative Disorders
Neurodegenerative disorders are a class of disorders characterized by neuronal
degeneration within the central or peripheral nervous system [ENM+ 11]. The
likelihood of neurodegenerative disease development dramatically increases with
respect to age [BK01], and the most common diseases (e.g., Alzheimer’s Disease
and Parkinson’s Disease) plague millions of people across the world [N/A16].
Though modern research has focused on developing treatments for
neurodegenerative disorders, no cures currently exist [VLA+ 18]. However,
bioactive flavonoids, such as quercetin, have been found successful in slowing
the progression of these disorders within the body. As free radicals and
oxidative stress are major factors in neurodegeneration, quercetin’s free
radical scavenging and antioxidative properties aid the flavonoid in combating
neurodegenerative disorders [AAAH86]. Additionally, quercetin’s
anti-inflammatory, metal
ion chelating, and overall neuroprotective abilities play a role in slowing down
neurodegeneration [DJM13]. Finally, variations of quercetin-loaded nanopar-
ticles have demonstrated the ability to cross through the blood-brain barrier
by improving the bioavailability of the flavonoid [RMS+ 20]. Ultimately, it is
quercetin’s unique repertoire of chemical properties that allows it to combat
neurodegenerative disorders, including Alzheimer’s Disease, Parkinson’s Dis-
ease, Huntington’s Disease, Amyotrophic Lateral Sclerosis (ALS), and Multiple
Sclerosis (MS).

4.1 Alzheimer’s Disease


Alzheimer’s Disease is a disorder characterized by the degeneration and death
of neurons within the brain, often leading to cognitive impairments and mem-
ory loss [LHS18]. In addition to free radicals and oxidative stress, the onset
of Alzheimer’s can be attributed to the accumulation of protein aggregates,
including amyloid-β plaques and tau tangles [SS11] [WBM21]. The former,
amyloid-β, is a peptide that can clump together with other biomaterials to
create deposits that can disrupt neuronal communication, inducing a cytotoxic
effect on the cells [SS11]. Tau, on the other hand, is a protein responsible
for nutrient transport in neurons. However, when these tau proteins misfold
into abnormal shapes, they create neurofibrillary tangles, which can disrupt
the transport of essential nutrients within neurons, causing their dysfunction
and ultimate death [WBM21]. To remedy these protein aggregates, quercetin’s
neuroprotective properties can be employed. By binding to and modulating
proteins such as AMP-activated protein kinase (AMPK), quercetin can activate and
enhance autophagy-related pathways, working to clear and prevent the amassing
of amyloid-β peptides and tau proteins, which ultimately hinders the progression
of Alzheimer’s disease [WLL+ 11]. Additionally, as metal ions, such as copper,
zinc, and iron, have been found to contribute to the formation of amyloid-β
plaques and tau tangles, quercetin’s metal ion chelating ability can also be
useful in preventing these formations [ADB+ 89]. Sun et al. showed significant and
consistent inhibition of amyloid-β42 aggregation over the course of 60 hours in
vitro using PLGA-quercetin nanoparticles prepared through double emulsion-solvent
evaporation. In that same study, an MTT assay showed that the PLGA-quercetin
nanoparticles had an inhibitory effect on amyloid-β42-induced cytotoxicity,
which was most significant at concentrations of 5-40 µg/mL [SLZ+ 16].
Pinheiro et al. used quercetin-loaded solid lipid nanoparticles to show a similar
inhibitory effect on the aggregation of amyloid-β(1-42) just 24 hours
after administration in vitro [PGL+ 20]. Though little quercetin drug delivery
research has been done in vivo, the approach has potential, and considering
that the drug delivery of quercetin for therapeutic purposes is a relatively new
concept, its use for the treatment of Alzheimer’s is promising.

4.2 Parkinson’s Disease
Parkinson’s Disease is a disorder characterized by a neurological undersupply of
dopamine, most often due to the degeneration of dopamine-producing neurons
located in the brain’s substantia nigra [DP03]. As dopamine is responsible for
facilitating smooth and coordinated motor movements, an undersupply of the
neurotransmitter can cause difficulties in executing simple motor movements,
leading to bradykinesia, involuntary tremors, muscle rigidity, and a variety of
other symptoms [DP03]. While free radical-induced oxidative stress can play a
role in the death of dopamine-producing neurons, the aggregation of the misfolded
protein α-Synuclein within the brain can also be a major factor in the
pathogenesis of Parkinson’s Disease [BWU12]. By enhancing the autophagy of misfolded
α-Synuclein proteins and aiding in the maintenance of mitochondrial membrane
potentials, quercetin can prevent the death of dopamine-producing neurons
and mitigate the onset of Parkinson’s. Wang et al. administered quercetin
in vitro to 6-hydroxydopamine-treated PC12 cells and showed increased autophagy
of dysfunctional mitochondria and α-Synuclein [WHH+ 21]. In that same study,
in vivo oral administration of quercetin over 14 days to
6-hydroxydopamine-lesioned parkinsonian rats reduced levels of reactive
oxygen species (ROS) and of malondialdehyde (MDA), a marker of oxidative
damage, and improved levels of the ROS-metabolizing enzyme superoxide
dismutase (SOD) (Fig. 6), exhibiting promise for Parkinson’s treatment
[WHH+ 21]. Karuppagounder et al. orally administered quercetin to
rotenone-induced hemi-parkinsonian rats over a period of 4 days and found
significant and consistent increases in dopamine levels within the brain [KMP+ 13].
However, though free quercetin showed promise in rat models of Parkinson’s
Disease, the flavonoid’s effectiveness would likely not be as pronounced in
humans, so large excess doses of quercetin would have to be administered to
attain the desired effect. Through the use of drug delivery, quercetin’s
properties can be more efficiently harnessed, enhancing its potential for
treating Parkinson’s Disease.

Figure 6: Effects of quercetin administrations at concentrations 10 and 30
mg/kg on 6-hydroxydopamine-lesioned parkinsonian rats. (A) Malondialdehyde
(MDA) levels. (B) Reactive oxygen species (ROS) levels. (C) Superoxide
dismutase (SOD) levels [WHH+ 21].

4.3 Huntington’s Disease
Huntington’s Disease is a disorder characterized by the degeneration of nerve
cells in a part of the brain known as the basal ganglia, leading to a decline
of motor control and cognitive ability [ABF+ 00]. Symptomatically, Hunting-
ton’s Disease shares similarities with Parkinson’s Disease as they both hinder
smooth, coordinated movements, but while Parkinson’s is associated with the
undersupply of dopamine, the pathogenesis of Huntington’s has been found to be
genetic. Huntington’s Disease is caused by a mutation in the HTT gene, which
is responsible for encoding the huntingtin protein [PRY+ 19]. The resultant
protein is significantly longer as a result of the mutation, making it prone to
misfolding and aggregation. By encouraging the buildup of oxidative stress and
mitochondrial dysfunction, the accumulation of mutated huntingtin proteins can
become cytotoxic to neurons [PRY+ 19]. However, quercetin has shown potential
in combating Huntington’s Disease. By enhancing mutated huntingtin protein
autophagy and anti-inflammatory activity within the brain, quercetin can slow
the onset of Huntington’s. Additionally, the flavonoid can encourage the
biogenesis of mitochondria, helping to mitigate previously inflicted neuronal
damage [DMCD09]. Sandhir et al. orally administered quercetin to 3-nitropropionic
acid-induced rat models of Huntington’s Disease over the course of 21 days
and tested their motor movement and control by measuring their performance
on a balance beam test. The researchers found that quercetin administration over
time led to improved motor control and balance, as measured by
faster balance beam completion times as well as fewer paw slips during the
test [SM13]. Chakraborty et al. found that oral quercetin administration over
four days produced similar improvements in motor function in 3-nitropropionic
acid-induced rat models of Huntington’s, measured by increases in stride
as well as higher rates of success in completing an obstacle course compared
to untreated counterparts [CSD+ 13]. Still, quercetin administration for
the treatment of Huntington’s can be improved. Though both studies showed
improvements in Huntington’s symptoms as a result of quercetin administration,
the rate of administration was high: every day for 21 days in the former
study and twice a day for four days in the latter. By employing drug delivery
systems, the number of administrations could be reduced for similar or even
greater results in the realm of Huntington’s treatment.

4.4 Amyotrophic Lateral Sclerosis & Multiple Sclerosis


In addition to Alzheimer’s Disease, Parkinson’s Disease, and Huntington’s Dis-
ease, quercetin has been researched for its potential in treating other neurode-
generative disorders, including Amyotrophic Lateral Sclerosis (ALS) and Mul-
tiple Sclerosis (MS). ALS, also known as Lou Gehrig’s Disease, is characterized
by the degeneration of motor neurons throughout both the central and pe-
ripheral nervous systems, which can lead to muscle atrophy and overall bodily
weakness [HACC+ 17]. The pathogenesis of ALS stems from various factors:
oxidative stress, neuroinflammation, mitochondrial dysfunction, metal ion
accumulation due to excitotoxicity, and even gene mutations that can lead to
misfolded protein aggregations [TRM+ 15] [TSA18]. On the other hand, MS
is characterized by the degeneration of the neuronal myelin sheath, which is
responsible for protecting electrical impulses during transmission within the
central nervous system. The resulting slowed transmission can lead to
difficulties in simple movements and coordination [LH11]. The pathogenesis
of MS is primarily autoimmune, with degeneration mistakenly inflicted by the
immune system, but oxidative stress and neuroinflammation can play a role in
furthering the progression of the disorder [FBDB+ 09]. In attempts to treat
ALS and MS, quercetin has been employed for its free-radical scavenging,
anti-inflammatory, metal ion chelating, and neuroprotective properties. Bhatia et
al. administered quercetin in vitro and measured significant inhibitory effects
on SOD1 fibril aggregation with increasing concentrations of quercetin over a
30-hour period. Inhibitory effects were measured visually using TEM imagery
and numerically by ThT fluorescence [BMSD20]. As SOD1 fibril aggregation
can be a genetic factor in the pathogenesis of ALS, these inhibitory effects
could be useful in treating the disorder. Hendriks et al. administered quercetin
in vitro to isolated myelin taken from the brain tissue of adult mice and let
RAW 264.7 cells phagocytose the myelin for 90 minutes before adding
dihydrorhodamine 123 (DHR). As DHR can be used to detect the formation of
reactive oxygen species (ROS), the researchers were able to measure that the
myelin treated with quercetin showed significant reductions in ROS
production during myelin phagocytosis compared to its untreated
counterpart [HDVVDP+ 03]. However, while there have been in vitro studies
of quercetin-based treatments for ALS and MS, few in vivo studies have been
conducted, potentially due to bioavailability struggles and underwhelming results.
While quercetin’s potential for ALS and MS treatment is evident, the use of drug
delivery systems could enhance quercetin’s bioavailability and its effectiveness
for ALS and MS treatments in vivo.

5 Discussion
Quercetin, a dietary flavonoid, has been heavily researched for its potential
in the treatment of neurodegenerative disorders due to its antioxidative,
anti-inflammatory, metal-ion chelating, and neuroprotective properties. However,
quercetin has poor bioavailability due to its lack of aqueous solubility,
poor chemical stability, and unfavorable absorption profile. As a result,
free quercetin struggles to pass through the blood-brain barrier and reach
neurons and other biomolecules, limiting the flavonoid’s effect
on the progression of neurodegenerative disorders. However, the rise of
targeted and controlled-release drug delivery applications provides a remedy for
the struggles quercetin faces. Through the use of polymer-based nanoparticles,
lipid-based nanoparticles, and metallic nanoparticles, among other drug
delivery systems, the bioavailability struggles of quercetin can be addressed,
allowing the flavonoid to reach and have an impact on neuronal degradation within
the nervous system. Additionally, the targeting properties of drug carriers, in
concert with their controlled release manipulability, further enhance quercetin’s
abilities, ultimately allowing for more efficient treatments of neurodegenerative
disorders.
The two most significant benefits that quercetin gains from drug delivery
encapsulations are improved bioavailability and controlled release properties.
Modern research has focused primarily on polymer-based nanoparticles, including
PLA and PLGA nanoparticles, as well as chitosans, for quercetin’s
neurodegenerative applications. However, while polymer-based nanoparticles,
particularly PLA and PLGA, can be easily engineered chemically by attaching ligands
and other biomolecules for targeted quercetin delivery, their hydrophobicity
leads to challenges with surface interactions in aqueous environments,
resulting in difficulties getting the nanoparticle to release the drug at the
desired rate. Notably, although PLGA can be engineered specifically to increase
the nanoparticle’s hydrophilicity by increasing the ratio of glycolyl groups to
lactyl groups, the polymeric nanoparticle still falls short of the level of surface
interactions that lipid-based nanoparticles possess. On the other hand, liposomes
and micelles, which rose in popularity after lipid-based carriers were used in
COVID-19 vaccines, have shown more
promise in quercetin delivery for the treatment of neurodegenerative disorders.
Despite prior research being much more limited compared to polymer-based
nanoparticles, lipid-based nanoparticles are the better candidate due to their
hydrophilic surfaces in addition to their surface engineerability through the
attachment of biomolecules for targeting applications. Though they struggle
to solve quercetin’s chemical instability, the problem can be remedied
by PEGylation, which, in turn, can also further enhance the
nanoparticle’s bioavailability and controlled release applications. Furthermore,
PEGylation can also improve the likelihood of lipid-based nanoparticles
crossing the blood-brain barrier, an essential step in facilitating quercetin’s
interactions with neurons and other biomolecules. Ultimately, future research
should investigate the use of PEGylated nanoparticles, particularly micelles
and liposomes, to deliver quercetin. Additionally, while the drug delivery of
quercetin for the treatment of neurodegenerative disorders is still a relatively
new concept with a very limited research base, especially in vivo, the topic has
shown considerable potential and is likely to become an active field in the
coming years.

References
[AAA+ 18] L. A. Abdulkhaleq, M. A. Assi, Rasedee Abdullah, M. Zamri-
Saad, Y. H. Taufiq-Yap, and M. N. M. Hezmee. The crucial
roles of inflammatory mediators in inflammation: A review.
Veterinary World, 2018.

[AAAH86] Nouf K. Alaqeel, Mona H. AlSheikh, and Mohammed T. Al-Hariri.
Quercetin nanoemulsion ameliorates neuronal dysfunction in experimental
alzheimer’s disease model. Antioxidants,
1986.

[AABM+ 19] Marjan Abri Aghdam, Roya Bagheri, Jafar Mosafer, Be-
hzad Baradaran, Mahmound Hashemzaei, Amir Baghban-
zadeh, Miguel de la Guardia, and Ahad Mokhtarzadeh. Re-
cent advances on thermosensitive and ph-sensitive liposomes
employed in controlled release. Journal of Controlled Release,
2019.
[ABF+ 00] Tajrena Alexi, Cesario V. Borlongan, Richard L. M. Faull,
Chris E. Williams, Ross G. Clark, Peter D. Gluckman, and
Paul E. Hughes. Neuroprotective strategies for basal ganglia
degeneration: Parkinson’s and huntington’s diseases. Progress
in Neurobiology, 2000.

[ADAP16] Alexander Victor Anand David, Radhakrishnan Arulmoli, and
Subramani Parasuraman. Overviews of biological importance
of quercetin: A bioactive flavonoid. Pharmacognosy Reviews,
2016.

[ADB+ 89] Igor B. Afanas’ev, Anatolii I. Dorozhko, Aleksander V.
Brodskii, Vladimir A. Kostyuk, and Alla I. Potapovitch. Chelating
and free radical scavenging mechanisms of inhibitory action of
rutin and quercetin in lipid peroxidation. Biochemical Pharma-
cology, 1989.
[AEG+ 23] Leila Ahmadi, Nahid Eskandari, Mustafa Ghanadian, Mahshid
Rahmati, Neda Kasiri, Masoud Etamadifar, Mohadeseh
Toghyani, and Fereshteh Alsahebfosoul. The immunomodu-
latory aspect of quercetin penta acetate on th17 cells prolifer-
ation and gene expression in multiple sclerosis. Cell Journal
(Yakhteh), 2023.

[Aka01] Takaaki Akaike. Role of free radicals in viral pathogenesis and
mutation. Reviews in Medical Virology, 2001.

[ARSD+ 13] Abolfazl Akbarzadeh, Rogaie Rezaei-Sadabady, Soodabeh
Davaran, Sang Woo Joo, Nosratollah Zarghami, Younes Han-
ifehpour, Mohammad Samiei, Mohammad Kouhi, and Kazem
Nejati-Koshki. Liposome: classification, preparation, and ap-
plications. Nanoscale Research Letters, 2013.

[Aru98] Okezie I. Aruoma. Free radicals, oxidative stress, and
antioxidants in human health and disease. Journal of the American
Oil Chemists’ Society, 1998.

[AUID+ 07] Orhan Aktas, Oliver Ullrich, Carmen Infante-Duarte, Robert
Nitsch, and Frauke Zipp. Neuronal damage in brain inflamma-
tion. Archives of Neurology, 2007.

[BBW80] J. Baumann, F. V. Bruchhausen, and G. Wurm. Flavonoids
and related compounds as inhibitors of arachidonic acid perox-
idation. Prostaglandins, 1980.

[BGP+ 20] Hemanth Kumar Boyina, Sree Lakshmi Geethakhrishnan,
Swetha Panuganti, Kiran Gangarapu, Krishna Prasad De-
varakonda, Vasudha Bakshi, and Sandhya Rani Guggilla. In sil-
ico and in vivo studies on quercetin as potential anti-parkinson
agent. GeNeDis 2018, 2020.

[BK01] D. Allan Butterfield and Jaroslaw Kanski. Brain protein
oxidation in age-related neurodegenerative disorders that are
associated with aggregated proteins. Mechanisms of Ageing and
Development, 2001.

[BL22] Al Borhan Bayazid and Beong Ou Lim. Quercetin is an active
agent in berries against neurodegenerative diseases progression
through modulation of nrf2/ho1. Nutrients, 2022.
[BMSD20] Nidhi K. Bhatia, Priya Modi, Shilpa Sharma, and Shashank
Deep. Quercetin and baicalein act as potent antiamyloidogenic
and fibril destabilizing agents for sod1 fibrils. ACS Chemical
Neuroscience, 2020.

[BMTC22] M. Boseti, A. Massè, E. Tobin, and M. Cannas. Silver coated
materials for external fixation devices: in vitro biocompatibility
and genotoxicity. Biomaterials, 2022.

[BSB+ 18] Ruma Baksi, Pratap Singh, Devendra, Swapnil P. Borse, Rita
Rana, Vipin Sharma, and Manish Nivsarkar. In vitro and in
vivo anticancer efficacy potential of quercetin loaded polymeric
nanoparticles. Biomedicine & Pharmacotherapy, 2018.

[BWU12] Leonid Breydo, Jessica W. Wu, and Vladimir N. Uversky.
α-synuclein misfolding and parkinson’s disease. Biochimica et
Biophysica Acta (BBA) - Molecular Basis of Disease, 2012.

[CCE+ 03] Maria Careri, Claudio Corradini, Lisa Elviri, Isabella Nicoletti,
and Ingrid Zagnoni. Direct hplc analysis of quercetin and trans-
resveratrol in red wine, grape, and winemaking byproducts.
Journal of Agricultural and Food Chemistry, 2003.

[CFD+ 13] X. Cai, Z. Fang, J. Dou, A. Yu, and G. Zhai. Bioavailability of
quercetin: Problems and promises. Bentham Science Publish-
ers, 2013.

[Chi10] Salvatore Chirumbolo. The role of quercetin, flavonols and
flavones in modulating inflammatory cell function. Bentham
Science Publishers, 2010.

[CSD+ 13] Joy Chakraborty, Raghavendra Singh, Debashis Dutta, Amit
Naskar, Usha Rajamma, and Kochupurackal P. Mohanaku-
mar. Quercetin improves behavioral deficiencies, restores as-
trocytes and microglia, and reduces serotonin metabolism in 3-
nitropropionic acid-induced rat model of huntington’s disease.
CNS Neuroscience & Therapeutics, 2013.
[DCVDL+ 22] Francesca Di Cristo, Anna Valentino, Ilenia De Luca, Gi-
anfranco Peluso, Irene Bonadies, Anna Calarco, and Anna
Di Salle. Pla nanofibers for microenvironmental-responsive
quercetin release in local periodontal treatment. Molecules,
2022.

[DJ96] D. Dreher and A. F. Junod. Role of oxygen free radicals in
cancer development. European Journal of Cancer, 1996.

[DJ18] K. Divya and M. S. Jisha. Chitosan nanoparticles preparation
and applications. Environmental Chemistry Letters, 2018.

[DJM13] K. M. Denny Joseph and Muralidhara. Enhanced
neuroprotective effect of fish oil in combination with quercetin against
3-nitropropionic acid induced oxidative stress in rat brain.
Progress in Neuro-Psychopharmacology and Biological Psychi-
atry, 2013.

[DMCD09] J. Mark Davis, E. Angela Murphy, Martin D. Carmichael, and
Ben Davis. Quercetin increases brain and muscle mitochon-
drial biogenesis and exercise tolerance. American Journal of
Physiology-Regulatory, Integrative and Comparative Physiology,
2009.
[DP03] William Dauer and Serge Przedborski. Parkinson’s disease:
Mechanisms and models. Neuron, 2003.
[EAPA+ 17] Aziz Eftekhari, Elham Ahmadian, Vahid Panahi-Azar, Hedayat
Hosseini, Mahnaz Tabibiazar, and Solmaz Maleki Dizaj. Hep-
atoprotective and free radical scavenging actions of quercetin
nanoparticles on aflatoxin b1-induced liver damage: in vitro/in
vivo studies. Artificial Cells, Nanomedicine, and Biotechnology,
2017.

[ENM+ 11] Ana M. Enciu, Mihnea I. Nicolescu, Catalin G. Manole,
Dafin F. Mureşanu, Laurenţiu M. Popescu, and Bogdan O.
Popescu. Neuroregeneration in neurodegenerative disorders.
BMC Neurology, 2011.

[FBDB+ 09] Josa M. Frischer, Stephan Bramow, Assunta Dal-Bianco,
Claudia F. Lucchinetti, Helmut Rauschka, Manfred Schmidbauer,
Henning Laursen, Per Soelberg Sorensen, and Hans Lassmann.
The relation between inflammation and neurodegeneration in
multiple sclerosis brains. Brain, 2009.

[FBS+ 07] Jeferson L. Franco, Hugo C. Braga, James Stringari, Fabina C.
Missau, Thais Posser, Beatriz G. Mendes, Rodrigo B. Leal,
Adair R. S. Santos, Alcir L. Dafre, Moacir G. Pizzolatti, and
Marcelo Farina. Mercurial-induced hydrogen peroxide gen-
eration in mouse brain mitochondria: Protective effects of
quercetin. Chemical Research in Toxicology, 2007.
[GBS+ 15] C. Gardi, K Bauerova, B. Stringa, V. Kuncirova, L. Slovak,
S. Ponist, F. Drafi, L. Bezakova, I. Tedesco, A. Acquaviva,
S. Bilotto, and G. L. Russo. Quercetin reduced inflamma-
tion and increased antioxidant defense in rat adjuvant arthritis.
Archives of Biochemistry and Biophysics, 2015.
[GLL13] Guang Hui Gao, Yi Li, and Doo Sung Lee. Environmental ph-
sensitive polymeric micelles for cancer diagnosis and targeted
therapy. Journal of Controlled Release, 2013.
[HACC+ 17] Orla Hardiman, Ammar Al-Chalabi, Adriano Chio, Emma M.
Corr, Giancarlo Logroscino, Wim Robberecht, Pamela J. Shaw,
Zachary Simmons, and Leonard H. van den Berg. Amyotrophic
lateral sclerosis. Nature Reviews Disease Primers, 2017.

[HCS+ 18] Zhi-Qiang Haung, Pan Chen, Wei-Wei Su, Yong-Gang Wang,
Hao Wu, Wei Peng, and Pei-Bo Li. Antioxidant activity and
hepatoprotective potential of quercetin 7-rhamnoside in vitro
and in vivo. Molecules, 2018.

[HDFY+ 17] Mahmoud Hashemzaei, Amin Delarami Far, Arezoo Yari, Her-
avi Heravi, Kaveh Tabrizian, Seyed Mohammad Taghdisi, Sar-
venaz Ekhtiari Sadegh, Konstantinos Tsarouhas, Dimitrios
Kouretas, George Tzanakakis, Dragana Nikitovic, Nikita Yure-
vich Anisimov, Demetrios A. Spandidos, Aristides M. Tsat-
sakis, and Ramin Rezaee. Anticancer and apoptosis-inducing
effects of quercetin in vitro and in vivo. Oncology Reports, 2017.

[HDVVDP+ 03] Jerome J. A. Hendriks, Helga E. De Vries, Susanne M. A. Van
Der Pol, Timo K. Van Den Berg, Eric A. F. Van Tol, and
Christine D. Dijkstra. Flavonoids inhibit myelin phagocytosis
by macrophages; a structure–activity relationship study. Bio-
chemical Pharmacology, 2003.

[HZL+ 20] Yu Hai, Yuanxiao Zhang, Yingzhi Liang, Xiaoyu Ma, Xiao Qi,
Jianbo Xiao, Weiming Xue, Yane Luo, and Yianli Yue. Ad-
vance on the absorption, metabolism, and efficacy exertion of
quercetin and its important derivatives. Food Frontiers, 2020.

[IZS+ 20] Humaira Idrees, Syed Zohaib Javaid Zaidi, Aneela Sabir,
Rafi Ullah Khan, Xunli Zhang, and Sammer-ul Hassan. A re-
view of biodegradable natural polymer-based nanoparticles for
drug delivery applications. Nanomaterials, 2020.

[JRM+ 08] Tapan K. Jain, Maram K. Reddy, Marco A. Morales,
Diandra L. Leslie-Pelecky, and Vinod Labhasetwar. Biodistribution,
clearance, and biocompatibility of iron oxide magnetic nanopar-
ticles in rats. Molecular Pharmaceutics, 2008.

[KAAS16] Hena Khanam, Abad Ali, Mohd Asif, and Shamsuzzaman. Neu-
rodegenerative diseases linked to misfolded proteins and their
therapeutic approaches: A review. European Journal of Medic-
inal Chemistry, 2016.
[KG99] Rao Manjeet K and B. Ghosh. Quercetin inhibits lps-induced
nitric oxide and tumor necrosis factor-α production in murine
macrophages. International Journal of Immunopharmacology,
1999.
[KMP+ 13] S. S. Karuppagounder, S. K. Madathil, M. Pandey, R. Haobam,
U. Rajamma, and K. P. Mohanakumar. Quercetin up-
regulates mitochondrial complex-i activity to protect against
programmed cell death in rotenone model of parkinson’s dis-
ease in rats. Neuroscience, 2013.
[KMZM03] Do Kyung Kim, Maria Mikhaylova, Yu Zhang, and Mamoun
Muhammed. Protective coating of superparamagnetic iron ox-
ide nanoparticles. Chemistry of Materials, 2003.

[KTMC22] Kevser Kandemir, Merve Tomas, David Julian McClements,
and Esra Capanoglu. Recent advances on the improvement of
quercetin bioavailability. Trends in Food Science & Technology,
2022.

[KVM+ 11] Robert Kleemann, Lars Verschuren, Martine Morrison,
Susanne Zadelaar, Marjan J. Van Erk, Peter Y. Wielinga, and
Teake Kooistra. Anti-inflammatory, anti-proliferative and anti-
atherosclerotic effects of quercetin in human in vitro and in vivo
models. Atherosclerosis, 2011.

[KYSO07] Koji Kawata, Hiroyuki Yokoo, Ryuhei Shimazaki, and Satoshi
Okabe. Classification of heavy-metal toxicity by human dna
microarray analysis. Environmental Science & Technology, 2007.

[LBBD17] Sonia Losada-Barreiro and Carlos Bravo-Dı́az. Free radicals
and polyphenols: The redox chemistry of neurodegenerative
diseases. European Journal of Medicinal Chemistry, 2017.

[LBS+ 18] Marija Lesjak, Ivana Beara, Nataša Simin, Diandra Pintać,
Tatjana Majkić, Kristina Bekvalac, Dejan Orčić, and Neda
Mimica-Dukić. Antioxidant and anti-inflammatory activities
of quercetin and its derivatives. Journal of Functional Foods,
2018.

[LCK16] Zhixun Luo, A. W. Jr. Castleman, and Shiv N. Khanna.
Reactivity of metal clusters. Chemical Reviews, 2016.

[LH11] Ingrid Loma and Rock Heyman. Multiple sclerosis:
Pathogenesis and treatment. Current Neuropharmacology, 2011.

[LHS18] C. A. Lane, J. Hardy, and J. M. Schott. Alzheimer’s disease.
European Journal of Neurology, 2018.

[LJS+ 21] Mengyi Li, Shuai Jiang, Johanna Simon, Marie-Lusie Frey,
Manfred Wagner, Volker Mailänder, Daniel Crespy, and Katha-
rina Landfester. Brush conformation of polyethylene glycol de-
termines the stealth effect of nanocarriers in the low protein
adsorption regime. Nano Letters, 2021.
[LLL+ 17] Li Lv, Chunxia Liu, Zhengrong Li, Fangming Song, Guocheng
Li, and Xingzhen Huang. Pharmacokinetics of quercetin-loaded
methoxy poly(ethylene glycol)-b-poly(l-lactic acid) micelle after
oral administration in rats. BioMed Research International,
2017.
[LS13] Jingyan Li and Cristina Sabliov. Pla/plga nanoparticles for
delivery of drugs across the blood-brain barrier. Nanotechnology
Reviews, 2013.

[LW22] Wing-Fu Lai and Wing-Tak Wong. Design and optimization
of quercetin-based functional foods. Critical Reviews in Food
Science and Nutrition, 2022.

[LZJS17] Ting Liu, Lingyun Zhang, Donghyun Joo, and Shao-Cong Sun.
Nf-κb signaling in inflammation. Signal Transduction and Tar-
geted Therapy, 2017.

[MCM+ 98] Christine Morand, Vanessa Crespy, Claudine Manach,
Catherine Besson, Christian Demigné, and Christian Rémésy. Plasma
metabolites of quercetin and their antioxidant properties.
American Journal of Physiology-Regulatory, Integrative and
Comparative Physiology, 1998.

[Men79] Fredric M. Menger. The structure of micelles. Accounts of
Chemical Research, 1979.

[MKMH15] William G. McMaster, Annet Kirabo, Meena S. Madhur, and
David G. Harrison. Inflammation, immunity, and hypertensive
end-organ damage. Circulation Research, 2015.

[ML97] Simon R. J. Maxwell and Gregory Y. H. Lip. Free radicals
and antioxidants in cardiovascular disease. British Journal of
Clinical Pharmacology, 1997.

[MMD+ 97] Claudine Manach, Christine Morand, Christian Demigné, Odile
Texier, Françoise Régérat, and Rémésy Christian. Bioavailabil-
ity of rutin and quercetin in rats. FEBS Letters, 1997.

[MMM+ 18] Piyasi Mukhopadhyay, Subhajit Maity, Sudipto Mandal,
Abhay Sankar Chakraborti, A. K. Prajapati, and P. P. Kundu.
Preparation, characterization and in vivo evaluation of ph sen-
sitive, safe quercetin-succinylated chitosan-alginate core-shell-
corona nanoparticle for diabetes treatment. Carbohydrate Poly-
mers, 2018.

[MPSS+ 17] Lina Clara Gayoso e Ibiapina Moreno, Elena Puerta, José Ed-
uardo Suárez-Santiago, Nereide Stela Santos-Magalhães,
Maria J. Ramirez, and Juan M. Irache. Effect of the oral ad-
ministration of nanoencapsulated quercetin on a mouse model
of alzheimer’s disease. International Journal of Pharmaceutics,
2017.

[MS19] Rubin Thapa Magar and Jae Kyung Sohng. A review on struc-
ture, modifications and structure-activity relation of quercetin
and its derivatives. Korean Society for Microbiology and
Biotechnology, 2019.

[MSWI03] A. C. Maritim, R. A. Sanders, and J. B Watkins III. Diabetes,
oxidative stress, and antioxidants: A review. Journal of Bio-
chemical and Molecular Toxicology, 2003.

[N/A16] N/A. Global, regional, and national incidence, prevalence,
and years lived with disability for 310 diseases and injuries,
1990–2015: a systematic analysis for the global burden of dis-
ease study 2015. Lancet, 2016.

[NKE+ 18] Rezvan Enteshari Najafabadi, Nasrin Kazemipour, Abolghasem
Esmaeili, Siamak Beheshti, and Saeed Nazifi. Using superpara-
magnetic iron oxide nanoparticles to enhance bioavailability of
quercetin in the intact rat brain. BMC Pharmacology & Toxi-
cology, 2018.

[PGL+ 20] R. G. R. Pinheiro, A. Granhja, J. A. Loureiro, M. C. Pereira,
M. Pinheiro, A. R. Neves, and S. Reis. Quercetin lipid nanopar-
ticles functionalized with transferrin for alzheimer’s disease. Eu-
ropean Journal of Pharmaceutical Sciences, 2020.

[PHH99] Satu S. Pekkarinen, I. Marina Heinonen, and Anu I. Hopia.
Flavonoids quercetin, myricetin, kaemferol and (+)-catechin as
antioxidants in methyl linoleate. Journal of the Science of Food
and Agriculture, 1999.

[PKVP10] J. Pourova, M. Kottova, M. Voprsalova, and M. Pour. Reactive
oxygen and nitrogen species in normal physiological processes.
Acta Physiologica, 2010.

[PQF+ 12] Hector Pool, David Quintanar, Juan De Dios Figueroa, Camila
Marinho Mano, J. Etelvino H. Bechara, Luis A. Godı́nez, and
Sandra Mendoza. Antioxidant effects of quercetin and catechin
encapsulated into plga nanoparticles. Journal of Nanomateri-
als, 2012.
[PRY+ 19] Sonia Podvin, Holly T. Reardon, Katrina Yin, Charles Mosier,
and Vivian Hook. Multiple clinical features of huntington’s
disease correlate with mutant htt gene cag repeat lengths and
neurodegeneration. Journal of Neurology, 2019.

[PTK+ 20] Gopal Patel, Neeraj Singh Thakur, Varun Kushwah, Mahesh D.
Patil, Shivraj Hariram Nile, Sanyog Jain, Uttam Chand Baner-
jee, and Guoyin Kai. Liposomal delivery of mycophenolic acid
with quercetin for improved breast cancer therapy in sd rats.
Frontiers in Bioengineering and Biotechnology, 2020.
[PVK+ 22] Paraskevi Papakyriakopoulou, Nikolaos Velidakis, Elina Khat-
tab, Georgia Valsami, Ioannis Korakianitis, and Nikolaos Pe
Kadoglou. Potential pharmaceutical applications of quercetin
in cardiovascular diseases. Pharmaceuticals, 2022.

[PWD+ 04] Jayanth Panyam, Deborah Williams, Alekha Dash, Diandra
Leslie-Pelecky, and Vinod Labhasetwar. Solid-state solubility
influences encapsulation and release of hydrophobic drugs from
plga/pla nanoparticles. Journal of Pharmaceutical Sciences,
2004.

[PWS+ 08] Aroonsri Priprem, Jintanaporn Watanatorn, Saengrawee
Sutthiparinyanont, Wathita Phachonpai, and Supaporn
Muchimapura. Anxiety and cognitive effects of quercetin liposomes in
rats. Nanomedicine: Nanotechnology, Biology and Medicine,
2008.

[QGY+ 22] Xueju Qi, Cong Gao, Chuanjin Yin, Junting Fan, Xiaochen
Wu, Guohu Di, Jing Wang, and Chuanlong Guo. Development
of quercetin-loaded pvcl–pva–peg micelles and application in
inhibiting tumor angiogenesis through the pi3k/akt/vegf path-
way. Toxicology and Applied Pharmacology, 2022.

[QZW+ 16] Waseem Akhtar Qureshi, Ruifang Zhao, Hai Wang, Yanping
Ding, Ayesha Ihsan, Ayeesha Mujeeb, Guangjun Nie, and
Yuliang Zhao. Co-delivery of doxorubicin and quercetin via
mpeg–plga copolymer assembly for synergistic anti-tumor effi-
cacy and reducing cardio-toxicity. Science Bulletin, 2016.

[RBS+ 00] Gibanananda Ray, Sanjay Batra, Nootan Kumar Shukla,
Deo Suryanarayan, Vinod Raina, Seetharaman Ashok, and
Syed Akhtar Husain. Lipid peroxidation, free radical produc-
tion and antioxidant status in breast cancer. Breast Cancer
Research and Treatment, 2000.

[Ril94] P. A. Riley. Free radicals in biology: Oxidative stress and the


effects of ionizing radiation. International Journal of Radiation
Biology, 1994.
[RMS+ 20] Rehab Ahmed Rifaai, Sahar Ahmed Mokhemer, Entesar Ali
Saber, Seham A Abd El-Aleem, and Nashwa Fathy Gamal El-
Tahway. Neuroprotective effect of quercetin nanoparticles: A
possible prophylactic and therapeutic role in alzheimer’s dis-
ease. Journal of Chemical Neuroanatomy, 2020.
[RRD14] R. Ravichandran, M. Rajendran, and D. Devapiriam. Antiox-
idant study of quercetin and their metal complex and deter-
mination of stability constant by spectrophotometry method.
Food Chemistry, 2014.

[SA07] Bushra Sultana and Farooq Anwar. Flavonols (kaempeferol,


quercetin, myricetin) contents of selected fruits, vegetables and
medicinal plants. Food Chemistry, 2007.

[SJC+ 21] Stephen Safe, Arul Jayaraman, Robert S. Chapkin, Mar-


cell Howard, Kumaravel Mohankumar, and Rupesh Shrestha.
Flavonoids: structure–function and mechanisms of action and
opportunities for drug development. Toxicological Research,
2021.

[SLL+ 21] Yumei Sun, Chang Li, Zhonghua Li, Aishao Shangguan, Jinhe
Jiang, Wei Zheng, Shujun Zhang, and Qigai He. Quercetin as
an antiviral agent inhibits the pseudorabies virus in vitro and
in vivo. Virus Research, 2021.

26

103
[SLZ+ 16] Dongdong Sun, Nuan Li, Weiwei Zhang, Zhiwei Zhao, Zhipeng
Mou, Donghui Huang, Jie Liu, and Weiyun Wang. Design of
plga-functionalized quercetin nanoparticles for potential use in
alzheimer’s disease. Colloids and Surfaces B: Biointerfaces,
2016.

[SM13] Rajat Sandhir and Arpit Mehrotra. Quercetin supplementa-


tion is effective in improving mitochondrial dysfunctions in-
duced by 3-nitropropionic acid: Implications in huntington’s
disease. Biochimica et Biophysica Acta (BBA) - Molecular Ba-
sis of Disease, 2013.

[SNS+ 13] Ivan M. Savic, Vesna D. Nikolic, Ivana M. Savic, Ljubisa B.


Nikolic, and Mihajlo Z. Stankovic. Development and validation
of a new rp-hplc method for determination of quercetin in green
tea. Journal of Analytical Chemistry, 2013.

[SS11] Philip Seeman and Neil Seeman. Alzheimer’s disease: β-


amyloid plaque formation in human brain. Synapse, 2011.

[Sta90] Earl R. Stadtman. Metal ion-catalyzed oxidation of proteins:


Biochemical mechanism and biological consequences. Free Rad-
ical Biology and Medicine, 1990.

[SVKP18] Prabhu Srinivasan, S. Vijayakumar, Swaminathan Kothandara-


man, and Manogar Palani. Anti-diabetic activity of quercetin
extracted from phyllanthus emblica l. fruit: In silico and in vivo
approaches. Journal of Pharmaceutical Analysis, 2018.

[SWK08] Ioannis A. Sogias, Adrian C. Williams, and Vitaliy V. Khuto-


ryanskiy. Why is chitosan mucoadhesive? Biomacromolecules,
2008.
[SWM+ 12] Napatr Sriraksa, Jintanaporn Wattanathorn, Supaporn
Muchimapura, Somsak Tiamkao, Kamoltip Brown, and Kowit
Chaisiwamongkol. Cognitive-enhancing effect of quercetin
in a rat model of parkinson’s disease induced by 6-
hydroxydopamine. Evidence-Based Complementary and Alter-
native Medicine, 2012.

[TRM+ 15] Francesco Tafuri, Dario Ronchi, Francesca Magri, Giacomo P.


Comi, and Stefania Corti. Sod1 misplacing and mitochondrial
dysfunction in amyotrophic lateral sclerosis pathogenesis. Fron-
tiers in Cellular Neuroscience, 2015.
[TSA18] Jason R. Thonhoff, Ericka P. Simpson, and Stanley H. Appel.
Neuroinflammatory mechanisms in amyotrophic lateral sclero-
sis pathogenesis. Current Opinion in Neurology, 2018.

27

104
[VHS+ 11] Fabiana T. M. C. Vicentini, Tianyuan He, Yuan Shao, Maria
J. V. Fonesca, Waldiceu A. Verri, Gary J. Fisher, and Yiru
Xu. Quercetin inhibits uv irradiation-induced inflammatory cy-
tokine production in primary human keratinocytes by suppress-
ing nf-κb pathway. Journal of Dermatological Science, 2011.

[VLA+ 18] Konstantin P. Volcho, Sergey S. Laev, Ghulam M. Ashraf,


Gjumrakch Aliev, and Nariman F. Salakhutdinov. Application
of monoterpenoids and their derivatives for treatment of neu-
rodegenerative disorders. Bentham Science Publishers, 2018.

[VM19] Manjula Vinayak and Akhilendra K. Maurya. Quercetin loaded


nanoparticles in targeting cancer: Recent development. Anti-
Cancer Agents in Medicinal Chemistry, 2019.

[VRM+ 06] M. Valko, C. J. Rhodes, J. Moncolo, M. Izakovic, and


M. Mazur. Free radicals, metals and antioxidants in oxidative
stress-induced cancer. Chemico-Biological Interactions, 2006.
[WBM21] Susanne Wegmann, Jacek Biernat, and Eckhard Mandelkow.
A current view on tau protein phosphorylation in alzheimer’s
disease. Current Opinion in Neurobiology, 2021.

[WCZZ09] Hua Wei, Si-Xue Cheng, Xian-Zheng Zhang, and Ren-Xi


Zhuo. Thermo-sensitive polymeric micelles based on poly(n-
isopropylacrylamide) as drug carriers. Progress in Polymer Sci-
ence, 2009.

[WHH+ 21] Wen-Wen Wang, Ruiyu Han, Hai-Jun He, Jia Li, Si-Yan Chen,
Yingying Gu, and Chenglong Xie. Administration of quercetin
improves mitochondria quality control and protects the neurons
in 6-ohda-lesioned parkinson’s disease models. Aging, 2021.

[WLL+ 11] Kui Wang, Rui Liu, Jingyi Li, Jiali Mao, Yunlong Lei, Jin-
hua Wu, Jun Zeng, Tao Zhang, Hong Wu, Lijuan Chen, Can-
hua Huang, and Yuquan Wei. Quercetin induces protective
autophagy in gastric cancer cells: Involvement of akt-mtor-
and hypoxia-induced factor 1α-mediated signaling. Autophagy,
2011.

[WSM+ 16] Weiyou Wang, Cuixia Sun, Like Mao, Peihua Ma, Fuguo Liu,
Jie Yang, and Yanxiang Gao. The biological activities, chemi-
cal stability, metabolism and delivery systems of quercetin: A
review. Trends in Food Science & Technology, 2016.
[WSRE04] Robert J. Williams, Jeremy P. E. Spencer, and Catherine Rice-
Evans. Flavonoids: antioxidants or signalling molecules? Free
Radical Biology and Medicine, 2004.

28

105
[WYZ+ 18] Shengan Wang, Jiaying Yao, Bo Zhou, Maria T. Chaudry,
Mi Wang, Fenglin Xiao, Yao Li, and Wenzhe Yin. Bacterio-
static effect of quercetin as an antibiotic alternative in vivo and
its antibacterial mechanism in vitro. Journal of Food Protec-
tion, 2018.

[XWG+ 16] Li Xiang, Handong Wang, Yongyue Gao, Liwen Li, Chao Tang,
Guodao Wen, Youqing Yang, Zong Zhuang, Mengliang Zhou,
Lei Mao, and Youwu Fan. Quercetin induces mitochondrial
biogenesis in experimental traumatic brain injury via the pgc-
1α signaling pathway. American Journal of Translational Re-
search, 2016.
[ZLA+ 11] Feng Zhang, Emma Lees, Faheem Amin, Pilar Rivera Gil, Fang
Yang, Paul Mulvaney, and Wolfgang J. Parak. Polymer-coated
nanoparticles: A universal tool for biolabelling experiments.
Small, 2011.

[ZZM21] Hualu Zhou, Bingjing Zheng, and David Julian McClements.


In vitro gastrointestinal stability of lipophilic polyphenols is
dependent on their oil–water partitioning in emulsions: Studies
on curcumin, resveratrol, and quercetin. Journal of Agricultural
and Food Chemistry, 2021.

29

106
Art Therapy’s Effectiveness and its Role in
Treating Neurological Conditions

Simryn Patel
October 14, 2023

Abstract
This paper explores the extensive psychological and neurological effects
of art therapy. Additionally, it offers art therapy as a form of treatment
for individuals suffering from PTSD and Alzheimer’s disease. Both of
these conditions pose unique dangers; PTSD patients suffer from psycho-
logical trauma and Alzheimer’s patients suffer from degenerative activity
in the brain. A major factor that reinforces art therapy’s credibility is its
ability to help patients express nonverbal memories and the emotions associated
with them. Additionally, art therapy activates certain parts of the brain,
which can prove useful in treating PTSD and Alzheimer’s. This paper considers
the mechanisms and processes of art therapy along with its effects
to emphasize its potential in treating symptoms and improving overall
well-being.

1 Introduction
After a debilitating bus accident left Frida Kahlo in a full-body cast for three
months, she turned to art to pass the time and alleviate the pain. Once
she physically recovered, Kahlo completed many paintings that reflected her
traumatic experience. She said, “My painting carries with it the message of
pain” (Svoboda, 2022). Frida Kahlo is just one of many people who have used
art therapeutically to treat certain neurological conditions and improve
well-being.
Art therapy is a type of therapy that uses creative methods of expression
with the guidance of an art therapist. There are many types of art therapy,
but I will be focusing on visual arts therapy such as painting and drawing. Art
therapy has certain psychological effects on patients, such as strengthening a
mind-body connection and improving well-being. Art-making also activates the
hippocampus, amygdala, visual cortex, and prefrontal cortex during the creative
process, which can help treat certain neurological conditions such as PTSD and
Alzheimer’s.
∗ Advised by: Dr. Ellen Robertson of the University of Cambridge

In this paper, I will refer to specific neurological conditions to demonstrate
the positive psychological and neurological effects of art therapy. Different types
of neurological conditions, ranging from strokes to Alzheimer’s disease, affect
up to one billion people worldwide. An estimated 6.8 million people die every
year as a result of these neurological disorders. I focus especially on art ther-
apy’s effect on PTSD and Alzheimer’s disease, but I make references to other
neurological conditions as well.
Overall, art therapy is an effective method for combating neurological
disorders, including PTSD and Alzheimer’s. What makes art therapy effective
are the psychological and neurological changes that take place in an individual
during treatment. Psychological changes include the expression of emotions and
improved mood, among others. Neurological changes include changes to neural
connections and to the prefrontal cortex, among others. Together, these
psychological and neurological changes account for the success
of the emerging field of art therapy.

2 What is Art Therapy?


The use of art therapy dates back to the 1940s when Margaret Naumburg dis-
covered a connection between creativity and healing (Cuellar, n.d.). Since then,
art therapy has been a type of psychotherapy (treatment of conditions through
interaction and verbal communication) that uses various methods to encourage
self-expression and aid in clinical diagnosis (Art Therapy Definition - Google
Search, n.d.). The ultimate goal of art therapy is to improve self-esteem and
self-expression. Self-esteem can lead to better social
relationships, success, and improved mental and physical health (Blouin, 2022).
Similarly, self-expression allows patients to understand themselves and process
emotions, leading to improved mood, behavior, and cognition (The Power of
Self-Expression, 2022).
Different mediums and environments can encourage artistic expression. Com-
mon media used with artistic expression include painting, drawing, photography,
and clay sculpting (Hu et al., 2021). Other forms of artistic expression, such as
music therapy, dance therapy, and drama therapy exist as well, but they will be
less emphasized throughout this paper. Once a medium has been chosen or
recommended, the patient can work either one-on-one with an art therapist,
which provides a more intimate connection, or in a group environment, which
creates a community of people similar to one another. In either
environment, once the art-making process is complete, the patient verbally
interprets the artwork, with the support and guidance of the art therapist,
to explain the emotions they have depicted (Longe, 2016).
There are many techniques that art therapists utilize when working with
a patient, each for different purposes. Examples of some techniques include
blind drawing, memory painting, emotion drawing, and self-portraits, which all
aim to express the patients’ abstract feelings that can be challenging to convert
into words (Hu et al., 2021). Similarly, “Bilateral Art” is another technique
that integrates verbal and nonverbal processes by involving both left- and
right-hemisphere functions. In this technique, designed by McNamee (2004), the
patient uses both hands in an effort to stimulate memories and experiences that
are contained in both sides of the brain (Talwar, 2007).

2.1 Art Therapy vs Art as Therapy


This paper will focus on art therapy, but it is important to distinguish art ther-
apy from art as therapy. Art therapy is a guided session with an art therapist,
and the main purpose of the session is to complete certain activities that help
to cope with specific situations or conditions. Art as therapy, on the other
hand, can be considered a relaxing, everyday activity that is inadvertently
therapeutic. In other words, the main goal of art as therapy is not to target
or improve certain feelings; rather, it is a stress-relieving activity that may
produce satisfying outcomes as a by-product, similar to reading a book or
taking a walk (Resources, 2018).

3 Psychological Effects of Art Therapy


In terms of psychology, art therapy strives to express, interpret, and examine
hidden feelings without relying on only verbal communication from the patient.
In one study, a patient diagnosed with dementia who could no longer communicate
through words could still communicate effectively through drawings, which gave
access to his or her abstract cognition (Zaidel, 2005). In a different study,
Dr. Meekums examined women survivors of childhood sexual abuse who
were unable to articulate painful memories and experiences. The therapy groups
used a variety of creative arts therapies (CATs) and some verbal interventions,
all of which were facilitated by qualified art therapists. Dr. Meekums found
that art as a form of communication and expression provided an effective
alternative to normal conversation. For example, a patient stated, “‘I can
express myself . . . I couldn’t do it when I was a child when I was being
hurt. . .’” (Meekums, 1999, p. 256).
Compared with verbal communication, art can also provide easier access to
emotions. According to Czamanski-Cohen & Weihs (2016), emotions arise from a
process that combines interoceptive stimuli, produced within the body, and
somatosensory stimuli, perceived through touch, pain, temperature, and other
external factors. This information is then interpreted and given emotional
meaning. In art therapy, it is the engagement with the physical art materials
(somatosensory) along with the visual imagery when viewing artwork
(interoceptive) that can assist in revealing emotions. When a patient touches
or manipulates art materials, there are sensory
responses to pressure, vibration, and temperature. These responses due to sen-
sations in the joints, muscles, and skin while manipulating an object guide the
individual in understanding the object’s qualities - such as size, shape, and
weight (Czamanski-Cohen & Weihs, 2016). According to the Schachter-Singer
Two-Factor Theory of Emotion, once a person experiences physiological
arousal, they interpret the arousal in order to label it as an emotion (Yarwood,
n.d.). From a more neurological standpoint, sensory information is received by
the primary somatosensory cortex in the cerebral cortex. It is then transferred to
the amygdala, where the information is processed into emotions. Therefore, the
sensory information received during art therapy can lead to better access and
acknowledgment of previously blocked emotions (Czamanski-Cohen & Weihs,
2016).
Images produced during art therapy not only retrieve but also improve emotions
and attitudes. Holmes, Mathews, Dalgleish, and Mackintosh (2006) hypothesized
that images have the power to increase ratings of emotions and can have a more
positive effect than verbal processing. To test this hypothesis, the
researchers presented participants with numerous scenarios in which it was
initially ambiguous whether the outcome would be positive. The participants
were asked either to imagine these events or to listen to the same descriptions
while thinking about their meaning. Each event was in paragraph form, and the
first half of the paragraph began with a sentence that had a negative
connotation. For example, a beginning may be “You are at home alone watching
TV. You were dozing and suddenly woke up under the impression that you heard a
frightening noise and then realize. . . ”. The rest of the sentence was then
completed in both groups; one positive ending, for example, was “and then
realize with relief that it was your partner returning home.” The researchers
found that participants in the imagery group reported more positive effects of
the scenarios and rated the descriptions as more positive than their
counterparts in the verbal group did. This may have occurred because
participants in the verbal condition focused more on the negative components of
the paragraphs (Holmes et al., 2006). So, images produced during art therapy
may foster positive attitudes and emotions.
Visual communication (patient expression through artwork and images) also
has a greater effect on the patient’s memories and helps retrieve them, as
evidenced by the picture superiority effect. In Paivio’s 1973 study, which
explored the effect of pictures versus words on memory, participants were
presented with pairs of words, pairs of pictures, or pairs of one word and one
picture. Then, participants were tested on their memory for the previously
presented stimuli. Results showed better memory recall for the pairs
with pictures than for those with words alone. According to Paivio’s “dual
coding” theory, images hold more power than words because pictures generate
both a visual and a verbal response, whereas words are less likely to generate
images for the participant (Paivio & Csapo, 1973). In any type of therapy,
accessing memories is important for expressing previous experiences. Because
Paivio’s study shows that pictures are useful for accessing memories, the
pictures produced during art therapy can facilitate a better expressive
experience for the patient.
Another example of the picture superiority effect comes from patients with
Alzheimer’s disease who were unable to recall memories of loved ones when
hearing their names but were able to recognize them when presented with a
picture (Ally et al., 2009). In the same study, healthy adults (controls),
patients diagnosed with MCI (mild cognitive impairment that results in memory
and thinking problems due to old age), and patients diagnosed with mild
Alzheimer’s disease were assessed to measure memory for pictures versus words.
In each group, pictures yielded higher recognition accuracy than words, meaning
that participants could recall pictures more accurately when presented with
stimuli. This supports the picture superiority effect (Ally et al., 2009).
Given these results, the use of pictures for communication and treatment in the
field of art therapy can be crucial, especially for trauma and dementia
patients, because of the patient’s ability to recall certain memories through
art.

Figure 1: Picture Superiority Effect Study

Although images produced during the art-making phase are important, combining
them with an examination of the artwork for any hidden emotions or
messages can strengthen a mind-body connection (Cuellar, 2007). The mind-body
connection is a concept that suggests that processes of the mind, such as
thinking and feeling, are rooted in one’s sensory and motor experiences. Art
utilizes this connection because art-making itself is sensory (it induces body
sensations and emotions), while interpreting artwork requires thinking and
emotions (Czamanski-Cohen & Weihs, 2016). Numerous studies provide evidence for
the mind-body connection, one of them focusing on the reduction of cortisol
levels and its connection to participants’ responses following visual art
making. The “mind” in this case is the response after the process, and the
“body” is the cortisol level of the participants. Cortisol is most commonly
known as the body’s stress hormone. In the study, 39 participants provided
saliva samples to determine cortisol levels before and after making art. During
the art-making process, participants had artistic liberty and were not confined
to a specific subject or material. At the end of the session, participants
provided written responses about their experiences and provided another saliva
sample. As shown in Figure 2, average cortisol levels after art-making were
significantly lower than before art-making. This result matched the
participants’ written responses, as most stated they felt relaxed, relieved,
excited, and fulfilled. The mind-body connection shown in this study supports
the idea that art therapy affects attitude, well-being, and emotions (mind),
which can have an impact on health (body), and vice versa (Kaimal, Ray, &
Muniz, 2016).

Figure 2: Cortisol Levels Before and After Art Making
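
The before-and-after design described above lends itself to a paired comparison, in which each participant's post-session value is compared with their own pre-session value. The following is a minimal Python sketch of that idea. It assumes a paired t statistic is a reasonable summary, which is not necessarily the analysis Kaimal, Ray, & Muniz (2016) performed. The function name paired_t_statistic is hypothetical, and the cortisol values are made-up placeholders, not data from the study.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (mean_difference, t_statistic) for paired measurements.

    Differences are computed as after - before, so a negative mean
    difference means the measurement dropped after the session.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if len(before) < 2:
        raise ValueError("at least two paired measurements are required")

    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff, mean_diff / standard_error


# Hypothetical placeholder values (NOT data from the study).
cortisol_before = [10.0, 12.0, 9.0, 11.0, 13.0]
cortisol_after = [8.0, 11.0, 9.0, 8.0, 10.0]

mean_diff, t_stat = paired_t_statistic(cortisol_before, cortisol_after)
print(f"mean change: {mean_diff:.2f}, paired t: {t_stat:.2f}")
```

A paired design is used in this sketch because each participant serves as their own control, so differences between individuals in baseline cortisol do not obscure the change within each person.
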
Additionally, colors play a large role in identifying the patient’s mood and
mental health, and they also transform the patient’s therapeutic experience.
In multiple studies, different people with varying emotional states interacted
with color in different ways. In one study, Wadeson (1971) noticed that people
diagnosed with depression used significantly less color in their paintings than
other patients. In another study, a girl with leukemia who was feeling unwell
used a great deal of red and black, which indicated an overflow of
negative feelings. That patient died six months later (Cotton, 1985). These
studies suggest that color can reflect the mental state of the patient.

Graham (1998) proposes a chemical explanation for a patient’s use of dark
and light colors based on melatonin and serotonin levels in the body. During the
day, when there is typically much sunlight and colors, the hypothalamus releases
the stimulant serotonin. On the other hand, at night when there are tones of
gray and black, the hypothalamus releases a depressant called melatonin to help
induce sleep. So, bright colors can be associated with liveliness and wakefulness,
while darker colors represent gloom and feelings of melancholy (Graham, 1998).
Color can also reveal personality types, as extroverts tend to gravitate towards
warm and bold colors such as red and orange. Extroverts find these colors highly
stimulating, while introverts prefer calmer and cooler colors such as blue and
green (Birren, 1980). The stimulating effect of warmer colors is shown in a
study by Emery (1929) of patients in psychiatric wards. The patients
had previously been deprived of the color red (which was believed to induce
madness), but then they received a small red string. The results showed that
the patients became more animated and, as a result, increased their activity
and work output (Emery, 1929). These results suggest that the color a patient
uses is not only a form of expression but can also lead to changes in mood
and behavior.
However, there can be many explanations for color depending on the patient.
In one study, a 31-year-old suicidal woman was asked to draw something and
provide an explanation of the meaning of the colors. For red specifically, she
stated that she felt strength, power, courage, and joy (Lev-Wiesel, n.d.). In
another study also examining the color red, researchers ran an experiment to
draw connections between dominance (male dominance and testosterone levels) and
the color red. In this experiment, red was hypothesized to signify dominance,
and blue was hypothesized to signify relaxation. Participants were presented
with different words, some related in meaning to dominance and some related to
tranquility. These words were displayed in a blue or red font, and participants
were asked to classify them as dominance-related or rest-related while being
timed. Results showed that participants made fewer errors when placing red
dominance-related words in the dominance category than when placing blue
dominance-related words in that category (Mentzel et al., 2017). Taken
together, this experiment and the study of the 31-year-old woman show some
overlap in the meaning of the color red (as power, strength, and dominance),
but individual experiences may lead colors to hold different meanings for
different people (as shown by red’s meaning of courage and joy for the woman).
These colors give
the patient the opportunity to express moods and emotions that they cannot
express verbally. Therefore, once an art piece is complete with certain colors,
and the patient feels an initial sense of self-expression, an explanation about
the whole artwork may come more easily. This explanation along with the use
of colors makes the art therapy session more unique for the patient, as they
are able to give their own, personal explanation for the colors based on their
individual experiences.

4 Neurological Effects of Art Therapy


There are many notable parts of the brain that are activated during art making,
the most well-known being the hippocampus, amygdala (mentioned previously),
visual cortex, and prefrontal cortex. The visual cortex, as its name suggests,
receives and processes visual information to send to other regions of the brain
to be analyzed, and will not require additional explanation in this paper. Also,
there are many other parts of the brain that relate to the physical touch aspect
of art materials, but discussing them is beyond the scope of this article. So,
I will be discussing the role of the hippocampus and prefrontal cortex in this
section.

4.1 The Hippocampus
Art therapy activates the hippocampus during the creative process. In a 2013
study, researchers examined the effect of hippocampal amnesia (due to lesions)
on creative thinking. This is relevant to art therapy’s effect on the hippocampus
because art therapy requires a great deal of creative thinking. In the study,
the participants (those with normal and damaged hippocampi) completed the
Torrance Tests of Creative Thinking (TTCT). In these tests, they were required
to complete both the verbal and figural parts of the experiment that tested
creative thinking. In the verbal form, they were given many prompts that
forced them to problem-solve creatively. Prompts included “Generate ways to
improve a toy so that it is more fun to play with,” “Generate alternative uses
for a common object (e.g., cardboard box),” and “Generate hypotheses about
potential benefits or problems related to an improbable situation (e.g., if
clouds had strings attached to them).” Next, in the figural form, participants
were asked to complete drawings that had been left incomplete. Examples include
ten incomplete line contours and 30 repeated parallel line segments. Once they
completed their work, the participants were asked to give their artwork a unique
title. After both sections were complete, the researchers scored and examined
each participant’s answers. Scores depended on the fluency and originality of
the answers. In the verbal section, participants with hippocampal
amnesia (the study refers to them as the AM group) scored significantly lower
than their healthy counterparts. For example, when asked to think of creative
uses for cardboard boxes, one healthy participant came up with 26 uses, 23 of
which were unique (e.g., building a suit of armor). On the other hand, one
amnesic participant came up with only two uses: recycling the boxes
and making a fort (Duff et al., 2013).
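
The two scoring dimensions mentioned above, fluency and originality, can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the published TTCT scoring procedure or the method used by Duff et al. (2013). Here fluency is taken to be the number of distinct responses, and originality the number of responses that do not appear on a list of common answers. The function score_responses and the common-answer list are hypothetical.

```python
def score_responses(responses, common_responses):
    """Score a list of free-text responses on two simplified dimensions.

    fluency     -- number of distinct, non-empty responses
    originality -- number of distinct responses not found in common_responses
    """
    # Normalize so that "Recycling" and "recycling " count as the same idea.
    distinct = {r.strip().lower() for r in responses if r.strip()}
    common = {c.strip().lower() for c in common_responses}
    return {
        "fluency": len(distinct),
        "originality": len(distinct - common),
    }


# Hypothetical list of answers treated as common (an assumption for this sketch).
common_uses = ["storage", "recycling", "making a fort"]

# The two uses reported for one amnesic participant in the text above.
amnesic_uses = ["recycling", "making a fort"]

# A hypothetical answer set that includes the unusual use quoted in the text.
other_uses = ["storage", "building a suit of armor"]

print(score_responses(amnesic_uses, common_uses))
print(score_responses(other_uses, common_uses))
```

Separating the two counts shows why a participant can produce several answers yet still score low on originality: only answers outside the common list add to the second score.
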

Figure 3: Creativity scores for people with hippocampal amnesia vs controls

In the figural section, the healthy participants also scored higher than the
AM group. The prompt was to create an image that included the shape of a
large black oval and to add new ideas around it to make the picture tell an
exciting story. For example, one healthy participant turned the oval into a
drawing of a golf course complete with signs for parking, a clubhouse, Tiger
Woods, and more. Another healthy participant turned the oval into a hot air
balloon that takes people for rides above the city. On the other hand, a
participant from the AM group turned the oval into a bug. Another participant
from the AM group used the shape as an egg and drew a chicken above it (Duff et
al., 2013). This study suggests that the hippocampus is extremely important in
the creative process, which means that it may be activated during creativity in
art therapy.

Figure 4: Drawing prompts

Further evidence for the activation of the hippocampus in art therapy comes
from King & Kaimal (2019), who measured brain activity through
electroencephalography (EEG), a non-invasive method that allows for free
movement. In one study involving EEG, patients worked with clay and drawing, and
there was activation in brain regions involved in memory processing and
meditative states (including the hippocampus) (King & Kaimal, 2019). Although
there is little evidence of direct causation, the activation of the hippocampus
may be correlated with easier access to memories and emotions during the
creative process.

4.2 The Prefrontal Cortex


The prefrontal cortex is highly activated during art therapy. The prefrontal
cortex is important because it regulates our thoughts, emotions, and actions
through connections with other regions, such as the regions stated above
(Arnsten, 2009). Bogousslavsky (2005) argues that the brain’s frontal anterior
subcortical loops are activated during art-making; in other words, there is
increased neural activity in the frontal lobe during the execution of artwork
(Talwar, 2007). Similarly, in an experiment involving patients coloring,
doodling, and
free-drawing, fNIRS scans (functional near-infrared spectroscopy that measures
brain activity) showed significant activation of the medial prefrontal cortex
(Kaimal et al., 2017). In another study by Zeki (2011), participants under-
went brain scans while being shown images of paintings. When participants
viewed paintings that they deemed beautiful, fMRI scans showed that blood
flow increased by almost 10 percent to the medial orbitofrontal cortex region
of the brain, a part of the prefrontal cortex associated with pleasure. This
increase in blood flow to the medial orbitofrontal cortex is similar to that
seen when a person looks at a loved one (ACRM, 2020). This activation of the
prefrontal cortex is essential in stimulating the reward center of the brain,
which contributes to feelings of accomplishment (Chau et al., 2018). In patients
with neurological conditions, these feelings of accomplishment and purpose due
to the activation of the prefrontal cortex can prove vital in recovery.
The activation of the prefrontal cortex (PFC) during art therapy also con-
tributes to the lateralization and stimulation of both hemispheres of the brain.
Although the PFC itself does not directly make the connections, the corpus cal-
losum is a bundle of nerve fibers that facilitates communication between both
hemispheres of the brain by connecting the two separate prefrontal cortices (Cor-
pus Callosum - an Overview — ScienceDirect Topics, n.d.). The left hemisphere
is associated with language, speech, analytical thinking, and sequential process-
ing. The right hemisphere is associated with visual motor activities, intuition,
emotions, and sensory skills. The integration of both hemispheres of the brain is
essential for different cognitive processes including attention, decision-making,
and emotional regulation. For neurological conditions (such as PTSD), these
cognitive processes from lateralization can prove useful if they are strengthened
through art therapy, as will be discussed later. Art therapy can promote bilat-
eral stimulation through a technique in which the patient utilizes both dominant
and non-dominant hands in the art-making process. This technique works in
lateralization because the right hand is controlled by the left hemisphere of the
brain, and vice versa (Malchiodi, 2003).
Both hemispheres can also be stimulated by the two-part process of art
therapy. The first part is the art-making and the second part is the explanation
of the artwork. The left hemisphere allows for an explanation of the image
produced by (mostly) the right hemisphere from the first step (Talwar, 2007).
As stated above, the connection between both hemispheres in art therapy is
important for attention, decision-making, and emotional regulation. When this
connection is not present, there can be dire consequences for a patient who is
already struggling with a neurological condition.
The negative consequences of having no connection between the right and
left hemispheres of the brain can be seen, in one case, from the agenesis of the
corpus callosum (AgCC). This disorder presents itself at birth when the tissue
that connects the left and right sides of the brain is partially or completely
missing. The purpose of highlighting this extreme example is to show the ef-
fects of having no connection between both hemispheres in the brain to help

demonstrate why lateralization is important. In a study conducted by Labadi
& Beke (2017), participants included 18 children between the ages of 6 and 8
with agenesis of the corpus callosum and 18 typically developing children who
were matched by IQ, age, gender, and education. Labadi & Beke examined both
groups’ emotional and mental state recognition with a process that is called the
“Faces Test”. Each child was shown 20 photographs of an actress posing: 10
photos of basic emotions and 10 photos of complex mental states. Under each
photo, two words were typed, but only one described the emotion or mental
state the actress was depicting. The experimenter read the two words, and the
child was asked to choose the word that best represented the actress’s emotion
or mental state in the picture. If they were correct, they got one point, and
if they were incorrect they received no points. After the experiment, the re-
searchers found that children with AgCC were less accurate and showed overall
poorer performance in observing emotional states than the control group. As
shown, the absence of the corpus callosum and any connection between both
hemispheres of the brain limits their understanding of complex social cognitive
functions (Lábadi & Beke, 2017).
However, when art therapy connects both hemispheres, there can be better
social awareness and behavior as a result (Lábadi & Beke, 2017). Improved
social relations through art therapy can not only help many neurobehavioral
conditions (including autism, ADHD, and obsessive-compulsive disorder) but
also give a sense of social purpose when one’s surroundings are better under-
stood. Therefore, art therapy promotes connections between both hemispheres
of the brain that can improve the well-being of patients struggling with neuro-
logical conditions.

4.3 Neuroplasticity
In addition to the connections involved in bilateral stimulation discussed above,
new neural connections and pathways can be formed when making art
(Konopka, 2014). Neuroplasticity is the brain’s ability to form and organize
neural networks, including after a learning experience or after an injury (Pud-
erbaugh & Emmady, 2023).
In a 2014 study, Belkofer, Vaughan Van Hecke & Konopka measured the
effects on the brain after 20 minutes of drawing. The study involved the use of
an EEG to investigate the differences in patterns of brain activity among artists
and non-artists. Results showed that for artists, there was strong activation in
the left posterior temporal, parietal, and occipital regions of the brain. For non-
artists, there was activation in the right parietal and right prefrontal areas of the
brain. The authors believed that the different areas activated between artists
and non-artists were due to the non-artists making new connections because of
learning (Belkofer et al., 2014). These new connections formed when completing
art can prove useful in Alzheimer’s patients, which will be explained more in-
depth later on.

5 Neurological conditions - PTSD
5.1 What is PTSD?
PTSD results from exposure to emotionally disturbing or life-threatening events.
As a result, there can be lasting effects on someone’s mental, physical, emo-
tional, and social well-being. Traumatic experiences include but are not limited
to physical abuse, poverty, childhood neglect, and racism (What Is Trauma?,
2018).

5.2 What happens to the body/brain during a traumatic event?
5.2.1 Stress
During a traumatic event, stress levels increase dramatically. Cortisol is rapidly
produced, and if produced frequently enough, can disrupt developing brain cir-
cuits in young children (InBrief, n.d.). Even in adults, the continual production
of cortisol is not healthy and can lead to long-term effects such as impairments
in learning, memory, and the ability to regulate certain stress responses. Even
after one traumatic event with the responses stated above, PTSD can be diag-
nosed (Stress Disrupts the Architecture of the Developing Brain, n.d.).

5.2.2 Prefrontal Cortex


Figure 5: Prefrontal Cortex fMRI signals

The prefrontal cortex is also affected by PTSD. In a 2005 study, 13 patients
with PTSD and 13 without PTSD were shown images of expressions (happy,
fearful, and neutral). Using fMRI, the researchers focused on blood oxygena-
tion level-dependent (BOLD) signal responses. BOLD reflects changes in brain
blood flow and blood oxygenation, which can help to identify specific neuronal
activity. As shown in Figure 5, BOLD signals in the medial prefrontal cortex
decreased when the PTSD participants were shown fearful images (MR stands
for the fMRI image). The control group, however, had heightened awareness and
increased BOLD signals in the medial prefrontal cortex when presented with a
fearful image. For the same stimuli, BOLD signals in the amygdala were measured
as well. Researchers found that PTSD patients exhibited exaggerated amygdala
responses (Shin, 2005). Essentially, in PTSD, the amygdala (the survival center)
goes into overdrive as if the patient were experiencing that trauma for the
first time. At the same time, the prefrontal cortex becomes suppressed, so
there is less capability to control emotions such as fear (How Does Trauma
Affect the Brain?, n.d.).
In another study, J. Douglas Bremner (1999) looked at the blood flow of
Vietnam combat veterans when they were exposed to combat-related and neu-
tral pictures/sounds. Researchers used positron emission tomography (PET)
which uses radioactive substances to measure blood flow. In the study, there
were Vietnam combat veterans with PTSD (n=10) and Vietnam combat veterans
without PTSD (n=10). Individuals were shown neutral slides (winter scenes
with nonverbal music) and combat slides (actual violent photographs from
Vietnam). Scans showed that when veterans with PTSD were exposed to traumatic
images, there was decreased blood flow in the medial prefrontal cortex. As men-
tioned in the previous paragraph, the hyperresponsivity of the amygdala results
in decreased function of the prefrontal cortex as a way to cope with trauma
(Bremner et al., 1999).
The prefrontal cortex works alongside the amygdala (as explained previously
with the Singer-Schachter theory of emotions) to process emotional stimuli. So,
since the prefrontal cortex is involved in memory, emotions, and social behavior,
PTSD patients have difficulties in these areas when recalling a traumatic event
(Kong et al., 2013).

5.2.3 Hippocampus

Figure 6: Effects on the Hippocampus

With PTSD, there are impairments and other effects on the hippocampus as
well. MRI (magnetic resonance imaging) was performed on male miners involved
in coal mine gas explosions. There were 14 with PTSD and 25 without. PTSD
patients showed a decreased gray matter volume in the hippocampus compared

to their counterparts as shown in the graph to the right. Impairments in the
hippocampus imply impairments in learning and memory (Zhang et al., 2014).

5.3 How can art therapy be useful in treating PTSD?


There are many ways that art therapy can benefit PTSD patients, and the
positive effects of art therapy mentioned earlier in this paper align with problems
caused by PTSD. For example, cortisol levels are lowered, the right and left
brain are integrated, and the prefrontal cortex, hippocampus, and amygdala
are activated (Czamanski-Cohen & Weihs, 2016; Duff et al., 2013; Kaimal et
al., 2017; Kaimal, Ray, & Muniz, 2016; Malchiodi, 2003). However, perhaps
more importantly, traumatic memories are stored nonverbally which can be
accessed through art therapy. I made this statement earlier in the paper, but
will now expand on this point. There are two ways that people deal with
trauma. Healthy individuals move through the normal stages of grief and loss,
while others shield their memories to seek emotional relief from the stress they
cause. The suppression of these memories in PTSD patients is what causes all
of the changes in the body and brain as mentioned before. So, a complication of
PTSD is that traumatic memories are obscured and likely cannot be expressed
verbally (Talwar, 2007).
Art therapy helps in this aspect because it helps access nonverbal memories
through communication in the artwork. In a 2007 study, clients underwent an
art therapy session and rated how they felt on a scale of 1-7 (7 being accepting
of their experiences) after doing the art. Client A was a 58-year-old woman
who worked in the field of mental health, and a recent experience with a client
reminded her of her own childhood neglect and rejection. She had attended
talk therapy previously but stated, “I need to work with the image; words
are not enough” (p. 31). At the end of the session, she drew a horse, which
represented freedom, strength, and wholeness. She rated how she felt as a 7
(Talwar, 2007). As shown, art therapy allows for self-expression, which in turn
allows for access to blocked traumatic memories.

Figure 7: Art Therapy Patient’s Artwork

6 Neurological Conditions - Alzheimer’s Disease
6.1 What is Alzheimer’s?
Alzheimer’s disease is a type of dementia that affects memory, behavior, and
cognition. Most people with Alzheimer’s are 65 and older. This disease worsens
with time and has no cure. After some time, there are difficulties with speak-
ing, swallowing, and walking, which can lead to difficulties living independently
and eventually death (What Is Alzheimer’s Disease? Symptoms & Causes —
Alz.Org, n.d.).

6.2 What happens to the body/brain with Alzheimer’s?


With Alzheimer’s disease, patients struggle with brain atrophy. Brain atrophy
is a process where neurons are injured and die, connections between neurons
break down, and brain regions begin to shrink. So, as the disease progresses,
opportunities for neuroplasticity decrease. This process initially occurs in the
hippocampus (memory) and eventually in other regions involved with language,
reasoning, and social behavior (NIH National Institute on Aging, 2017).

6.3 How can art therapy be useful in treating Alzheimer’s?


As I mentioned before, there is no cure for Alzheimer’s. Art therapy is not a cure
for Alzheimer’s but can alleviate symptoms and contribute to the well-being of
patients in the early stages of dementia (Hill et al., 2011). In a 2005 study,
Kinney & Rentz observed the well-being of individuals with Alzheimer’s during
an art program compared to other structured activities. To measure well-being,
there were six categories: interest, sustained attention, pleasure, negative affect,
sadness, and self-esteem. Each category had a description; for example,
pleasure was defined as “Verbal expression of pleasure while participating in the
actual activity; eyes crinkled, smiles, laughter, relaxed facial expression; nods
positively, relaxed body language”. Before the art-making, observers measured
the well-being of participants once per week. They were given indicators for
each category of well-being; for example, indicators of pleasure were: “1. The
participant has relaxed body language, smiles, and laughs during the activity.
2. The participant verbalizes a sense of pleasure with phrases such as: “this
feels good,” “this is relaxing,” or in the “verbal expression of unintelligible
phrases such as oooh, aah, accompanied with smiles, crinkling of eyes, or relaxed
facial expression” (Kinney & Rentz, 2005, p. 224). Then, after the art-making
process, the researchers measured the patients’ well-being once again. The same
method was used for participants who did not complete the artwork but instead
another structured activity (e.g. following directions to make a toy plane).
Participants who completed artwork demonstrated better well-being in the
categories of interest, attention, pleasure, and self-esteem. For
the other group, there was no observed negative affect or sadness when they
completed a structured activity (Kinney & Rentz, 2005). Even though this

study measured an abstract feeling of well-being, participants clearly showed
increased positive attitudes. So, although art therapy cannot cure Alzheimer’s,
it can lead to an improved quality of life because of self-expression and creativity.
Art therapy, however, may be able to slow the progress of the early stages of
Alzheimer’s. A potential use of art therapy for Alzheimer’s patients could be to
increase neuroplasticity in an attempt to strengthen neural connections (Koch
& Spampinato, 2022). However, there is little empirical research on this topic;
it should be pursued given art therapy’s effect on neuroplasticity, as mentioned
earlier.

7 Efficacy & Validity


Art therapy is not a cure for Alzheimer’s or PTSD, but it is an effective form of
treatment, and treating symptoms is important in its own right. For example, art
therapy was useful for cancer patients with anxiety and depression. Cancer pa-
tients reported that their physical symptoms and mental health were alleviated
after participating in art therapy. So, art therapy can improve the quality of
life and treat some symptoms, but cannot cure, in this case, cancer (Hu et al.,
2021).
Art therapy is not limited to a certain population and is helpful for patients
of different ages and backgrounds. From the examples of studies I have given
throughout this paper, the ages of the participants differ drastically. Patients
could be children who faced childhood neglect or sexual abuse. Patients could
also be grandparents diagnosed with dementia. A wide variety of age groups
can benefit from art therapy, and age should not stop people from participating.
Art therapy is also helpful across cultural backgrounds. In a 2012 study, the
sample included 133 Latino/Hispanic
participants who were often excluded from research on aging and cognition.
Results showed that cognitive functioning (memory, perception, learning, and
language abilities) improved significantly among the experimental group fol-
lowing 10 weeks of art therapy (Alders, 2012). Even in the previous studies I
mentioned, there was much diversity in ethnic backgrounds which did not make
a significant difference in end results.
However, the efficacy of art therapy can be hindered if treatment is sud-
denly stopped. A 2011 study measured the cognitive and psychological effect
of coloring and drawing in mild Alzheimer’s disease patients. During a 12-week
period of art coloring activities, some patients reportedly wandered around less
and increased their daily average sleep time from 4.5 to 7.9 hours. However,
once art therapy was stopped, the patients returned to their original sleep status
(Hattori et al., 2011). If clients want long-lasting effects, it is important that
they continually participate in art therapy.
Another limitation is that some art therapy studies may have a biased pool
of people. Those who participate in art therapy experiments may only agree
because they enjoy art therapy. However, those who do not like art therapy
may not participate. This limits the participant diversity and may create a bias

in favor of art therapy.

8 Conclusion
Art therapy, although a relatively new field, has the potential to make significant
developments in treating neurological conditions. The multitude of positive
psychological and neurological effects on the body as mentioned above supports
art therapy as an effective way to treat patients diagnosed with neurological
conditions such as PTSD and Alzheimer’s (Kinney & Rentz, 2005; Talwar,
2007).
Psychologically, art therapy works with images that do not solely require
verbal communication from the patient. This can help with PTSD, where ac-
cessing painful memories can be difficult (Meekums, 1999). Art as a way of
communicating can also help to uncover emotions, as shown by the pairing of
the Singer-Schachter theory of emotions with the two-part process of art ther-
apy (Yarwood, n.d.). Furthermore, memory can be strengthened through visual
communication (artwork) as shown by the picture superiority effect (Paivio &
Csapo, 1973). Physical materials/colors and images are also included in the
visual aspect of art therapy and can result in positive attitudes and emotions
(Holmes et al., 2006). In all, the visual aspect of art therapy is not only an alter-
native to talking, but it can contribute to better access to emotions, memories,
and more.
These psychological effects of art therapy can not only benefit PTSD and
Alzheimer’s patients but others as well. For example, patients with depression
can have improved moods. People with Parkinson’s can have strengthened
memory. People with anxiety may be able to regulate and control their emotions.
Many people suffer and die from these various neurological conditions every day,
but art therapy can help to minimize these struggles.
After the psychological effects of art therapy, I covered the neurological ef-
fects. I went on to explain the effects on the hippocampus and the prefrontal
cortex. Overall, the activation of certain areas during art therapy may be linked
to improving the function of these areas that are affected by certain neurological
conditions.
After the psychological and neurological effects of art therapy, I delved into
specific conditions that pertained to the positive results of art therapy. These
two conditions are PTSD and Alzheimer’s disease. PTSD results from exposure
to a traumatic experience and can result in stress, damage to the prefrontal
cortex, and impairments in the hippocampus (What Is Trauma?, 2018). Art
therapy can prove useful not only because it activates the areas that are impaired
during a traumatic event, but also because it can access the nonverbal traumatic
memory through artistic expression (Talwar, 2007). Alzheimer’s is a fatal type
of dementia that affects memory, behavior, and thinking and progresses over
time (What Is Alzheimer’s Disease? Symptoms & Causes — Alz.Org, n.d.). Art
therapy is not a cure for dementia, but it can improve well-being by improving
patients’ moods. There may also be connections with neuroplasticity, but there

has not been much empirical research done on this topic.
There are limitations to art therapy. Art therapy is not a cure for most
conditions. In PTSD, the trauma caused by the event can never be 100 per-
cent undone, but art therapy can improve the way that the patient deals with
his/her trauma. In dementia, art therapy cannot cure brain atrophy but can
improve well-being. Another possible downside of art therapy is that there can
be negative consequences if treatment is suddenly stopped (Hattori et al., 2011).
Another negative may be that some people may feel anger and reluctance to-
ward art therapy because they view themselves to be bad at art and find it
frustrating. A limitation of some art therapy studies is that there may be a
selection or participation bias. This bias may be caused by a stigma associated
with art therapy, and the idea that it is pseudoscience.
However, if art therapy is not attempted, people miss out on the opportuni-
ties for self-expression and neurological activation that are offered. Other types
of therapies, such as verbal-based therapies, that do not involve the creative
process may not result in the same outcomes as art therapy. Thankfully, art
therapy is a broad genre that includes many mediums and is adaptable to fit
a patient’s needs (Hu et al., 2021). Existing research already shows how art
therapy can help treat certain conditions, and there may be correlational evi-
dence that art therapy can indirectly prevent deaths. For example, patients who
are diagnosed with depression may be suicidal, and since art therapy improves
well-being, it may help to prevent suicides. Also, if someone is feeling suicidal
because of poor social relations, art therapy is a great resource to improve rela-
tionships and feelings of social acceptance. Students dealing with a lot of stress
can reduce their cortisol levels through art therapy and reduce the risk of dying
from heart disease (Stress Can Increase Your Risk for Heart Disease - Health
Encyclopedia - University of Rochester Medical Center, n.d.). The implications
of art therapy are numerous, but to uncover more we should encourage more
people to engage with this research.

References
[All09] Ally, B. A., Gold, C. A., & Budson, A. E. The picture superiority effect
in patients with Alzheimer’s disease and mild cognitive impairment.
Neuropsychologia, 2009.
[Arc] Stress disrupts the architecture of the developing brain.
[Arn09] Arnsten, A. F. T. Stress signaling pathways that impair prefrontal
cortex structure and function. Nature Reviews, 2009.
[Bel14] Belkofer, C., Van Hecke, A., & Konopka, L. Effects of drawing on alpha
activity: A quantitative EEG study with implications for art therapy.
Art Therapy, 2014.
[Bel22] Belkofer, C., Van Hecke, A., & Konopka, L. Research review shows
self-esteem has long-term benefits. UC Davis, 2022.

[Bre99] Bremner, J. D., Staib, L. H., Kaloupek, D., Southwick, S. M., Soufer, R.,
& Charney, D. S. Research review shows self-esteem has long-term benefits.
Biological Psychiatry, 1999.
[Cau] What is Alzheimer’s disease? Symptoms & causes. alz.org.
[CC16] Czamanski-Cohen, J., & Weihs, K. L. The bodymind model: A platform
for studying the mechanisms of change induced by art therapy.
The Arts in Psychotherapy, 2016.
[Cha18] Chau, B. K. H., Jarvis, H., Law, C.-K., & Chong, T. T.-J. Dopamine and
reward: A view from the prefrontal cortex. Behavioural Pharmacology, 2018.
[Chi] InBrief: The impact of early adversity on children’s development.
Center on the Developing Child at Harvard University.
[Cot85] Cotton, M. A. Creative art expression from a leukemic child.
Art Therapy, 1985.
[Cue07] Cuellar, G. An art therapy group program for elementary school
students struggling with self-expression, communication, and social
skills. Art Therapy, 2007.
[def] Art therapy def - Google search. Google.
[Dir] Corpus callosum - an overview - ScienceDirect Topics. ScienceDirect.
[Duf13] Duff, M. C., Kurczek, J., Rubin, R., Cohen, N. J., & Tranel, D.
Hippocampal amnesia disrupts creative thinking. Hippocampus, 2013.
[Eme] Emery, M. Occupational Therapy and Rehabilitation.
[Hat11] Hattori, H., Hattori, C., Hokao, C., Mizushima, K., & Mase, T. Controlled
study on the cognitive and psychological effect of coloring and drawing
in mild Alzheimer’s disease patients. Geriatrics & Gerontology
International, 2011.
[Hil11] Hill, N. L., Kolanowski, A. M., & Gill, D. J. Plasticity in early
Alzheimer’s disease: An opportunity for intervention. Topics in Geriatric
Rehabilitation, 2011.
[Hol06] Holmes, E. A., Mathews, A., Dalgleish, T., & Mackintosh, B. Positive
interpretation training: Effects of mental imagery versus verbal training
on positive mood. Behavior Therapy, 2006.
[Hu21] Hu, J., Zhang, J., Hu, L., Yu, H., & Xu, J. Art therapy: A complementary
treatment for mental disorders. Frontiers in Psychology, 2021.

[Kai17] Kaimal, G., Ayaz, H., Herres, J., Dieterich-Hartwell, R., Makwana, B.,
Kaiser, D. H., & Nasser, J. A. Functional near-infrared spectroscopy
assessment of reward perception based on visual self-expression: Coloring,
doodling, and free drawing. The Arts in Psychotherapy, 2017.
[Kin19] King, J. L., & Kaimal, G. Approaches to research in art therapy using
imaging technologies. Frontiers in Human Neuroscience, 13, 2019.
[Kon13] Kong, L., Chen, K., Tang, Y., Wu, F., Driesen, N., Womer, F., Fan, G.,
Ren, L., Jiang, W., Cao, Y., Blumberg, H. P., Xu, K., & Wang, F. Functional
connectivity between the amygdala and prefrontal cortex in medication-naive
individuals with major depressive disorder. Journal of Psychiatry &
Neuroscience, 2013.
[Kon14] Konopka, L. M. Where art meets neuroscience: A new horizon of art
therapy. Croatian Medical Journal, 2014.
[LW] Lev-Wiesel, R. The self-revelation through color technique: Understanding
clients’ relations with significant others, silent language, and defense
mechanisms through the use of color.
[Lá17] Lábadi, B., & Beke, A. M. Mental state understanding in children with
agenesis of the corpus callosum. Frontiers in Psychology, 2017.
[Mal03] Malchiodi, C. A. Handbook of art therapy. Guilford Press, 2003.
[Men17] Mentzel, S. V., Schücker, L., Hagemann, N., & Strauss, B. Emotionality
of colors: An implicit link between red and dominance. Frontiers in
Psychology, 2017.
[NIH23] Reduction of cortisol levels and participants’ responses following art
making - PMC. NIH, 2023.
[Pai73] Paivio, A., & Csapo, K. Picture superiority in free recall: Imagery or
dual coding? Cognitive Psychology, 1973.
[Pow22] The power of self-expression. Drawchange, 2022.
[Pud23] Puderbaugh, M., & Emmady, P. D. Neuroplasticity. StatPearls, 2023.
[res18] Understanding art therapy vs art as therapy. Art Therapy Resources,
2018.
[Roc] Stress can increase your risk for heart disease - Health Encyclopedia -
University of Rochester Medical Center. Health Encyclopedia.

[Tal07] Talwar, S. Accessing traumatic memory through art making: An art
therapy trauma protocol (ATTP). The Arts in Psychotherapy, 2007.
[Tal18] Talwar, S. What is trauma? Trauma-Informed Care Implementation
Resource Center, 2018.
[Tra] How does trauma affect the brain? - and what it means for you. Whole
Wellness Therapy.
[Yar] Yarwood, M. Schachter-Singer two-factor theory.
[Zha14] Zhang, Q., Zhuo, C., Lang, X., Li, H., Qin, W., & Yu, C. Structural
impairments of hippocampus in coal mine gas explosion-related posttraumatic
stress disorder. PLoS ONE, 2014.

Detecting Causality by Using Alexander
Quandles and Alexander-Conway Polynomial

Nikhila Pasam
October 16, 2023

Abstract
The paper by Samantha Allen and Jacob H. Swenberg suggests that
the Jones polynomial is likely able to detect causality in (2+1)-dimensional
globally hyperbolic spacetimes; however, the Alexander-Conway polynomial
cannot. The natural question that arises is what extra information
needs to be added to the Alexander-Conway polynomial so that it can
also detect causality. In this paper, I computed colorings by some Alexander
quandles for the connected sum of 2 Hopf links and the Allen-Swenberg
link and found that they do not distinguish between the two
links, so they cannot detect causality.

1 Introduction
In a globally hyperbolic spacetime $X$, which has (2+1) dimensions and is of
the form $\Sigma \times \mathbb{R}$, where $\Sigma$ is a Cauchy surface
homeomorphic to $\mathbb{R}^2$, we can define $N_X$ as the space that contains
all light rays within $X$. These light rays can be represented using a solid
torus. When a point $P \in X$ is considered, a light cone intersects
$\Sigma \times \mathbb{R}$ in a circular curve, defining a knot in the solid
torus (the sky $S_P$ of $P$). According to the Low Conjecture, as proved by
Chernov and Nemirovski [VC20], two points $P$ and $Q$ are causally related if
and only if $S_P$ and $S_Q$ are linked within $N_X$. Therefore, link invariants
that distinguish whether $S_P$ and $S_Q$ are linked within $N_X$ can detect
causality. Findings by Allen and Swenberg [Joy82] suggest that the Jones
polynomial is likely able to detect causality, while the Alexander-Conway
polynomial may be insufficient. They identified a link that relates to possibly
causally connected events, which the Alexander-Conway polynomial was unable to
distinguish. In this paper, I check whether the Alexander quandle can
distinguish the two examples of Allen-Swenberg.
∗ Mission San Jose High School, Advised by: Vladimir Chernov of Dartmouth College and
Emanuele Zappala of Yale University

2 Quandles and Cocycles
2.1 Quandle [Ame07] or [Cro04]
A quandle is a set X with an operation ▷ satisfying the properties:

1. x ▷ x = x for all x ∈ X;
2. For all x, y ∈ X, there exists a unique z ∈ X such that x = z ▷ y;
3. (x ▷ y) ▷ z = (x ▷ z) ▷ (y ▷ z) for all x, y, z ∈ X, which is called
self-distributivity.

It is also implied that there is an inverse operation, $\triangleright^{-1}$, such that
$(x \triangleright y) \triangleright^{-1} y = x$ for all x, y ∈ X. To define the knot quandle, we assign a letter
to each arc. The relationship at each crossing is shown below:

Figure 1: Quandles at crossings

The figure on the left shows that the arc that is labeled x crosses under the
arc labeled y from left to right; therefore, the result is $x \triangleright^{-1} y$. The diagram
on the right shows that the arc that is labeled x crosses under the arc labeled y
from right to left; therefore, the result is $x \triangleright y$. To verify that the knot quandle
is an invariant of knots, we check that the Reidemeister moves don’t change the
quandle colorings.

Figure 2: Quandles after Reidemeister moves

If we have a knot diagram with labels on its arcs based on a quandle, these labels
must satisfy the quandle relation at each crossing. Before and after each
Reidemeister move, there is a bijection between the labelings, so the number of
labelings is a knot invariant. Comparing the number of labelings of two diagrams
can therefore tell knots apart: if the numbers are different, the diagrams
correspond to different knots; if the numbers are equal, no conclusion can be
drawn.

2.1.1 Fundamental Quandle


The fundamental quandle of a link is generated by the arcs of a diagram of the
link, subject to the relations at the crossings. Quandle colorings correspond to
homomorphisms from the fundamental quandle to the coloring quandle, which makes
it an effective tool for computing colorings and a distinguishing invariant for
certain links.

2.1.2 Alexander Quandle


The Alexander quandle is an example of a quandle. It is constructed from
the set $\mathbb{Z}_n$ of integers modulo $n$ and a value $t$ that is coprime to $n$. The
quandle operation is defined by:

$x \triangleright y = tx + (1 - t)y \pmod{n}$

For the affine Alexander quandle over $\mathbb{Z}_p$, the cocycle is $(x - y)^{p^r}$
[CN10].
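As a sanity check on the definition above, the three quandle axioms can be verified by brute force for a small Alexander quandle. The following is a minimal sketch, not part of the original computation; the function names `alexander_op` and `is_quandle` are illustrative, and the values n = 5, t = 2 are the ones used later in the paper.

```python
from itertools import product


def alexander_op(x, y, t, n):
    """Alexander quandle operation: x ▷ y = t*x + (1 - t)*y in Z_n."""
    return (t * x + (1 - t) * y) % n


def is_quandle(t, n):
    """Check the three quandle axioms for Z_n with parameter t by exhaustion."""
    elems = range(n)

    def op(x, y):
        return alexander_op(x, y, t, n)

    # Axiom 1: x ▷ x = x.
    if any(op(x, x) != x for x in elems):
        return False
    # Axiom 2: for all x, y there is exactly one z with z ▷ y = x.
    for x, y in product(elems, repeat=2):
        if sum(1 for z in elems if op(z, y) == x) != 1:
            return False
    # Axiom 3: self-distributivity.
    return all(
        op(op(x, y), z) == op(op(x, z), op(y, z))
        for x, y, z in product(elems, repeat=3)
    )


if __name__ == "__main__":
    print(is_quandle(2, 5))
```

The check fails when t is not invertible modulo n (for example t = 0), because axiom 2 then no longer holds.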

2.2 Colorings
A coloring is an assignment of elements from a quandle X to the arcs of a
knot diagram, with the property that undercrossing is compatible with the ▷
operation.

2.3 Cocycles
Let X be a quandle, and take A = Zn for some n. We want to enhance the
coloring invariants using the notion of a cocycle (which is part of the theory of
cohomology). A 2-cocycle ϕ of X with coefficients in A is a function
ϕ : X × X → A satisfying ϕ(x, x) = 0 and the condition (for all x, y, z ∈ X):
ϕ(x, z) + ϕ(x ▷ z, y ▷ z) = ϕ(x, y) + ϕ(x ▷ y, z).
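The cocycle condition can also be checked numerically. The sketch below assumes the Alexander quandle over Z_5 with t = 2 (the case treated in Section 3) and the function ϕ(x, y) = (x − y)^(5^r) reduced modulo 5; the helper names `op`, `phi`, and `satisfies_cocycle_condition` are illustrative and not from the paper.

```python
from itertools import product

N, T = 5, 2  # assumed: Z_5 with t = 2, as in Section 3


def op(x, y):
    """x ▷ y = T*x + (1 - T)*y in Z_N."""
    return (T * x + (1 - T) * y) % N


def phi(x, y, r=1):
    """phi(x, y) = (x - y)^(N^r), computed modulo N."""
    return pow((x - y) % N, N ** r, N)


def satisfies_cocycle_condition(r=1):
    """Test phi(x,z) + phi(x▷z, y▷z) = phi(x,y) + phi(x▷y, z) for all x, y, z."""
    return all(
        (phi(x, z, r) + phi(op(x, z), op(y, z), r)) % N
        == (phi(x, y, r) + phi(op(x, y), z, r)) % N
        for x, y, z in product(range(N), repeat=3)
    )


if __name__ == "__main__":
    print(satisfies_cocycle_condition(1))
```

Over Z_5 itself, Fermat's little theorem gives a^5 ≡ a (mod 5), so this ϕ reduces to x − y for every r; the exponent only matters over larger fields of characteristic 5.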

2.4 Boltzmann Weights


Let ϕ be a 2-cocycle of X with coefficients in A. Fix a coloring C of a diagram D
by X. At each crossing consider the element of A given by

Figure 3: Cocycle relation at crossing

Define $B_\phi(C) = \sum_{\text{crossings}} \pm\,\phi(x, y)$. This is called the Boltzmann weight. The
sign is $+1$ for positive crossings and $-1$ for negative crossings. Then the
cocycle invariant of the knot K (with diagram D) is given by the formal sum
of Boltzmann weights:

$\psi_\phi(K) = \sum_{C} B_\phi(C)$, where C runs over all colorings.

3 Causality Using Alexander Quandles


Let's compute the colorings by the Alexander quandle on Z5 with t = 2 for the
connected sum of 2 Hopf links and the Allen Swenberg link. The operation is
X ▷ Y = 2X − Y = 2X + 4Y (mod 5).
If we take the affine Alexander quandle over Z5, then the cocycle is (x − y)^(5^r).
We first label all the crossings and arcs in the diagram of the connected sum of
2 Hopf links.

Figure 4: Connected sum of 2 Hopf links

Now, we can apply the Alexander quandle operation. The results are shown in
the table below.

Crossings Alexander Quandle
1 Y1 = Y1 ▷ Y3 = 2Y1 + 4Y3
2 Y2 = Y3 ▷ Y4 = 2Y3 + 4Y4
3 Y3 = Y2 ▷ Y1 = 2Y2 + 4Y1
4 Y4 = Y4 ▷ Y2 = 2Y4 + 4Y2

After solving the above system of equations modulo 5, we obtain the following relations:
Y4 = Y2
Y1 = Y2
Y3 = Y2
Therefore, Y1 = Y2 = Y3 = Y4
Since all the arcs must receive the same color, the number of colorings is equal to
the number of elements of Z5, which is 5.
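The count above can be reproduced by enumerating all 5^4 assignments of colors to the four arcs and keeping those that satisfy the four crossing relations in the table. This is a minimal sketch, not the computation used in the paper; the triple encoding `(c, a, b)` for the relation Y_c = Y_a ▷ Y_b and the function names are assumptions made for this example.

```python
from itertools import product


def alexander_op(x, y, t, n):
    """Alexander quandle operation x ▷ y = t*x + (1 - t)*y in Z_n."""
    return (t * x + (1 - t) * y) % n


# Each triple (c, a, b) encodes one row of the table: Y_c = Y_a ▷ Y_b.
HOPF_SUM_RELATIONS = [(1, 1, 3), (2, 3, 4), (3, 2, 1), (4, 4, 2)]


def colorings(relations, num_arcs, t, n):
    """Return every assignment of Z_n elements to the arcs that satisfies
    all crossing relations (exhaustive search, small cases only)."""
    found = []
    for colors in product(range(n), repeat=num_arcs):
        y = (None,) + colors  # shift so that y[1] is the color of arc Y1
        if all(
            y[c] == alexander_op(y[a], y[b], t, n) for c, a, b in relations
        ):
            found.append(colors)
    return found


solutions = colorings(HOPF_SUM_RELATIONS, num_arcs=4, t=2, n=5)
print(len(solutions))                               # prints 5
print(all(len(set(s)) == 1 for s in solutions))     # prints True
```

Every solution found assigns one color to all four arcs, which matches the relation Y1 = Y2 = Y3 = Y4 derived above.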
We can apply the same process to the Allen Swenberg link. First, let’s label all
the crossings and arcs.

Figure 5: Labeled Allen-Swenberg link

Crossings Alexander Quandle
1 Y45 = Y2 ▷ Y1 = 2Y2 + 4Y1
2 Y2 = Y3 ▷ Y4 = 2Y3 + 4Y4
3 Y38 = Y4 ▷ Y3 = 2Y4 + 4Y3
4 Y39 = Y38 ▷ Y4 = 2Y38 + 4Y4
5 Y4 = Y5 ▷ Y39 = 2Y5 + 4Y39
6 Y40 = Y39 ▷ Y5 = 2Y39 + 4Y5
7 Y5 = Y6 ▷ Y40 = 2Y6 + 4Y40
8 Y16 = Y40 ▷ Y6 = 2Y40 + 4Y6
9 Y6 = Y7 ▷ Y16 = 2Y7 + 4Y16
10 Y8 = Y17 ▷ Y7 = 2Y17 + 4Y7
11 Y18 = Y17 ▷ Y16 = 2Y17 + 4Y16
12 Y10 = Y11 ▷ Y16 = 2Y11 + 4Y16
13 Y12 = Y11 ▷ Y7 = 2Y11 + 4Y7
14 Y16 = Y15 ▷ Y10 = 2Y15 + 4Y10
15 Y7 = Y14 ▷ Y10 = 2Y14 + 4Y10
16 Y10 = Y9 ▷ Y12 = 2Y9 + 4Y12
17 Y41 = Y9 ▷ Y8 = 2Y9 + 4Y8
18 Y8 = Y15 ▷ Y18 = 2Y15 + 4Y18
19 Y12 = Y14 ▷ Y18 = 2Y14 + 4Y18
20 Y18 = Y13 ▷ Y12 = 2Y13 + 4Y12
21 Y19 = Y13 ▷ Y8 = 2Y13 + 4Y8
22 Y43 = Y24 ▷ Y30 = 2Y24 + 4Y30
23 Y29 = Y24 ▷ Y23 = 2Y24 + 4Y23
24 Y23 = Y25 ▷ Y29 = 2Y25 + 4Y29
25 Y30 = Y26 ▷ Y29 = 2Y26 + 4Y29
26 Y19 = Y20 ▷ Y30 = 2Y20 + 4Y30
27 Y21 = Y20 ▷ Y23 = 2Y20 + 4Y23
28 Y37 = Y25 ▷ Y21 = 2Y25 + 4Y21
29 Y27 = Y26 ▷ Y21 = 2Y26 + 4Y21
30 Y23 = Y22 ▷ Y37 = 2Y22 + 4Y37
31 Y21 = Y22 ▷ Y27 = 2Y22 + 4Y27
32 Y30 = Y28 ▷ Y37 = 2Y28 + 4Y37
33 Y29 = Y28 ▷ Y27 = 2Y28 + 4Y27
34 Y36 = Y37 ▷ Y27 = 2Y37 + 4Y27
35 Y27 = Y31 ▷ Y36 = 2Y31 + 4Y36
36 Y35 = Y36 ▷ Y31 = 2Y36 + 4Y31
37 Y31 = Y32 ▷ Y35 = 2Y32 + 4Y35
38 Y34 = Y35 ▷ Y32 = 2Y35 + 4Y32
39 Y32 = Y33 ▷ Y34 = 2Y33 + 4Y34
40 Y34 = Y33 ▷ Y3 = 2Y33 + 4Y3
41 Y1 = Y3 ▷ Y33 = 2Y3 + 4Y33
42 Y44 = Y42 ▷ Y41 = 2Y42 + 4Y41
43 Y41 = Y43 ▷ Y42 = 2Y43 + 4Y42
44 Y42 = Y44 ▷ Y1 = 2Y44 + 4Y1
45 Y1 = Y45 ▷ Y42 = 2Y45 + 4Y42

To calculate the number of solutions, I entered the system of equations into
Wolfram Mathematica and obtained the following:

Figure 6: Solution provided by Wolfram Mathematica

The solution shows that all the variables are equal to each other; therefore, the
number of colorings is equal to the number of elements of Z5, which is 5. Since the
coloring invariant of the connected sum of 2 Hopf links is equal to the coloring
invariant of the Allen Swenberg link, this invariant does not distinguish between
the two links.
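For a system as large as the 45 relations above, exhaustive search is impractical, but when n = p is prime the relations form a homogeneous linear system over the field Z_p, and the number of colorings is p raised to the dimension of its solution space. The sketch below is an alternative to the Mathematica computation used in the paper, not a reproduction of it. The function names and the triple encoding `(c, a, b)` for Y_c = Y_a ▷ Y_b are assumptions; it is demonstrated on the four Hopf-sum relations, and the larger table could be encoded the same way.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over Z_p (p prime), by Gaussian elimination."""
    rows = [[v % p for v in row] for row in rows]
    num_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(num_cols):
        pivot = next(
            (i for i in range(rank, len(rows)) if rows[i][col]), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [
                    (a - factor * b) % p for a, b in zip(rows[i], rows[rank])
                ]
        rank += 1
    return rank


def count_colorings(relations, num_arcs, t, p):
    """Number of solutions of the system t*Y_a + (1 - t)*Y_b - Y_c = 0 mod p."""
    matrix = []
    for c, a, b in relations:
        row = [0] * num_arcs
        row[a - 1] += t
        row[b - 1] += 1 - t
        row[c - 1] -= 1
        matrix.append(row)
    return p ** (num_arcs - rank_mod_p(matrix, p))


# Each triple (c, a, b) encodes Y_c = Y_a ▷ Y_b (connected sum of 2 Hopf links).
HOPF_SUM_RELATIONS = [(1, 1, 3), (2, 3, 4), (3, 2, 1), (4, 4, 2)]
print(count_colorings(HOPF_SUM_RELATIONS, num_arcs=4, t=2, p=5))  # prints 5
```

A result equal to p means the solution space is one-dimensional, that is, only the monochromatic colorings exist.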

4 Using Other Values For t and n


I ran the same computations using different values for t and n and got the same
number of solutions for both the connected sum of 2 Hopf links and the Allen
Swenberg link.

t = 3, n = 5 → 5 monochromatic solutions
t = 4, n = 5 → 5 monochromatic solutions
t = 2, n = 7 → 7 monochromatic solutions
t = 3, n = 7 → 7 monochromatic solutions
t = 4, n = 7 → 7 monochromatic solutions
t = 5, n = 7 → 7 monochromatic solutions

In these cases, the number of solutions is equal to n, the size of Zn, no
matter the value of t. This is because only the trivial (monochromatic) colorings
occur: assigning the same element to every arc always satisfies the crossing
relations, since x ▷ x = x, so there are exactly n colorings, one for each
element of Zn.

5 Conclusion
The numbers of quandle colorings of the connected sum of 2 Hopf
links and of the Allen Swenberg link are the same for the tested values of n and t. The
results suggest that the Alexander quandle paired with the Alexander-Conway
Polynomial does not contain enough information to detect causality. For the
affine Alexander quandle over Z5, the cocycle would be (x − y)^(5^r). However,
it would not help, since according to my computations every coloring is
monochromatic, and the cocycle vanishes when x = y.

6 Acknowledgement
This research was conducted under the supervision of Professor Vladimir Cher-
nov of Dartmouth College and Professor Emanuele Zappala of Yale University
through the Horizon Academic Research Program in the summer of 2023. I
give thanks to Professors Vladimir Chernov and Emanuele Zappala for giving
me this opportunity and for their support as my mentors.

References
[Ame07] Kheira Ameur. Polynomial Quandle Cocycles, Their Knot Invariants
and Applications. PhD thesis, University of South Florida, 2007.

[AS21] Samantha Allen and Jacob H. Swenberg. Do link polynomials detect
causality in globally hyperbolic spacetime? Journal of Mathematical
Physics, 62(3):032503, 2021.

[CN10] Vladimir Chernov and Stefan Nemirovski. Legendrian links, causality,
and the Low conjecture. Geometric and Functional Analysis,
19(5):1320–1333, 2010.

[Cro04] Peter Cromwell. Knots and Links. Cambridge University Press, 2004.

[Joy82] David Joyce. A classifying invariant of knots, the knot quandle.
Journal of Pure and Applied Algebra, 23(1):37–65, 1982.

[Mat84] S. V. Matveev. Distributive groupoids in knot theory. Mathematics of the
USSR-Sbornik, 47(1):73–83, 1984.

[Nel] Sam Nelson. Quandles and racks.
https://www1.cmc.edu/pages/faculty/VNelson/quandles.html.

[VC20] Vladimir Chernov, Gage Martin, and Ina Petkova. Khovanov homology and
causality in spacetime. Journal of Mathematical Physics, 61(2):022503,
2020.

Community Detection in Dynamic Face-to-Face Interaction
Networks: A Louvain Algorithm Approach

Iroda Ibrohimova
October 13, 2023

Abstract
In this paper, we present a study that evaluates the suitability of the Louvain Algorithm in
the context of face-to-face interaction networks. Traditional community detection methods face
challenges in this context, necessitating specialized solutions. Our research addresses this gap,
offering a systematic approach that aggregates individual game data and applies the Louvain
Algorithm. The results show that the algorithm consistently identifies the
original 34 communities, supporting its relevance to face-to-face interaction networks.

1 Introduction
Social networks consist of interconnected individuals or entities characterized by relationships,
interactions, and information flow. They capture key webs of human interaction and help us understand
the dynamics of these interactions [New10]. These qualities make them promising ground for applying
machine learning techniques to discover patterns and structures within communities [WF94]. With
its capacity to sift through vast datasets, machine learning offers a powerful tool to dissect and
comprehend the complexities of social networks, enabling insights into human behavior, influence
propagation, and community formation [Lea09], [CGP12].
Community detection is a widely employed technique in graph analysis aimed at partitioning
vertices within a graph into coherent "communities" based on their relatedness. It serves as a
valuable tool in various scientific and industrial fields, including biology, social networks, finance,
and literature analysis, aiding in discovering meaningful structural patterns. Comprehensive reviews
of the different formulations, methods, and applications of community detection can be found in
the works of Coscia, Giannotti, and Pedreschi [CGP12] and Fortunato [For10]; see also [KN11]. Various
measures have been proposed to evaluate the goodness of the partitioning produced by a community
detection method [KS88], [NG04].
Among the methods used, modularity stands out due to its widespread application. Introduced
by Newman [NG04], modularity quantifies the quality of community assignments by examining the
proportion of edges within communities. However, modularity has constraints, including a resolution
limit [FB07]. Nonetheless, it remains a popular choice for practitioners, and resolution-limit-free
variations have been suggested.
Numerous efficient heuristics have been developed over the years, making the analysis of large-
scale networks feasible in practice. The Louvain method, a highly efficient heuristic proposed by
Blondel et al. [BGLL08], has gained prominence for its speed and the quality of results it provides
in practice.

∗ Advised by: Dr Maria Konte

Despite the widespread use of community detection methods, applying them to face-to-face
interaction networks remains challenging. These networks, characterized by limited data and direct,
physical interactions, differ significantly from virtual networks. This uniqueness necessitates tailored
approaches. In the context of face-to-face interaction networks, the Louvain algorithm can be an
effective method for community detection.
The objective of this study is to comprehensively assess the suitability of the Louvain Algorithm
in the context of face-to-face interaction networks. These networks present distinctive challenges
involving sparse data due to limited observation opportunities. Traditional community detection
methods may not readily adapt to these conditions, necessitating specialized solutions.
This research addresses the need for effective community detection in face-to-face networks.
The complexity arises from the need to extract meaningful patterns from a web of brief, physical
interactions, a task conventional methods struggle with. Furthermore, the urgency of solving this
problem stems from the increasing interest in comprehending real-world, physical interactions across
diverse contexts, from workplaces to social gatherings.
To address the challenges described above, we have developed a systematic approach. We aggre-
gated individual game data and applied the Louvain Algorithm to detect communities within each
game. Subsequently, we used the results to assess whether the algorithm consistently identifies the
original 34 communities. This process serves as a rigorous evaluation of the algorithm’s effectiveness
and underscores its relevance within face-to-face interaction networks.
Contributions: we make the following contributions in this paper: 1. Enhanced Resolution
Parameter: Our study introduces a refined resolution parameter setting (8) specifically tailored
to the dynamic nature of face-to-face interaction networks. 2. Methodological Framework: The
research establishes a comprehensive methodological framework for analyzing face-to-face interaction
networks. 3. Specialized Application of the Louvain Algorithm: This research employs the tailored
application of the Louvain Algorithm to face-to-face interaction networks, a domain with distinct
observational constraints and network characteristics.

2 Related Works
There have been notable efforts to address community detection within social networks, with a
focus on various methodologies [BBPSV18], [PLR19], [PBM20], [MMP19], [BCT+ 14]. For instance,
Barrat et al. studied temporal multilayer networks, shedding light on the social structure of face-
to-face interaction [BBPSV18]. Their work significantly advances our understanding of dynamic
network interactions, primarily revolving around temporal aspects. Similarly, Peralta, Loslever, and
Ramos contributed substantially to the field by examining community detection in social networks
and provided critical insights into the intricate interplay of social dynamics, with a focus on broader
virtual networks [PLR19].
In face-to-face proximity networks, Puglisi, Bullo, and Mantzaris [PBM20] delved deeper into the
application of stochastic block models and community detection, providing a significant step toward
understanding proximity-based social interactions. Their approach centers on a specific modeling
framework. Conducting a data-driven study on community detection in these networks, Morone et al.
[MMP19] offer valuable insights into the underlying structures, primarily focusing on the structural
aspects. Additionally, Barrat and colleagues [BCT+ 14] lay a significant groundwork by exploring
community detection within temporal multilayer networks. This work provides a foundation for
understanding social structures in face-to-face interactions, particularly temporal aspects.

In comparison, our study uniquely addresses the interplay of face-to-face interaction networks.
By employing a systematic approach that encompasses data aggregation, algorithm application, and
parameter optimization, we present a comprehensive framework tailored to this distinct domain.

3 Data Preparation
3.1 Dataset Description
The dataset used in this research paper was obtained from a series of face-to-face interaction games
called Resistance, conducted as part of the research study [Lesnd]. Face-to-face interactions were
extracted from videos of participants playing the game. Dynamically evolving networks were ex-
tracted from the free-form discussions using the ICAF algorithm. The extraction algorithm is a
collective classification algorithm that leverages computer vision techniques for eye gaze and head
pose extraction.
Each game had 5–9 participants and lasted 45-60 minutes. Each participant was part of exactly
one game. In total, the dataset had 34 games and 232 participants [KBSL21].

Figure 1: Given a group video conversation (left), we extracted face-to-face dynamic interaction
networks (right), representing the instantaneous interactions between participants. Participants are
nodes, and interactions are edges in the network.

The dataset is in CSV format, with one file for each game.
The networks are weighted, directed, and temporal. Each file contains columns
with timestamp information. The behavior was recorded every 1/3 second, leading
to a total of 9650 intervals of 1/3 second. In addition, there are 65 rows in each file. Every third of a second, an
edge is drawn from one person (node 1) to another (node 2). The weight of this edge is determined
by how likely it is that person '1' is looking at person '2' or at the laptop.

3.2 Data Cleaning


The data was available in a zip file containing binary and weighted data. The latter type of data
was most suitable, as it captures the strength and intensity of communication, which in our case is
the likelihood of looking. Therefore, all files with binary data were removed from the working set.
The dataset contained no missing values, which supports the validity of the recorded data and
simplified the subsequent analytical procedures.

Figure 2: In the figure, the main elements and variables of the dataset can be observed (TIME,
P1 TO P2 , etc.)

3.3 Data Processing


Several steps were taken to test whether the Louvain Algorithm is adequate for identifying
communities within dynamic face-to-face interaction networks, and the code was adapted to address
them. First, the data is presented in spreadsheet format; in each file, participants are always numbered
in order starting from 1 (for example, 1 to 7), even though each participant takes part in only one game. Because
such repeated labels would cause participants from different games to be treated as the same node,
the code was adjusted to give every participant a unique label across games. This was done by
keeping track of the participant numbers and incrementing them for each new group of participants.
To illustrate, in the first file, participants were labeled from 1 to 7, while in the second file (2nd
game), participants were labeled from 8 to 15, and this chain continued until it reached the last,
232nd participant.
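The relabeling step described above can be sketched as a running offset that grows by the size of each game. The function name `relabel_games` and the example game sizes are hypothetical and chosen only to reproduce the 1 to 7 and 8 to 15 ranges mentioned in the text.

```python
def relabel_games(game_sizes):
    """Assign globally unique participant IDs, game by game.

    game_sizes: participant counts per game, in file order.
    Returns one dict per game mapping local label (1..k) -> global ID.
    """
    mappings = []
    next_id = 1
    for size in game_sizes:
        mapping = {local: next_id + local - 1 for local in range(1, size + 1)}
        mappings.append(mapping)
        next_id += size  # the next game starts after the last ID used
    return mappings


# Hypothetical sizes: a 7-player game followed by an 8-player game.
maps = relabel_games([7, 8, 5])
print(maps[0])  # local 1..7 keep the labels 1..7
print(maps[1])  # local 1..8 become 8..15
```

With unique labels, participants from different games can no longer collide on the same node name when all games are placed in one graph.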

4 Methodology
4.1 Network Construction
Participants and their interactions were represented as nodes and edges in constructing the network.
Each participant and the laptop were represented as nodes. Nodes were labeled according to the
order number of the participant. Edges were defined by the calculated probabilities of participants looking
at each other.
This approach relied on face-to-face interactions, with visual attention being an essential aspect
of engagement. To define edges, the weighted interaction values were considered: if the
interaction value between two nodes equaled 0, no connection (edge) was created. In contrast, with higher values, the
distance between nodes decreased, placing participants closer to each other and forming clusters.
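A minimal sketch of this edge construction follows. Several details are assumptions rather than facts from the paper: the column naming pattern (`TIME`, `P<i>_TO_P<j>`, `P<i>_TO_LAPTOP`), the choice to aggregate each column by averaging over time, and the helper name `build_edges`. Only the rule that a zero interaction value produces no edge is taken from the text.

```python
import csv
import io
from collections import defaultdict


def build_edges(csv_text, offset=0):
    """Average each gaze-probability column over time and return weighted
    directed edges as {(source, target): weight}.

    offset shifts local participant numbers to globally unique labels.
    """
    totals = defaultdict(float)
    num_rows = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        num_rows += 1
        for column, value in row.items():
            if column != "TIME":
                totals[column] += float(value)

    edges = {}
    for column, total in totals.items():
        source, target = column.split("_TO_")
        weight = total / num_rows
        if weight == 0:
            continue  # zero interaction value: no edge is created
        u = int(source[1:]) + offset
        v = "LAPTOP" if target == "LAPTOP" else int(target[1:]) + offset
        edges[(u, v)] = weight
    return edges


# Hypothetical two-row sample in the assumed column layout.
sample = (
    "TIME,P1_TO_P2,P1_TO_LAPTOP,P2_TO_P1\n"
    "0.00,0.8,0.2,0.0\n"
    "0.33,0.6,0.4,0.0\n"
)
edges = build_edges(sample, offset=7)
print(sorted(edges, key=str))
```

In this sample, the column from participant 2 to participant 1 is zero throughout, so no edge is created in that direction.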

4.2 Graph Analysis and Visualization
NetworkX and Matplotlib were used for analyzing and visualizing these dynamic networks.
NetworkX is a Python library designed to create, manipulate, and study complex networks.
In our study, it served as a foundational tool for understanding the web of interactions
among participants. It allowed us to represent participants as nodes
and the connections between them as edges. This facilitated a comprehensive analysis of the
dynamic face-to-face interaction networks [HSS08].
Matplotlib is a versatile data visualization library in Python. It allowed us to translate raw
data into visual representations. Our research used Matplotlib to render the network
graphs derived from NetworkX. This enabled us to visually explore and communicate the patterns
and dynamics inherent in the interactions, offering a clear and interpretable representation of the
complex dataset [Hun07].

4.3 The Louvain Community Detection Algorithm


The algorithm is divided into two phases that are repeated iteratively. In the first phase, each
node is initially assigned to its own community, resulting in as many communities as there are nodes
in the network. Then, for each node, the algorithm considers its neighbors and evaluates the gain
in modularity that would occur by removing the node from its community and placing it in the
community of a neighbor. The node is then placed in the community that yields the maximum
gain in modularity, but only if this gain is positive. The node remains in its original community
if no positive gain is possible. This process is repeated iteratively for all nodes until no further
improvement in modularity can be achieved [BGLL08].
The modularity formula is expressed as:

\[
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
\]

where:

Q is the modularity,
m is the total number of edges in the network (for a weighted network, the sum of all edge weights),
A_ij is the weight of the edge between nodes i and j,
k_i and k_j are the degrees of nodes i and j, respectively (for a weighted network, the sums of the weights of their incident edges),
c_i and c_j are the communities to which nodes i and j belong,
δ(c_i, c_j) is the Kronecker delta function, which is equal to 1 if c_i = c_j and 0 otherwise.

In simpler terms, the modularity formula measures the difference between the actual number of
edges within communities and the expected number of edges if the edges were distributed randomly.
A higher modularity value indicates a better partition of the network into communities. The
modularity formula is a critical component of the Louvain method, as it is used to evaluate the
quality of the communities detected in every iteration of the algorithm. [BGLL08]
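To make the formula concrete, the sketch below evaluates it term by term on a toy graph of two triangles joined by a single edge. It assumes an undirected weighted graph stored as a symmetric dictionary of dictionaries; the helper names and the toy graph are illustrative and are not taken from the dataset.

```python
def undirected_adjacency(edges):
    """Build a symmetric dict-of-dicts from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def modularity(adj, communities):
    """Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j)."""
    degree = {i: sum(adj[i].values()) for i in adj}  # weighted degrees k_i
    two_m = sum(degree.values())                     # 2m
    q = 0.0
    for i in adj:
        for j in adj:
            if communities[i] == communities[j]:     # Kronecker delta
                q += adj[i].get(j, 0.0) - degree[i] * degree[j] / two_m
    return q / two_m


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
EDGES = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 1.0)]
adj = undirected_adjacency(EDGES)

two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
everything = {n: "a" for n in adj}
print(modularity(adj, two_triangles))  # 5/14, about 0.357
print(modularity(adj, everything))     # 0 up to rounding
```

Splitting the graph into its two triangles scores higher than keeping all nodes in one community, which agrees with the interpretation that a higher modularity indicates a better partition.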
The second phase of the Louvain method involves building a new network whose nodes are the
communities found during the first phase. To do this, the weights of the links between the new
nodes are given by the sum of the weights of the links between nodes in the corresponding two
communities. Links between nodes of the same community lead to self-loops for this community in
the new network. Once this second phase is completed, the first phase of the algorithm is reapplied
to the resulting weighted network and iterated. A combination of these two phases is denoted as
a 'pass.' By construction, the number of meta-communities decreases at each pass, resulting in a
hierarchy of communities [BGLL08].

Figure 3: The illustration shows an overview of the stages involved in the Louvain Community
Detection Algorithm.
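The aggregation step of the second phase can be sketched as follows. The function name `aggregate`, the edge dictionary layout, and the toy graph are assumptions made for illustration; the sketch treats the network as undirected and only shows how weights are summed between communities and how intra-community links become self-loops.

```python
from collections import defaultdict


def aggregate(edges, communities):
    """Collapse each community into a single node.

    edges: {(u, v): weight}, each undirected edge listed once.
    communities: {node: integer community label}.
    Returns {(cu, cv): weight} with cu <= cv; a key (c, c) is a self-loop
    holding the total weight of the links inside community c.
    """
    merged = defaultdict(float)
    for (u, v), w in edges.items():
        cu, cv = communities[u], communities[v]
        key = (cu, cv) if cu <= cv else (cv, cu)
        merged[key] += w
    return dict(merged)


# Two triangles joined by the edge 2-3, already split into two communities.
edges = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
         (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0, (2, 3): 1.0}
communities = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(aggregate(edges, communities))
# prints {(0, 0): 3.0, (0, 1): 1.0, (1, 1): 3.0}
```

The first phase would then be rerun on this smaller network, which is what produces the hierarchy of communities over successive passes.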
Several advantages of the Louvain Algorithm fit the context of our research paper very well. First,
in our dataset, edges represent the strength of interactions, and the Louvain Algorithm
can efficiently identify communities in networks with weighted connections [BGLL08].
Second, the Louvain Algorithm is adept at handling interaction data involving clear directions,
ensuring that it accurately captures the interaction flow [BGLL08]. Finally, the Louvain Algorithm
performs better than many other algorithms in terms of modularity, which is crucial in identifying
communities [CGar].

4.4 Resolution Adjustment and Refinement


Modularity optimization has a major disadvantage when it comes to identifying communities smaller
than a particular scale [FB07]. When the algorithm encounters them, it can add these small groups
to bigger communities, so the number of communities can be inaccurate. The resolution plays a
crucial role in this process. It determines the level of granularity at which communities are discerned.
A lower resolution value (e.g., 0.1) tends to identify larger, more encompassing communities. This
implies that nodes with weaker connections may also be aggregated together. Conversely, a higher
resolution value (e.g., 1.0) leads to the detection of smaller, more closely-knit communities, where
nodes with stronger connections are more likely to be grouped.
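The effect of the resolution value can be illustrated with a commonly used generalization of modularity in which the expected-edges term is multiplied by the resolution. Whether the implementation used in this study follows exactly this form is an assumption. The sketch uses the equivalent per-community form of the formula and compares three fixed partitions of a toy graph instead of running the full Louvain search; all names and the graph are hypothetical.

```python
def modularity_with_resolution(edges, communities, resolution=1.0):
    """Per-community form: Q = sum_c [ L_c/m - resolution * (d_c/(2m))^2 ],
    where L_c is the weight inside community c and d_c its total degree."""
    m = sum(w for _, _, w in edges)
    internal = {}
    degree_sum = {}
    for u, v, w in edges:
        cu, cv = communities[u], communities[v]
        degree_sum[cu] = degree_sum.get(cu, 0.0) + w
        degree_sum[cv] = degree_sum.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / m - resolution * (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
EDGES = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 1.0)]
PARTITIONS = {
    "one community": {n: 0 for n in range(6)},
    "two triangles": {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1},
    "singletons": {n: n for n in range(6)},
}


def best_partition(resolution):
    """Name of the candidate partition with the highest score."""
    return max(
        PARTITIONS,
        key=lambda name: modularity_with_resolution(
            EDGES, PARTITIONS[name], resolution
        ),
    )


for resolution in (0.1, 1.0, 8.0):
    print(resolution, best_partition(resolution))
```

On this toy graph the preferred partition moves from a single large community at resolution 0.1, to the two triangles at 1.0, to isolated nodes at 8, which matches the direction described above: larger values favor smaller communities.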
As our dataset contained very unstable data points, in which unusually low values were also
present, it was important to choose the relevant resolution. We systematically varied the resolu-
tion value through iterative experimentation to ascertain its impact on community detection within
the dynamic face-to-face interaction networks. Notably, an enhanced resolution value of 8 yielded
markedly improved outcomes: this refined resolution allowed for a more granular delineation of
communities, capturing nuanced interaction patterns previously obscured. The adjustments facili-
tated the identification of more distinct groups and provided a deeper understanding of participant
affiliations.

4.5 Setting up the process


In order to verify the validity and relevance of the Louvain Algorithm, it was applied to our dataset
collected from a face-to-face interaction network.
In the context of the paper, each participant was analogous to a vertex in the algorithm. For
every participant u, the algorithm scrutinized each connected participant v. It assessed the potential
gain in modularity (∆Q) if u were to shift from its current community c to v's community c′; the
participant u was then placed in the community ĉ that offered the maximum ∆Q. If no positive gain in
modularity was observed, u stayed within its original community (Lines 7–15, Figure 4). This process
continued until no further enhancement was possible, resulting in the determination of community
information (C ′ ) and modularity Q (Lines 19–22, Figure 4).
The second phase involved constructing a fresh graph wherein the vertices represent the commu-
nities identified in the prior phase. The new vertex set V’ comprised the latest communities, and the
edge weights between these new vertices were determined by aggregating the weights of the edges
between participants in the respective communities.
In our case, edges linking participants within the same community resulted in loops for that
community within the new graph (Lines 24–26, Figure 4). This iterative process was repeated until
the communities stabilized. [BGLL08]

4.6 Visualizing
After applying the Louvain Algorithm, we visualized and analyzed the identified communities. This
involved generating a graphical representation of the network. Each node (participant) was dis-
played, with edges (interactions) connecting them. This visual layout provided an intuitive overview
of how participants interact. Similarly, nodes were assigned distinct colors based on the communities
they belong to. This visual cue helps distinguish different groups of participants who exhibit similar
interaction patterns.

4.7 Clustering
In these dynamic networks, participants’ interactions are complex and nuanced. The Louvain Al-
gorithm identifies communities based on how participants interact more frequently with each other
compared to those outside their communities. However, this raw community information can still
be quite intricate, especially given the nature of our data.
Clustering takes the detected communities a step further by grouping participants who exhibit
similar interaction behavior into distinct and coherent clusters. This process provides a more precise,
more intuitive representation. By employing clustering, we effectively organized participants into
manageable groups, each characterized by shared interaction patterns.

5 Results
The Louvain Algorithm for Community Detection was applied to the dataset obtained from the
game that involved face-to-face interactions. The binary data files were removed from the dataset,
participants were labeled in a continuous order, and the resolution parameter was adjusted
to yield accurate results.
The application of the Louvain Algorithm revealed a total of 34 (from 0 to 33) distinct commu-
nities within the dynamic face-to-face interaction networks. These communities exhibited varying
sizes, ranging from 5 to 9 participants.

Community Order Number Participant Number


0 [1, 2, 3, 5, 6, 7, 4]
1 [LAPTOP]
2 [176, 177, 179, 180, 181, 182, 178]
3 [183, 184, 185, 186, 187, 188, 189]
4 [190, 191, 192, 193, 194, 195, 196, 197, 201]
5 [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
6 [198, 199, 200, 202, 203, 204]
7 [71, 72, 73, 74, 75, 76, 77]
8 [22, 23, 24, 25, 26, 27, 28]
9 [57, 58, 59, 60, 61, 62]
10 [170, 171, 172, 173]
11 [163, 164, 165, 166, 167, 168, 169]
12 [156, 157, 158, 159, 160, 161, 162]
13 [148, 149, 150, 151, 152, 153, 154, 155]
14 [142, 143, 144, 146]
15 [134, 135, 136, 137, 138, 139, 140, 141, 145]
16 [233, 234, 235, 236, 237, 238, 239]
17 [127, 128, 129, 130, 131, 132, 133]
18 [120, 121, 122, 123, 124, 125]
19 [114, 115, 116, 118, 119]
20 [108, 109, 110, 111, 112, 113]
21 [99, 100, 101, 102, 103, 104, 105, 106, 107]
22 [92, 93, 94, 95, 96, 97, 98]
23 [226, 227, 228, 229, 230, 231, 232]
24 [85, 86, 87, 88, 89, 90, 91]
25 [78, 79, 80, 81, 82]
26 [64, 65, 66, 67, 68, 69, 70]
27 [219, 220, 221, 222, 223, 224, 225]
28 [51, 52, 53, 54, 55]
29 [211, 212, 213, 214, 215, 216, 217, 218]
30 [43, 44, 45, 46, 47, 48, 49, 50]
31 [37, 38, 39, 40, 41, 42]
32 [205, 206, 207, 208, 209, 210]
33 [29, 30, 31, 32, 33, 34, 35, 36]

Table 1: Community Order Numbers and Participant Numbers

6 Conclusion
After close examination, it was observed that communities exhibited different interaction patterns.
Some communities demonstrated a higher density of interactions, indicating a stronger cohesion
among members. In contrast, others exhibited a pattern characterized by irregular exchanges,
indicative of a more diffuse and loosely knit network structure.
Also, it should be noted that resolution 8 in the Louvain Community Detection Algorithm was
the most suitable resolution level for our data.
As the results confirmed that the Louvain Algorithm, given the right resolution, can accurately
detect communities in face-to-face interaction networks, we can say that this study provides a pivotal
step toward advancing our understanding of dynamic face-to-face interaction network communities.

6.1 Future Prospect


Looking forward, our research can lay the foundation for a number of potential future investigations.
Analyzing communities within these networks could reveal deeper insights into the dynamic nature
of human interactions in the platforms with both face-to-face and virtual communication elements.
A good example of further research could be studying the evolution of the communities based on
their participants’ face-to-face interaction in online settings such as games. Furthermore, the scope
of the study can be extended to incorporate multimodal data, such as audio and non-verbal cues, so
that a more holistic view of this type of interaction can be achieved. For instance, integrating speech
analysis and body language recognition could provide additional insights into community formation
patterns.
However, it should be noted that the study has limitations, namely the dataset's size
and diversity. Although the insights gained from the provided data are substantial, a more extensive
dataset covering various contexts could offer a more comprehensive account of community detection
in face-to-face interactions.
Despite its current limitations, we believe that our study not only evaluates whether the Louvain
Algorithm is appropriate for detecting communities in face-to-face communication networks but also
contributes to the field by paving the way for further research on a wider scope.

References
[BBPSV18] A. Barrat, M. Barthelemy, R. Pastor-Satorras, and A. Vespignani. Community de-
tection in temporal multilayer networks, revealing the social structure of face-to-face
interaction. In Multilayer Networks, 2018.

[BCT+ 14] A. Barrat, C. Cattuto, A. E. Tozzi, P. Vanhems, and N. Voirin. High-resolution tem-
poral networks of face-to-face human interactions. In Advances in Neural Information
Processing Systems, pages 2268–2276, 2014.

[BGLL08] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre.
Fast unfolding of communities in large networks. Journal of statistical mechanics: theory
and experiment, (10):P10008, 2008.

[CGP12] Michele Coscia, Fosca Giannotti, and Dino Pedreschi. A classification for community
discovery methods in complex networks. Statistical Analysis and Data Mining: The
ASA Data Science Journal, 4(5):512–546, 2012.

[CGar] P. Chejara and W. W. Godfrey. Comparative analysis of community detection algo-
rithms. Technical report, Department of ICT, ABV-Indian Institute of Information
Technology and Management Gwalior, India, Year.

[FB07] Santo Fortunato and Marc Barthelemy. Resolution limit in community detection. Pro-
ceedings of the National Academy of Sciences, 104(1):36–41, 2007.

[For10] Santo Fortunato. Community detection in graphs. Physics Reports, 486(3):75–174,
2010.

[HSS08] A. A. Hagberg, D. A. Schult, and P. J. Swart. Exploring network structure, dynamics,
and function using NetworkX. In Proceedings of the 7th Python in Science Conference
(SciPy2008), pages 11–15, Pasadena, CA, USA, 2008. NASA Ames Research Center.
[Hun07] J. D. Hunter. Matplotlib: A 2d graphics environment. Computing in Science & Engi-
neering, 9(3):90–95, 2007.
[KBSL21] S. Kumar, C. Bai, V. S. Subrahmanian, and J. Leskovec. Deception detection in group
video conversations using dynamic interaction networks. In Proceedings of the 15th
International AAAI Conference on Web and Social Media (ICWSM), 2021.

[KN11] Brian Karrer and Mark EJ Newman. Stochastic blockmodels and community structure
in networks. Physical Review E, 83(1):016107, 2011.

[KS88] David Krackhardt and Robert N Stern. Informal networks and organizational crises:
An experimental simulation. Social Psychology Quarterly, pages 123–140, 1988.
[Lea09] David Lazer et al. Computational social science. Science, 323(5915):721–723, 2009.

[Lesnd] Jure Leskovec. Dynamic face-to-face interaction networks [dataset], n.d. SNAP:
Stanford Network Analysis Project. Retrieved August 2, 2023, from
http://snap.stanford.edu/data/comm-f2f-Resistance.html.

[MMP19] F. Morone, H. A. Makse, and M. T. Pisabarro. Community detection in face-to-face
proximity networks: A data-driven study. Journal of Statistical Physics, 175(6):1270–
1285, 2019.

[New10] Mark Newman. Networks: An Introduction. Oxford University Press, 2010.

[NG04] Mark EJ Newman and Michelle Girvan. Finding and evaluating community structure
in networks. Physical Review E, 69(2):026113, 2004.

[PBM20] S. Puglisi, F. R. Bullo, and A. V. Mantzaris. Stochastic block models and community
detection in face-to-face proximity networks. Applied Network Science, 5(1):1–24, 2020.

[PLR19] D. N. Peralta, E. S. Loslever, and A. Y. Ramos. Community detection in social networks.
Applied Network Science, 4(1):1–21, 2019.
[WF94] Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Ap-
plications. Cambridge University Press, 1994.

Figure 4: The Sequential (standard) Louvain Algorithm code template.

Figure 5: Visualization obtained right after applying the code and before clustering the data.
Each color denotes a particular community.

Figure 6: Final graph showing all 34 communities (as they were initially grouped). This
visualization was obtained after applying clustering to the data points.

Self-Supervised Dementia Prediction From MRI
Scans With Metadata Integration

Zile Huang
October 13, 2023

Abstract
We introduce metadata integration in the training process for demen-
tia diagnoses as weak label information using Weakly-Supervised Mod-
ified Knowledge Distillation with No Labels (WS-MDINO). Using WS-
MDINO, we fine-tuned the parameters of the original vision transformer
pre-trained with DINO on ImageNet. Our model achieved performance
equivalent to the state of the art, with 92% accuracy on the OASIS1
dataset under leave-one-out cross-validation. We visualized the performance
of the model by extracting average self-attention maps and average
brains from the dataset, showing that the model had learned meaningful
structural information about demented brains.

1 Introduction
Alzheimer’s Disease (AD) is a leading cause of dementia, affecting millions
worldwide. Even to date, it has no proper medical treatment and can only
be controlled with continuous medication [KMS+ 22]. Early diagnoses and early
intervention are beneficial for both the patients and caretakers, for the treat-
ment would be most effective and less costly [RL19]. An automated model
would aid the early detection of dementia immensely as it provides a fast,
cheap, and accurate reference for the diagnosing process. Past works have used
MRI scans of patients’ brains to develop image recognition models for AD diag-
noses [SN18, SMP+ 21, FDH+ 19, CGAA22, AR14, SJS+ 23, IZ18]. However, past
models have faced challenges such as poor interpretability, which is a symptom
of most deep learning and CNN architectures, and non-optimal integration of
clinically free metadata [SN18]. Many previous works failed to perform cross
validation because it is too computationally expensive (for each training split
the model needed to be re-trained completely) [FDH+ 19,IZ18]. To address these
limitations, we developed a model with a self-supervised method which can in-
corporate the metadata as weak labels [CZWM+ 22] with a vision transformer
(ViT) backbone [DBK+ 20].
∗ Advised by: Jan Cross-Zamirski

While many previous works used the Convolutional Neural Network (CNN),
we use a small ViT with 8 × 8 pixel patch size (ViT-S/8) introduced by Dosovit-
skiy et al. [DBK+ 20]. Compared to traditional CNN models, ViTs on medical
datasets have been shown to capture long-range relationships in the image,
provide built-in insight into the performance of the model with self-attention
maps, and provide superior adaptive-learning with the self-attention mecha-
nism [MHSS21].
Even though ViTs require a significantly larger dataset than CNNs to achieve
these qualities [MHSS21], researchers can perform transfer learning from the
pre-trained weights on ImageNet [DDS+ 09], which consists of millions of la-
beled images. Past work on automatic AD diagnoses using a ViT achieved an
overall accuracy of 83.27%, with 85.07% specificity and 81.48% sensitivity on
the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset [HKK23].
While the typical training methods for both CNNs and ViTs are supervised,
this paper uses an unsupervised training approach, WS-MDINO, a modified
version of the DINO [CTM+ 21] training method that integrates the metadata
of the subjects into the training process. We trained a multi-layer perceptron classifier
and a K-nearest-neighbor (KNN) classifier using the features extracted from the
ViT to produce the final prediction.
Compared to other models, our ViT trained with the WS-MDINO method:
• Is a multi-modal model that integrates the clinically available metadata of
the patients into the training process, achieving better overall performance
• Provides less noisy self-attention maps for the image data than supervised
ViTs [CTM+ 21]
• Allows more complicated validation methods, such as K-fold and leave-
one-out validation, without extra time and computing resources, because
the training process of the feature extractor is not supervised, and, thus,
does not require a train-test split.

2 Background
2.1 AD classification
There are many past efforts to use machine learning to diagnose AD in early
stages based on MRI scans. An early work in automatic diagnoses used Struc-
ture Tensor Analysis to extract features from the MRI scan and used Support
Vector Machine to classify stages of dementia [AR14]. It achieved 88.6% two-class
(demented, non-demented) accuracy, 87.6% sensitivity, and 84.8% specificity.
While this method required fewer computational resources because
it did not use a neural network, it could not effectively integrate the clinically
free metadata into the training process. It also did not address three-class classification.
Later works used Convolutional Neural Networks for the image recognition task.
Fulton et al. [FDH+ 19] took the center 51 slices of the axial plane of the 3-D

image and trained a ResNet50 model for three-class classification (non-demented,
very-mildly-demented, mildly-demented), achieving 98.99% accuracy. However, this result
is not convincing: the study did not use K-fold validation, and it is likely that
slices of the same brain were assigned to both the training and validation sets, causing
data leakage. Using a train-test split and 5-fold cross validation, we could
not reproduce the results listed in the paper. Islam et al. [IZ18] trained a
separate CNN model for each of the sagittal, coronal, and axial views of the
brain and combined the predictions of the models by voting. Their proposed
model achieved 93% accuracy, 93% sensitivity, and 94% specificity. However,
they did not use N-fold validation because they considered it too computationally
expensive, which makes their reported performance more dependent on a single
split. Newer studies introduced the ViT approach [ZK22], achieving 86% accuracy
on the ADNI dataset with convolutional voxel values as the input. Compared to
CNN models, ViTs had better interpretability and could capture more long-range
relations in the image.
A comprehensive review [WTSDM+ 20] about machine learning models in
AD classification presented the challenges faced in past classification works.
It showed that many works only did a train-test set split and did not per-
form cross validation, making their performance less convincing. It also showed
that many past works, such as the work we failed to reproduce [FDH+ 19], suf-
fered data leakage, knowingly or not, which caused inaccurate representation
of models’ performance. The review showed that many reported results
were not reproducible and that, with a proper train-test split and validation
method, most proposed models would be outperformed by a Support Vector
Machine (SVM) with image score.
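
To make the leakage concern concrete, the following is a minimal sketch of a subject-wise K-fold split. It assumes that every 2-D slice carries the identifier of the brain it was taken from. The function name `subject_kfold` and the toy identifiers are illustrative and are not taken from any of the cited works.

```python
import random
from collections import defaultdict


def subject_kfold(slice_subject_ids, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs in which no subject appears on both sides."""
    # Group slice indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, subject in enumerate(slice_subject_ids):
        by_subject[subject].append(idx)

    # Shuffle subjects (not slices) and deal them into k folds.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    folds = [subjects[i::k] for i in range(k)]

    for held_out in folds:
        held = set(held_out)
        val = [i for s in held_out for i in by_subject[s]]
        train = [i for s in subjects if s not in held for i in by_subject[s]]
        yield train, val


# Toy data: 10 hypothetical subjects with 51 slices each.
ids = [f"subj{n:02d}" for n in range(10) for _ in range(51)]
splits = list(subject_kfold(ids, k=5))
```

Splitting at the slice level instead would let slices of one brain fall on both sides of the split, which is the leakage scenario described above.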

2.2 Machine Learning


2.2.1 Vision Transformer (ViT)
Inspired by transformers in Natural Language Processing (NLP), the ViT [DBK+ 20]
is a newer network architecture for computer vision. ViTs first split the input
image into small patches (the original paper provided 8 × 8 pixel patches and
16 × 16 pixel patches, but other dimensions are possible). Patches are then flattened,
linearly projected to vectors and, along with a learnable class token, fed
to the transformer encoder, which consists of multi-head attention layers and
multi-layer perceptron (MLP) layers. A normalization layer is added to each of
the two main layers to improve performance and training efficiency. The multi-head
attention layer consists of multiple self-attention heads, whose outputs are
concatenated for the MLP layer. Each self-attention head can be visualized
with a self-attention map. All the embeddings are fed to a final MLP classifier
for final classification. This structure is illustrated in Figure 1.
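
The patch-splitting step can be sketched as follows. This is an illustration only: it works on a single-channel image stored as nested lists, it omits the learned linear projection and position embeddings, and the helper name `split_into_patches` is hypothetical.

```python
def split_into_patches(image, patch):
    """Split an H x W image (list of rows) into flattened patch vectors, row-major."""
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one patch x patch block into a single vector.
            flat = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            patches.append(flat)
    return patches


# A 224 x 224 input with 8 x 8 patches, the configuration of ViT-S/8.
image = [[r * 224 + c for c in range(224)] for r in range(224)]
patches = split_into_patches(image, 8)
num_tokens = len(patches) + 1  # patch tokens plus one class token
```

With these sizes the encoder receives 28 × 28 = 784 patch tokens of 64 values each, plus the class token.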
Compared to traditional CNN architectures, ViTs are more adaptive for
image distortion and can capture long-range relations. However, this comes at
a cost of heavy dependency on augmentations, hyper-parameter tuning, and
large datasets [MHSS21]. For medical datasets, which can be relatively small,

Figure 1: Vision Transformer architecture - figure from the original paper
[DBK+ 20]

researchers primarily use or fine-tune ViTs pretrained on ImageNet [MHSS21].

2.2.2 Knowledge Distillation with No Labels (DINO)


Caron et al. proposed Knowledge Distillation with No Labels (DINO) as a
self-supervised training scheme [CTM+ 21]. Similar to knowledge distillation, DINO
trains two networks, the student network $g_{\theta_s}$ with parameters denoted as $\theta_s$ and
the teacher network $g_{\theta_t}$ with parameters denoted as $\theta_t$. DINO uses special data
augmentation that, for each image $x$, generates 2 global crops covering large
areas, denoted as $x_{g,1}$ and $x_{g,2}$, and $n$ local crops covering small areas
($n$ is a hyper-parameter). The set of all crops is denoted as $V$. DINO feeds the
teacher network only global crops, and the student network global crops and local
crops. DINO trains the student network to maximize its agreement with the teacher
network by minimizing the Cross Entropy Loss:

$$\mathrm{Loss} = \sum_{x \in \{x_{g,1},\, x_{g,2}\}} \; \sum_{\substack{x' \in V \\ x' \neq x}} -P_t(x)\,\log\big(P_s(x')\big) \tag{1}$$

Where $P(x)$ represents the probability distribution of the output, given by the
Temperature Softmax:

$$P(x)^{(i)} = \frac{\exp\big(g(x)^{(i)}/\tau\big)}{\sum_{k=1}^{K} \exp\big(g(x)^{(k)}/\tau\big)} \tag{2}$$

Where $K$ is the dimensionality of the output and $\tau$ is the temperature,
different for student and teacher, denoted as $\tau_s$ and $\tau_t$ ($\tau > 0$). The teacher
parameters are updated with an exponential moving average (EMA) based on
the student parameters:

$$\theta_t \leftarrow \lambda\theta_t + (1 - \lambda)\theta_s \tag{3}$$
Where λ is the momentum hyper-parameter. While DINO also works with
other architectures such as ResNet, it performs best with a ViT backbone.
DINO with a ViT backbone presents clearer semantic segmentation information
than supervised ViTs and works excellently with k-NN classifiers using extracted
embeddings.
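
As an illustration of Eqs. 1 to 3, the sketch below computes the temperature softmax in log space, sums the cross-entropy terms over all teacher/student crop pairs with x′ ≠ x, and applies the EMA update. It is a simplified reading of the equations rather than the reference implementation: it assumes the network outputs are already available as plain lists, that the student outputs list the two global crops first in the same order as the teacher outputs, and it omits the centering of teacher outputs used in DINO. All function names and toy values are illustrative.

```python
import math


def log_softmax(logits, tau):
    """Log of the temperature softmax in Eq. 2, computed stably."""
    scaled = [z / tau for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def dino_loss(teacher_out, student_out, tau_t, tau_s):
    """Eq. 1: sum of cross-entropies over teacher crop x and student crop x' != x.

    teacher_out: outputs for the 2 global crops.
    student_out: outputs for all crops, global crops first (assumed ordering).
    """
    loss = 0.0
    for i, t in enumerate(teacher_out):
        p_t = [math.exp(v) for v in log_softmax(t, tau_t)]
        for j, s in enumerate(student_out):
            if j == i:  # skip the pair where both networks see the same crop
                continue
            log_p_s = log_softmax(s, tau_s)
            loss -= sum(p * lp for p, lp in zip(p_t, log_p_s))
    return loss


def ema_update(theta_t, theta_s, lam):
    """Eq. 3: move teacher parameters toward the student parameters."""
    return [lam * t + (1.0 - lam) * s for t, s in zip(theta_t, theta_s)]


# Toy outputs with K = 3: two global crops followed by two local crops.
teacher_out = [[1.0, 0.5, 0.2], [0.9, 0.6, 0.1]]
student_out = teacher_out + [[0.8, 0.4, 0.3], [0.2, 0.9, 0.1]]
loss = dino_loss(teacher_out, student_out, tau_t=0.04, tau_s=0.07)
new_teacher = ema_update([1.0, 2.0], [0.0, 0.0], lam=0.99)
```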

2.2.3 Weak Supervised form of DINO (WS-DINO)


Cross-Zamirski et al. proposed weak supervision during DINO (WS-DINO)
training using weak labels on medical datasets which have clinically free meta-
data [CZWM+ 22]. WS-DINO first creates a pseudo class for each image using
the metadata without using the real label. Different from DINO, WS-DINO
then sources local views V from images of the same pseudo class. Minimizing
the same loss function as Eq. 1, WS-DINO not only maximizes the agreement
between the teacher and student networks, but also maximizes the agreement
between images of the same pseudo class, therefore achieving weak supervision.
WS-DINO provides an elegant solution of integrating metadata into the
training process. Therefore, it is especially powerful for datasets with relevant
guiding metadata.

3 Methods
The implementation of our methods is available in a GitHub repository1 . We
summarize our training and evaluation in Figure 2.

3.1 Weakly-Supervised Modified DINO (WS-MDINO)


We propose the Weakly-Supervised Modified DINO (WS-MDINO) training method
for brain images. WS-MDINO is an adaptation of the WS-DINO method to
brain images which allows the model to cluster subjects directly using weak labels
while preserving important brain features such as symmetry and alignment.
Given the great structural similarity between aligned brain images, we consider
random resized crop, the primary cropping method used in WS-DINO and
DINO, unsuitable for brain images, as the model would treat the structural
variation introduced by the augmentation as more significant than the actual
information the images carry.
Therefore, WS-MDINO feeds the teacher network one global view of image
$x$, denoted as $v_t$, and the student network $n$ global views of $n$ different images of
the same pseudo class, denoted as $V_s$. It is trained to maximize the agreement
between images of the same pseudo class by minimizing the Cross Entropy Loss:
1 https://github.com/powerLEO101/WS-MDINO OASIS1

Figure 2: Summary of data preprocessing, training, feature extraction, and
evaluation pipeline



$$\mathrm{Loss} = \sum_{x \in V_s} -P_t(v_t)\,\log\big(P_s(x)\big) \tag{4}$$
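
A minimal sketch of Eq. 4 follows, assuming the teacher output for the single view and the student outputs for the views of other images in the same pseudo class are available as plain lists. The function names and toy values are illustrative, and centering of the teacher output is omitted.

```python
import math


def log_softmax(logits, tau):
    """Log of the temperature softmax, computed stably."""
    scaled = [z / tau for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def ws_mdino_loss(teacher_view_out, student_views_out, tau_t, tau_s):
    """Eq. 4: cross-entropy between one teacher view and every student view."""
    p_t = [math.exp(v) for v in log_softmax(teacher_view_out, tau_t)]
    loss = 0.0
    for s in student_views_out:
        log_p_s = log_softmax(s, tau_s)
        loss -= sum(p * lp for p, lp in zip(p_t, log_p_s))
    return loss


# The student agrees with the teacher in the first case and disagrees in the second.
teacher = [2.0, 0.0, 0.0]
aligned = ws_mdino_loss(teacher, [[2.0, 0.0, 0.0]], tau_t=0.04, tau_s=0.07)
misaligned = ws_mdino_loss(teacher, [[0.0, 2.0, 0.0]], tau_t=0.04, tau_s=0.07)
```

The loss is small when student views of the same pseudo class produce the distribution the teacher produces, and large otherwise, which is what drives subjects of one pseudo class together.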

3.2 Dataset analysis


We selected the OASIS1 dataset for our study [MWP+ 07]. The OASIS1 dataset
provides a cross-sectional collection of 436 subjects aged from 18 to 96. For each
subject, the dataset provides a 176 × 208 × 176 voxel 3-D MRI scan
and a table of metadata. We present the details and
completeness of the metadata in Table 2.
The CDR in the metadata is used as the ground-truth label, separating the
image data into 4 classes: 0 being healthy, 0.5 being very mildly demented, 1
being mildly demented, and 2 being moderately demented.

| Column Name | Data Completeness |
|---|---|
| Identification (ID) | Complete |
| Gender (M/F) | Complete |
| Dominant Hand (Hand) | Complete |
| Education (Educ) | Missing 201 rows |
| Socioeconomic Level (SES) | Missing 220 rows |
| Mini Mental State Examination Score (MMSE) | Missing 201 rows |
| Clinical Dementia Rating (CDR) | Complete |
| Estimated Total Intracranial Volume (eTIV) | Complete |
| Normalized Whole Brain Volume (nWBV) | Complete |
| Atlas Scaling Factor (ASF) | Complete |
| Delay | Missing 416 rows |

Table 2: Summary of OASIS Data Completeness

3.3 Data preprocessing


In the dataset, there are 336 healthy subjects, 70 very mildly demented subjects,
28 mildly demented subjects, and 2 moderately demented subjects. We merged
the mildly demented and moderately demented subjects into one class, in line
with other studies [FDH+ 19, SMP+ 21].
Our model was mostly unaffected by the missing data except for the MMSE
score. We noticed that all subjects without an MMSE score are non-demented.
Therefore, we automatically gave them a full score of 30 for their MMSE score,
signifying they are cognitively healthy [DPC17].
We created two kinds of pseudo class for each subject: 1) a pseudo class
using only the MMSE score and 2) a compound pseudo class using a combination
of the MMSE score and age. The detailed pseudo-class assignments can be found in
the CSV file in our GitHub repository.
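
The imputation and pseudo-class steps can be sketched as follows. The imputation follows the text (a missing MMSE score becomes 30). The binning is purely hypothetical: the cut-offs `mmse_cut` and `age_cut`, the label strings, and the toy rows are placeholders, and the actual class definitions are those in the CSV file mentioned above.

```python
def impute_mmse(rows, full_score=30):
    """Return copies of the rows with missing MMSE scores set to the full score."""
    return [{**r, "MMSE": full_score if r["MMSE"] is None else r["MMSE"]}
            for r in rows]


def pseudo_class(row, mmse_cut, age_cut=None):
    """Hypothetical binning; not the authors' actual class boundaries."""
    label = "mmse_high" if row["MMSE"] >= mmse_cut else "mmse_low"
    if age_cut is not None:  # compound class: MMSE combined with age
        label += "_older" if row["Age"] >= age_cut else "_younger"
    return label


# Toy metadata rows (placeholders, not OASIS1 records).
rows = [
    {"ID": "A", "Age": 25, "MMSE": None},
    {"ID": "B", "Age": 74, "MMSE": 29},
    {"ID": "C", "Age": 81, "MMSE": 21},
]
filled = impute_mmse(rows)
mmse_only = [pseudo_class(r, mmse_cut=27) for r in filled]
compound = [pseudo_class(r, mmse_cut=27, age_cut=65) for r in filled]
```

Note that the ground-truth CDR never enters either function, which is what keeps the pseudo class a weak label.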
The raw images for subjects are atlas-registered, gain field-corrected,
brain-masked, and re-sampled to 1 mm isotropic voxels [MWP+ 07]. The dataset
provides the processed files for this part of the preprocessing, which can be found in
the “T88 111” folder in the dataset. We took the middle slice of each of the sagittal,
coronal, and axial planes of the 3-D MRI image, converted them into arrays,
and stored them in separate files for each subject.
By taking the center slice of each of the three planes of the brain, we converted the
original 3-D images into 2.5-D images, which:

• Preserve the important features for diagnosing dementia (the ventricles
and hippocampus area).
• Allow greater compatibility with the existing computer vision architectures
and weights, such as ResNet and ViTs.
• Have a much smaller size compared to the original dataset, demanding
fewer computational resources and decreasing the training time, compared
to 3-D models such as [SMP+ 21].
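
The slice extraction described above can be sketched on a small nested-list volume. Which array axis corresponds to the sagittal, coronal, or axial plane depends on how the volume is stored, so the sketch only speaks of axes 0, 1, and 2. The helper name `center_slices` and the toy volume are illustrative.

```python
def center_slices(volume):
    """Return the middle 2-D slice along each axis of volume[i][j][k]."""
    n0, n1, n2 = len(volume), len(volume[0]), len(volume[0][0])
    c0, c1, c2 = n0 // 2, n1 // 2, n2 // 2
    along0 = [row[:] for row in volume[c0]]                               # n1 x n2
    along1 = [[volume[i][c1][k] for k in range(n2)] for i in range(n0)]   # n0 x n2
    along2 = [[volume[i][j][c2] for j in range(n1)] for i in range(n0)]   # n0 x n1
    return along0, along1, along2


# Toy 4 x 6 x 4 volume whose values encode their own coordinates.
volume = [[[i * 100 + j * 10 + k for k in range(4)]
           for j in range(6)]
          for i in range(4)]
s0, s1, s2 = center_slices(volume)
```

For a 176 × 208 × 176 volume, the same integer division gives center indices 88, 104, and 88.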

3.4 Data Augmentation
We used limited data augmentations to preserve important features of the
brains. After several trials with various data augmentations such as resizing
and translating, we concluded that, because each brain image is so structurally
similar to another, the model would treat the noise introduced by data
augmentation as more significant than the actual information the
images carry. For this reason, we found that models generally perform
better on brain datasets like OASIS1 with little data augmentation.
Therefore, unlike the original DINO implementation2, we avoided rotation
and translation to preserve the symmetrical structure of the brain. We also avoided
color jitter, solarization, and Gaussian noise so that the model treats the input
as a single-channel image with a black background, even though the gray-scale
channel is copied into the RGB channels to fit the ViT input. For each
global crop, we resized the image to 256 × 256 pixels and center-cropped the
image to 224 × 224 pixels.
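
The center crop and the channel copy can be sketched as below. The resize itself is omitted because it needs an interpolation routine, and the helper names are illustrative rather than taken from the released code.

```python
def center_crop_box(size, crop):
    """Return (left, top, right, bottom) of a centered crop in a size x size image."""
    offset = (size - crop) // 2
    return offset, offset, offset + crop, offset + crop


def center_crop(image, crop):
    """Center-crop a square image stored as a list of rows."""
    left, top, right, bottom = center_crop_box(len(image), crop)
    return [row[left:right] for row in image[top:bottom]]


def gray_to_rgb(gray):
    """Copy the single gray-scale channel into three identical channels."""
    return [[(p, p, p) for p in row] for row in gray]


box = center_crop_box(256, 224)                  # crop window after the resize
toy = [[r * 6 + c for c in range(6)] for r in range(6)]
cropped = center_crop(toy, 4)
rgb = gray_to_rgb(cropped)
```

Because the crop is always centered, every subject is cropped identically, which preserves the alignment between brains that random resized crops would disturb.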

3.5 Network Details and Training


We trained separate models for each of the sagittal, coronal, and axial planes.
For each model we used the same hyper-parameters: a ViT-S/8 backbone;
for each image, one 224 × 224 pixel global crop of that image and three
224 × 224 pixel global crops of other images of the same pseudo class; teacher
momentum of 0.99; gradient-clipping norm of 3.0; teacher temperature
of 0.04 without warm-up; student temperature of 0.07; center momentum of
0.8; batch size of 4; no weight decay; the AdamW optimizer; 10 warm-up
epochs; a learning rate from 3e−6 to 2e−6 with a cosine scheduler; and 40
epochs in total. All other parameters are the same as in the original DINO
implementation. We initialized each model with the weights from DINO trained
on ImageNet.
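
One plausible reading of the learning-rate schedule is sketched below: a linear warm-up to 3e−6 over the first 10 epochs, followed by a cosine decay toward 2e−6 over the remaining 30. The linear warm-up starting from zero and the per-epoch (rather than per-iteration) granularity are assumptions made for illustration.

```python
import math


def lr_schedule(epochs=40, warmup_epochs=10, base_lr=3e-6, final_lr=2e-6,
                start_warmup_lr=0.0):
    """Per-epoch learning rates: linear warm-up, then cosine decay."""
    schedule = []
    for epoch in range(epochs):
        if epoch < warmup_epochs:
            # Assumed linear ramp from start_warmup_lr up to base_lr.
            lr = start_warmup_lr + (base_lr - start_warmup_lr) * epoch / warmup_epochs
        else:
            progress = (epoch - warmup_epochs) / (epochs - warmup_epochs)
            lr = final_lr + 0.5 * (base_lr - final_lr) * (1.0 + math.cos(math.pi * progress))
        schedule.append(lr)
    return schedule


schedule = lr_schedule()
```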
We extracted 384 features from the ViT head for each plane of view and
combined them into a vector with 1152 elements. Finally we performed a Z-
score normalization on the combined feature vector.
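
The feature combination can be sketched as follows. The text does not say along which axis the Z-score is taken, so this sketch assumes each of the 1152 feature dimensions is standardized across subjects using the population standard deviation. The function names and toy features are illustrative.

```python
import math


def concat_planes(sagittal, coronal, axial):
    """Concatenate three 384-element feature vectors into one 1152-element vector."""
    return list(sagittal) + list(coronal) + list(axial)


def zscore_columns(matrix):
    """Standardize every feature dimension across subjects (rows)."""
    n, dims = len(matrix), len(matrix[0])
    means = [sum(row[d] for row in matrix) / n for d in range(dims)]
    stds = [math.sqrt(sum((row[d] - means[d]) ** 2 for row in matrix) / n)
            for d in range(dims)]
    # Constant features are mapped to 0 to avoid division by zero.
    return [[(row[d] - means[d]) / stds[d] if stds[d] > 0 else 0.0
             for d in range(dims)]
            for row in matrix]


# Toy features for 4 hypothetical subjects.
features = [
    concat_planes(
        [float((s + 1) * (d + 1)) for d in range(384)],
        [float(s * s + d) for d in range(384)],
        [float(2 * d - s) for d in range(384)],
    )
    for s in range(4)
]
normalized = zscore_columns(features)
```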
The model was trained on an Intel i7-12700K CPU and an NVIDIA RTX 3080
GPU. The total training took approximately two hours.

3.6 Evaluation, visualization, and interpretation


To evaluate the features extracted by our model, we trained a KNN classifier
(k=2) with leave-one-out cross validation. Leave-one-out cross validation is the
logical extreme of K-fold cross validation, with one fold per subject, and gives
the least biased estimate of performance. We evaluated the
KNN classifier using 3-class accuracy, 2-class accuracy, sensitivity, and specificity.
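
The evaluation loop can be sketched as below. Since the frozen features need no retraining, each leave-one-out round is only a nearest-neighbor lookup. The tie-breaking rule for k=2 (take the label of the closer neighbor) is an assumption, as are the function names and the toy data.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=2):
    """Majority vote among the k nearest neighbors; ties go to the closest one."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = [train_y[i] for i in order[:k]]
    counts = Counter(nearest)
    best = max(counts.values())
    for label in nearest:  # nearest is ordered by distance
        if counts[label] == best:
            return label


def leave_one_out(xs, ys, k=2):
    """Predict every subject from all other subjects."""
    return [knn_predict(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i], k)
            for i in range(len(xs))]


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return (tp + tn) / len(pairs), tp / (tp + fn), tn / (tn + fp)


# Toy 2-D features: label 0 = non-demented, label 1 = demented.
xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
ys = [0, 0, 0, 1, 1, 1]
preds = leave_one_out(xs, ys, k=2)
accuracy, sensitivity, specificity = binary_metrics(ys, preds)
```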
To visualize the performance of our model, we performed a Principal Component
Analysis (PCA) and reduced the dimensionality to 2. We then plotted
2 https://github.com/facebookresearch/dino

| Method | 3-class Acc. | 2-class Acc. | Sensitivity/Specificity |
|---|---|---|---|
| ResNet50 | 76.9% | 79.6% | 46%/87% |
| WS-MDINO with MMSE labels | 84% | 89% | 64%/96% |
| WS-MDINO with Compound labels | 85% | 92% | 71%/98% |
| WS-MDINO with Real labels (CDR) | 100% | 100% | 100%/100% |
| CNN with vote [IZ18] 3 | N/A | 93% | 93%/94% |
| 2-D CNN [SMP+ 21] | N/A | 84% | N/A |
| 3-D CNN [SMP+ 21] | N/A | 84% | N/A |
| Forward Neural Network [JKK17] | N/A | 90% | 92%/87% |

Table 3: Comparison of our work to existing works

each subject in a 2-D space with 3 different colors representing non-demented,
very mildly demented, and mildly to moderately demented subjects in Figures
3, 4, and 5.
To interpret our model, we extracted the self-attention maps for each subject.
We then took the average pixel value of the self-attention maps of subjects in each
class and produced an average self-attention map for each class. Averaging the
pixel values of the brain images in the same way, we also computed an average
brain for subjects in each class (Figure 6).

4 Results and discussion


4.1 Performance
We trained a ResNet50 on the same image set using 5-fold cross-validation as
our baseline model, a WS-MDINO model using the real labels (CDR) as a proof
of concept, a WS-MDINO model with the MMSE score as the pseudo class, and
a WS-MDINO model with compound classes using both MMSE and age. We
present our models’ performance alongside that of the best-performing existing
models in Table 3.
We show that our model achieved performance equivalent to the state of the art
using compound labels under leave-one-out cross validation, a stricter
evaluation protocol than those of other works. It is worth noting that, even though
the WS-MDINO trained with real labels achieved 100% accuracy, it is not a
valid model and only serves as a proof of concept, because CDR is not a
weak label and using it causes data leakage.

4.2 Class representation


We used PCA to reduce the dimensionality to 2 and plotted subjects with
three colors representing three classes (Purple = Non-demented, Mint = Very
Mildly Demented, Yellow = Mildly to Moderately Demented). We present the
class representation plots for three stages of the fine-tuning (no fine-tuning,
20 epochs of fine-tuning, and 40 epochs of fine-tuning) to show the clustering
process under weak supervision in Figures 3, 4, and 5. We show that the weak
supervision with weak labels is effective as subjects cluster over time in the 2-D
representation. It is worth noting that, even though we used 7 weak labels
in total, our model clustered subjects into roughly 4 clusters. This shows that,
while our model learns from the weak labels, it also effectively learns from the
similarities between images of the same class.

4.3 Average self-attention maps and brains


The self-attention map is a powerful feature of ViTs which visualizes the weights
of the self-attention heads. Such visualization is very useful as it shows the
features that the model has learned, which simplifies the fine-tuning process and
allows researchers to interpret the model. We present average self-attention
maps from the fine-tuned weights and the average brain for each of the sagittal,
coronal, and axial views in Figure 6.
Using average self-attention maps and brains, we show that our model has
learned meaningful information from brains at different stages of dementia.
In general, demented subjects have a shrunken hippocampus and cerebral cortex
and enlarged ventricles [IZ18]. In Figure 6, we show that our model successfully
captured these features for demented subjects. From the sagittal
and coronal views, we show that the area around the hippocampus is the
most highlighted by the attention map. From the axial view, we show that the
ventricle is the most highlighted and the area around it, the cerebral cortex,
is also more highlighted than other structures. Thus, we show that our model
has learned the significance of the hippocampus, cerebral cortex, and ventricles
in the diagnosis of dementia. It is also notable that the highlighted areas for
demented subjects are generally dimmer than those for non-demented subjects,
suggesting that our model has learned the structural difference between demented
and non-demented brains.

5 Conclusion
WS-MDINO is a powerful method of integrating metadata as weak supervi-
sion for DINO training, allowing models to learn effectively from both images
and metadata. Capable of generating 3-class and 2-class predictions with high
accuracy, our dementia diagnosis model trained using WS-MDINO with compound
weak labels successfully captures important features of demented and
non-demented brains. Our approach also provides insight into weakly supervised
training methods for datasets that are sensitive to data augmentations, such as
brain MRI scan datasets.
While the OASIS1 dataset used in this study is a relatively small dataset,
there are larger datasets for dementia diagnoses such as ADNI4 which consists
of thousands of subjects. Future work should encompass testing our method on
such larger datasets. However, this is beyond the scope of this study. It is also
4 https://adni.loni.usc.edu/

possible that a stronger pseudo class could further improve the model’s performance.
However, it is important that the pseudo classes do not encode dataset-specific
information, which would decrease the model’s generalization ability. Thus,
we suggest building pseudo classes that are as simple as possible and that follow
existing studies on metadata’s influence on the subject, such as [RSH+ 13]. For future
work, WS-MDINO has the potential to combine machine learning
approaches with classical approaches, since classical knowledge can be encoded
in the creation of one or multiple pseudo classes.

Figure 3: Class representation in 2-D space using ImageNet weights with no
fine-tuning

Figure 4: Class representation in 2-D space using ImageNet weights fine-tuned
with WS-MDINO with compound label after 20 epochs

Figure 5: Class representation in 2-D space using ImageNet weights fine-tuned
with WS-MDINO with compound label after 40 epochs

Figure 6: Average attention maps and brains from sagittal, coronal, and axial
view, produced from WS-MDINO using compound labels (left: average brain;
right: average self-attention map)

References
[AR14] M Archana and S Ramakrishnan. Detection of alzheimer dis-
ease in mr images using structure tensor. In 2014 36th Annual
International Conference of the IEEE Engineering in Medicine
and Biology Society, pages 1043–1046, 2014.

[CGAA22] Kwok Tai Chui, Brij B. Gupta, Wadee Alhalabi, and Fatma Salih
Alzahrani. An MRI scans-based alzheimer’s disease detection via
convolutional neural network and transfer learning. Diagnostics,
12(7):1531, June 2022.

[CTM+ 21] Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou,
Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging
properties in self-supervised vision transformers, 2021.

[CZWM+ 22] Jan Oscar Cross-Zamirski, Guy Williams, Elizabeth Mouchet,
Carola-Bibiane Schönlieb, Riku Turkki, and Yinhai Wang. Self-
supervised learning of phenotypic representations from cell im-
ages with weak labels, 2022.

[DBK+ 20] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk
Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa De-
hghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob
Uszkoreit, and Neil Houlsby. An image is worth 16x16 words:
Transformers for image recognition at scale, 2020.
[DDS+ 09] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and
Li Fei-Fei. ImageNet: A large-scale hierarchical image database.
In 2009 IEEE Conference on Computer Vision and Pattern
Recognition. IEEE, June 2009.

[DPC17] Silvia Duong, Tejal Patel, and Feng Chang. Dementia. Cana-
dian Pharmacists Journal / Revue des Pharmaciens du Canada,
150(2):118–129, February 2017.
[FDH+ 19] Lawrence Fulton, Diane Dolezel, Jordan Harrop, Yan Yan,
and Christopher Fulton. Classification of alzheimer’s disease
with and without imagery using gradient boosted machines and
ResNet-50. Brain Sciences, 9(9):212, August 2019.

[Gre93] George D. Greenwade. The Comprehensive Tex Archive Net-
work (CTAN). TUGBoat, 14(3):342–351, 1993.

[HKK23] Gia Minh Hoang, Ue-Hwan Kim, and Jae Gwan Kim. Vision
transformers for the prediction of mild cognitive impairment to
alzheimer’s disease progression using mid-sagittal sMRI. Fron-
tiers in Aging Neuroscience, 15, April 2023.

[IZ18] Jyoti Islam and Yanqing Zhang. Brain MRI analysis for
alzheimer’s disease diagnosis using an ensemble system of deep
convolutional neural networks. Brain Informatics, 5(2), May
2018.

[JKK17] Debesh Jha, Ji-In Kim, and Goo-Rak Kwon. Diagnosis of
alzheimer’s disease using dual-tree complex wavelet transform,
PCA, and feed-forward neural network. Journal of Healthcare
Engineering, 2017:1–13, 2017.

[KMS+ 22] C. Kavitha, Vinodhini Mani, S. R. Srividhya, Osamah Ibrahim
Khalaf, and Carlos Andrés Tavera Romero. Early-stage
alzheimer’s disease prediction using machine learning models.
Frontiers in Public Health, 10, March 2022.

[MHSS21] Christos Matsoukas, Johan Fredin Haslum, Magnus Söderberg,
and Kevin Smith. Is it time to replace cnns with transformers
for medical images?, 2021.

[MWP+ 07] Daniel S. Marcus, Tracy H. Wang, Jamie Parker, John G. Cser-
nansky, John C. Morris, and Randy L. Buckner. Open access
series of imaging studies (OASIS): Cross-sectional MRI data in
young, middle aged, nondemented, and demented older adults.
Journal of Cognitive Neuroscience, 19(9):1498–1507, September
2007.

[RL19] Jill Rasmussen and Haya Langerman. Alzheimer’s disease why
we need early diagnosis. Degenerative Neurological and Neuro-
muscular Disease, Volume 9:123–130, December 2019.

[RSH+ 13] Tom C. Russ, Emmanuel Stamatakis, Mark Hamer, John M.
Starr, Mika Kivimäki, and G. David Batty. Socioeconomic status
as a risk factor for dementia death: individual participant meta-
analysis of 86 508 men and women from the UK. British Journal
of Psychiatry, 203(1):10–17, July 2013.
[SJS+ 23] Hyunji Shin, Soomin Jeon, Youngsoo Seol, Sangjin Kim, and
Doyoung Kang. Vision transformer approach for classification of
alzheimer’s disease using 18f-florbetaben brain images. Applied
Sciences, 13(6):3453, March 2023.

[SMP+ 21] Cristina L. Saratxaga, Iratxe Moya, Artzai Picón, Marina
Acosta, Aitor Moreno-Fernandez de Leceta, Estibaliz Garrote,
and Arantza Bereciartua-Perez. MRI deep learning-based solu-
tion for alzheimer’s disease prediction. Journal of Personalized
Medicine, 11(9):902, September 2021.

[SN18] Lauge Sørensen and Mads Nielsen. Ensemble support vector ma-
chine classification of dementia using structural MRI and mini-
mental state examination. Journal of Neuroscience Methods,
302:66–74, May 2018.

[WTSDM+ 20] Junhao Wen, Elina Thibeau-Sutre, Mauricio Diaz-Melo, Jorge
Samper-González, Alexandre Routier, Simona Bottani, Di-
dier Dormont, Stanley Durrleman, Ninon Burgos, and Olivier
Colliot. Convolutional neural networks for classification of
alzheimer’s disease: Overview and reproducible evaluation. Med-
ical Image Analysis, 63:101694, July 2020.
[ZK22] Zilun Zhang and Farzad Khalvati. Introducing vision trans-
former for alzheimer’s disease classification task with 3d input,
2022.

Sexual Dimorphic Nature of the Amygdala and
its Contribution to Females’ Susceptibility to
Depression

Nardos Shewadeg Gebresenbet
October 12, 2023

Abstract
Depression is about twice as common in females as it is in males,
which raises questions about the root of this significant disparity. Numer-
ous studies have examined the role that female hormonal changes at various
stages of life, including the pre-menstrual, prenatal, and postnatal phases,
play in this phenomenon. However, little emphasis is placed on the contribution
of the limbic system, particularly the amygdala. The processing of
affective information takes place mostly in the amygdala, which is why
affective disorders like depression have a significant impact on this brain
structure. Furthermore, the amygdala is a subject of interest in studies
of sex differences in the human brain due to its high concentration of sex
hormone receptors. The recent developments of neuroimaging technolo-
gies also provide an opportunity to examine the distinct functions of the
amygdala in males and females. This review’s objective is to investigate
the characteristics that make the amygdala a sexually dimorphic brain
structure by placing a particular emphasis on its volume and function.
It will also cover how the amygdala’s sexually dimorphic characteristics
contribute to the prevalence of depression in women.

1 Introduction
The limbic system comprises various brain areas crucial for processing emotional
memory, along with motivation, social processing, learning, and spatial memory
(Har00). The hippocampus and the amygdala are at the forefront of emotion
regulation, with the amygdala processing emotion and the hippocampus creat-
ing a declarative episodic memory of the emotional event (RL04). The amygdala
has received attention due to its essential role in processing emotionally salient
information and developing adaptive responses subsequently (OB15). It is eas-
ily distinguishable in the temporal lobe due to the almond-shaped nucleus it
∗ Advised by: Dr. Bridget Callaghan (Assistant Professor)

contains (All21) and the fact that it has 13 nuclei total (Jan15; oMN). The so-
phisticated neuronal connections between the amygdala and other regions of the
brain also make it a crucial structure in controlling both behavioral and physio-
logical responses (Ham05). Therefore, learning about the amygdala’s structure
and function is worthwhile since any damage would disrupt many neurological
processes.
Numerous studies have asserted that the amygdala is a sexually dimorphic
structure (Nis81; Coo05; Uem12; Blu17; LO21); nevertheless, others have argued
that the difference between males and females is not that significant (Fra00;
Mar17). Notwithstanding this controversy, this review focuses on traits that
are considered to make the amygdala sexually dimorphic. The amygdala’s
volume is the first characteristic discussed in this review. Across all ages,
males have a 9–12% larger average brain size than females (Kac19). This overall
difference also correlates with differences in the volume of individual brain areas,
such as the amygdala (Mar17). Other factors, in addition to intracranial vol-
ume disparities, make the sex-related size difference in the amygdala apparent.
For example, the amygdala’s high number of sex hormone receptors renders it
highly influenced by sex hormones such as androgen and estrogen, which play
distinct roles in its volume (Ham05). Furthermore, the age at which amygdala
volume peaks differs between males and females (Uem12). These
factors contribute to the reported amygdala volume difference between males
and females. Little is known about the functional significance of this volume
difference. However, as neuroimaging technologies advance, there is room for
discovery.
The function of the amygdala is another feature that makes it sexually
dimorphic. The amygdala’s primary role in the nervous system
is to process threatening, fear-inducing stimuli and to activate fear-related
behaviors, producing physiological and psychological responses (Ši21). It is
rarely activated by, and rarely generates responses to, emotionally neutral stimuli
(Dav02). Amygdala function differs between the two sexes,
particularly in its response to stimuli and in its hemispheric
lateralization (Ham05). Although little is known about the clinical
consequences of this difference, some studies have shown promise for future
discovery (Sim14).
Apart from its sexual dimorphism, the amygdala’s considerable participation
in affective information processing (Ši21) is a notable trait. It helps
to categorize sensory data and assign them the proper degree of relevance so
that the emotionally significant ones elicit a response (Ši21). In light of these
characteristics, the amygdala is among the brain structures most affected in
psychopathologies such as depression. Depression is one of the most common
mental disorders (Kal20). Many studies have been conducted to determine the
causes of this mental condition (Dav02; Fu09), and most of them have found
substantial amygdala involvement (Rub16; Zha21; Ši21). As a result, the amygdala
is at the forefront of investigations into the neurological etiologies of depression.
Given the amygdala’s role in depression and its sexually dimorphic nature, it
is a crucial brain structure to examine when discussing females’ susceptibility to
depression. According to current statistics, depression is 50% more common
in women than in men (Org23). Many studies have attempted to investigate the
causes of this gap (Alb15; Li17; Kue17). However, their focus was on
the changes that occur during distinct stages of female life, such as menstruation,
pregnancy, and menopause, with minimal emphasis on sexually dimorphic
brain structures such as the amygdala. The primary goal of this review is to
define the characteristics that distinguish the amygdala as a sexually
dimorphic structure and to demonstrate how they can be relevant in determining
the causes of female susceptibility to depression.

2 Amygdala
2.1 Structure of the amygdala
The amygdala is a brain structure in the temporal lobe that is part of the limbic
system (OB15). It contains neuronal cells that transmit electrical and chemical
signals, as well as glial cells that support the neuronal cells (Cli23). The amygdala
was named after the Greek word for almond because of the almond shape
of its basal nuclei (All21). This shape also makes it clearly
distinguishable in its anatomic position, which is anterior to the hippocampus
(OB15). Although its size varies with the overall size of the brain, it is a
small structure located near regions that relay information from the senses
(OB15). It is a paired structure, with one in the left hemisphere and the other
in the right (OB15). It has 13 nuclei that fall into four major categories: the
basolateral group (which includes the lateral, basal, accessory basal, and
paralaminar nuclei); the superficial group (which encompasses the centromedial
and cortical nuclei); the medial and central nuclei (which share functional
similarities but have distinct roles at times); and the anterior amygdaloid area
and the amygdalohippocampal area (Ši21). Among these, the basolateral group,
which emerges from the lateral amygdala, plays a predominant role in the whole
amygdala (Ši21). It sends efferent projections to other amygdala nuclei and
other cortical areas (Ši21).

2.2 Function of the amygdala
The limbic system is a group of brain regions involved in information process-
ing, memory storage and retrieval, setting emotional states, and connecting the
conscious and unconscious activities of the brain (Har00). The amygdala is
a prominent component of the limbic system that plays a crucial role in the
processing of emotional information (Ši21). It is mainly engaged in recognizing
fear- and threat-inducing stimuli and activating physiological responses to
them (OB15). As a result, it is at the forefront of emotional learning and behav-
ior because it assists the nervous system in forming adaptive responses when an
individual is exposed to these stimuli (OB15). The fear the amygdala mediates
can be both innate and learned (Iso15). The amygdala is crucial for regulating
emotions and producing adaptations due to its complex interactions with
sensory modalities, which are vital for processing both types of fears (Iso15).
Emotion regulation refers to the process by which the amygdala determines risks
at the unconscious level and modulates behavioral and physiological responses
at the cognitive level (Ši21). Furthermore, the amygdala is prominently implicated
in encoding both negative- and positive-valence emotions, after which
it assigns a label to each emotion (Ši21). After assigning labels, it generates
reactions to the emotionally relevant ones (Ši21). The categorical model assumed
that the amygdala was primarily involved in negative emotions; however, advances
in neuroimaging techniques revealed that the amygdala is also involved
in the perception of positive emotions (Bon15). It can be inferred that the
amygdala is only marginally involved in emotionally pleasant stimuli, whereas its
involvement is prominent in emotionally unpleasant stimuli. In addition, the
amygdala has sophisticated neural connections with other parts of the brain,
namely sensory structures and regions such as the hippocampus and hypothalamus
(Ham05). These connections are especially crucial for information processing
between the prefrontal cortex and hypothalamus (oMN) and for memory formation
in the hippocampus. The amygdala also
processes diverse types of emotions through its connections with structures
engaged in the senses (Cli23). As a result, abnormalities in the amygdala’s
functioning may result in difficulty with proper emotion regulation, which can
lead to psychopathologies.
Furthermore, the amygdala is one of the brain areas that exhibit laterality
in functioning (Mar99). According to Mar99, in an experiment that evaluated
variations in cerebral blood flow in response to emotional valence,
unpleasant stimuli substantially stimulated the left amygdala.
Meanwhile, the right amygdala was involved in the retrieval and the non-detailed,
shallow processing of emotional information (Mar99). This demonstrates that,
in addition to playing similar roles in emotion processing (All21), the left and
right amygdalae have distinct functions in how emotion is processed.

3 Sexual dimorphic nature of the amygdala
Sexual dimorphism is defined as a “distinct difference in size or appearance
between the sexes of an animal in addition to the sexual organs themselves”
(DO223). The human brain is one organ thought to be sexually dimorphic.
A study of 48 healthy individuals (21 females and 27 males) matched for age,
educational background, ethnicity, socioeconomic situation, handedness (right),
and reading level concluded that there are differences between the brains of males
and females (Gol01). The study also revealed that the difference is more pronounced
in brain regions with higher levels of sex hormone receptors and
noted that early-life exposure to sex hormones contributes to the sexual
dimorphism of the brain (Gol01). Regarding cerebral size disparities, Gol01
found that males have larger volumes in limbic and paralimbic regions, such
as the amygdala and the hippocampus, which contain more sex hormone
receptors.
Many studies on sex differences in the human brain have focused on the
amygdala (Nis81; Coo05; Uem12; Blu17; LO21). Other studies have claimed
that the amygdala is not sexually dimorphic (Fra00; Mar17). Regardless
of this debate, it is critical to examine sexually dimorphic features because
they may have ramifications for various psychopathologies. For example, a
study in rats found that exposure to testosterone in the neonatal period
triggers synaptogenesis during postnatal development, resulting in differences
between female and male amygdalae later in life (Nis81). Furthermore, a study
on rats by Coo05 revealed that gonadal steroid hormones have a strong influence
on the medial amygdala early in life and that the medial amygdala is already
lateralized before puberty.
After gonadal steroid hormone exposure, female rats had 80% more excitatory
synapses in the left medial amygdala than males (Coo05).
Additionally, the amygdala’s high number of sex hormone receptors makes
it sexually dimorphic in adulthood as a result of hormone exposure during the
neonatal period (Gol01). The amygdala’s sexual dimorphism can be seen in its
volume and function.

3.1 Amygdala’s volume difference between males and females
Male brains are 9–12% larger than female brains across all ages, and this total
intracranial difference is also reflected in individual brain structures such
as the amygdala (Mar17). However, according to Fra00 and Mar17, the size
difference is only evident when the difference in total brain size is not considered;
hence they claim that the size difference is insignificant. On the other hand, a
study by Uem12 of 109 healthy individuals (57 males and 52 females) ranging
in age from 1 month to 25 years found that amygdala
volume is higher in males. Furthermore, as reviewed in Kac19, a study of 400
subjects aged 8 to 30 years found higher amygdala volume in males than in
females after correcting for intracranial volume.
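The disagreement described above turns on whether amygdala volume is compared in raw form or relative to head size. The following minimal Python sketch illustrates the arithmetic of a proportional intracranial volume (ICV) correction. All volumes are hypothetical numbers chosen for illustration, not data from Fra00, Mar17, Uem12, or Kac19, and the proportional method is only one of several possible corrections; the cited studies may have used a different one.

```python
import math


def proportional_correction(amygdala_mm3: float, icv_mm3: float) -> float:
    """Return amygdala volume as a fraction of intracranial volume (ICV)."""
    return amygdala_mm3 / icv_mm3


# Hypothetical group means (illustrative only, not measured data).
female = {"amygdala_mm3": 1600.0, "icv_mm3": 1_450_000.0}
# Assumption: male means are exactly 10% larger on both measures.
male = {key: value * 1.10 for key, value in female.items()}

# Raw comparison: ratio of the uncorrected amygdala volumes.
raw_ratio = male["amygdala_mm3"] / female["amygdala_mm3"]

# Corrected comparison: ratio of the ICV-normalized volumes.
corrected_ratio = (
    proportional_correction(**male) / proportional_correction(**female)
)

print(f"raw male/female ratio:       {raw_ratio:.3f}")
print(f"corrected male/female ratio: {corrected_ratio:.3f}")
```

In this constructed case the raw 10% difference disappears once ICV is taken into account, which is the pattern Fra00 and Mar17 describe. A difference that remains after correction would instead correspond to the findings reviewed in Kac19.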
Several factors can contribute to the size difference between the sexes, one of
which is the sex hormones. As previously stated, the amygdala
contains sex hormone receptors (Ham05), making it susceptible to those hormones.
As an illustration, a study of sixty-day-old male and female
rats by Coo99 revealed that castration of male rats, which removes the main
source of testosterone, resulted in volumes equal to those of female rats.
Moreover, Coo99 claimed that
the volume difference between males and females relies on the gonadal hormone
androgen and that the difference is evident before puberty, though to a smaller
extent than in adulthood. Another study by Wan19 of 563 healthy adults (250
males and 313 females) found that before the peak of amygdala development,
amygdala growth in males fit a quadratic model, whereas growth in females fit a
linear model. Wan19 indicated that sex hormones play a substantial role
in the pace of expansion and decline of the amygdala’s volume.
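To make the distinction between the two growth models concrete, the sketch below contrasts a linear trajectory, in which volume changes by a constant amount per year, with a quadratic trajectory, in which the yearly change depends on age and a peak can occur. The function names and all coefficients are hypothetical and chosen only for illustration; they are not estimates from Wan19.

```python
def linear_volume(age: float, intercept: float, slope: float) -> float:
    """Linear model: volume changes by the same amount every year."""
    return intercept + slope * age


def quadratic_volume(
    age: float, intercept: float, slope: float, curvature: float
) -> float:
    """Quadratic model: volume = intercept + slope*age + curvature*age^2."""
    return intercept + slope * age + curvature * age ** 2


def quadratic_growth_rate(age: float, slope: float, curvature: float) -> float:
    """Yearly change of the quadratic model at a given age (its derivative)."""
    return slope + 2.0 * curvature * age


def quadratic_peak_age(slope: float, curvature: float) -> float:
    """Age at which a concave quadratic (curvature < 0) reaches its maximum."""
    if curvature >= 0:
        raise ValueError("a peak exists only when curvature is negative")
    return -slope / (2.0 * curvature)


# Hypothetical coefficients (illustrative only).
SLOPE = 48.0       # mm^3 per year at age 0
CURVATURE = -2.0   # mm^3 per year squared

peak = quadratic_peak_age(SLOPE, CURVATURE)
rate_at_6 = quadratic_growth_rate(6.0, SLOPE, CURVATURE)
rate_at_10 = quadratic_growth_rate(10.0, SLOPE, CURVATURE)

print(f"quadratic model peaks at age {peak}")
print(f"growth rate at age 6: {rate_at_6}, at age 10: {rate_at_10}")
```

Under the quadratic model the growth rate slows as the peak approaches, whereas under the linear model it stays constant, so the two models imply different paces of expansion before the peak.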
Aside from sex hormones, the age at which amygdala volume peaks differs
between males and females, which contributes to the size difference. According
to a study by Uem12, females reached the local maximum volume one and a half
years earlier than males. In addition, females had a slower growth rate, which
contributed to a smaller amygdala volume (Uem12). The peak age also differed
between the right and left amygdala: for males, the right amygdala peaked at
12.6 years and the left at 11.1 years; for females, the right amygdala peaked at
11.4 years and the left at 9.6 years. Further, Uem12 found that before
the peak, the size difference between males and females was not significant, but
after the peak, there was a substantial difference.
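The peak ages quoted above can be compared directly. The short sketch below computes the male minus female gap for each side and the right minus left gap within each sex. It uses only the four values reported in the text for Uem12 and assumes they are expressed in years; the dictionary layout and function names are illustrative.

```python
# Peak ages as quoted in the text for Uem12 (assumed to be in years).
peak_age = {
    "male": {"right": 12.6, "left": 11.1},
    "female": {"right": 11.4, "left": 9.6},
}


def sex_gap(side: str) -> float:
    """Male minus female peak age for one side, rounded to 0.1 year."""
    return round(peak_age["male"][side] - peak_age["female"][side], 1)


def side_gap(sex: str) -> float:
    """Right minus left peak age within one sex, rounded to 0.1 year."""
    return round(peak_age[sex]["right"] - peak_age[sex]["left"], 1)


gaps_between_sexes = {side: sex_gap(side) for side in ("right", "left")}
gaps_between_sides = {sex: side_gap(sex) for sex in ("male", "female")}

print("male - female peak age by side:", gaps_between_sexes)
print("right - left peak age by sex:  ", gaps_between_sides)
```

On these figures, the gap of one and a half years between the sexes corresponds to the left amygdala, while the gap for the right amygdala is somewhat smaller. In both sexes the left amygdala peaks earlier than the right.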
The functional importance of the size difference between males and females
is not yet known (Ham05), but certain studies have shown promise. For instance,
in a study by Qin16 of 176 people between the ages of 19
and 30 (100 males and 76 females), whole-brain size
as well as the size of intracranial brain structures was substantially associated
with function. This suggests that, with developments in neuroimaging
techniques such as functional magnetic resonance imaging (fMRI) and positron
emission tomography (PET), the functional significance of the amygdala size
difference between males and females may be identified.

3.2 Amygdala’s function difference between males and females
The amygdala is a crucial brain structure for processing emotions, particularly
threatening and frightening ones (Ši21). The mechanism by which these emotions
are processed in the amygdala differs between men and women (Ham05).
For example, a study of female and male rats by Blu17 found
a significant sex difference in the activity of the basolateral amygdala (BLA).
Females had higher excitatory synaptic inputs to the lateral and basal nuclei
of the BLA, which was especially noticeable during the estrous cycle (Blu17).
Females with higher BLA activity had a stronger glutamatergic drive, and
the presynaptic protein synaptophysin was also increased in the BLA (Blu17).
This resulted in increased lateral amygdala-dependent cued freezing and basal
amygdala-dependent contextual freezing (Blu17). According to
Blu17, the increased activity of the BLA in females accounts for the physiological
difference between the female and male amygdala. Additionally, in a study
of 14 adult females by Lis12, the neuropeptide oxytocin boosted
amygdala activation in response to threatening scenes. Oxytocin is
widely known for lowering amygdala reactivity during threat
processing in males; however, Lis12 demonstrated the converse to be
true in females. These differences in amygdala functioning between males
and females can be observed in its response to unpleasant stimuli and in its
hemispheric lateralization during emotion processing.
The amygdala is significantly involved in the body’s behavioral and physiological
responses to aversive stimuli (Ši21). The amygdala’s response to these
stimuli differs between males and females (Ham05). A study by
And14 of 45 healthy adults between the ages of 18 and 35 (16 males and
29 females) found disparities in amygdala reactivity between
males and females. In this study, both male and female volunteers viewed
evocative images varying in novelty and valence, and females’ amygdala response
to negative-valence images was more persistent over repeated trials. According
to And14, persistent responses in females relate to negative affect,
which has implications for affective disorders. Another study, Can02, of
12 females and 12 males found that females remembered emotional events
better. In Can02, both males and females were exposed to neutral
and emotionally negative images; their brain activity was recorded using fMRI; and
their memories were assessed three weeks later. Both males and
females remembered more of the emotionally aversive images, and females
remembered the emotionally aversive images better than males, with more vivid and
intense recollections (Can02). According to these findings, the difference
in amygdala response to aversive stimuli between males and females is reflected
in a more persistent amygdala response as well as a stronger
recollection of aversive events in females.
The amygdala is one of the brain regions that shows laterality, with the
left and right amygdalae being involved in information processing in distinct
ways (Mar99). This laterality has also been shown to differ by sex, with
the left and right amygdalae of females and males participating in emotion
processing differently (Ham05). Cah04 conducted a study of 11 males and
11 females in which the subjects underwent an fMRI scan while viewing slides
ranging from emotionally neutral to extremely arousing. When memories were
assessed two weeks later, males showed stronger activation of the right
amygdala, whereas females showed greater involvement of the
left amygdala (Cah04). In addition, the blood oxygen level-dependent (BOLD)
signal was detected in the left amygdala in females and the
right amygdala in males (Cah04). Another study, Sch11, included 235 male
and female adolescents who were matched for age and handedness. The subjects
completed an emotional face perception fMRI task, and the results revealed that
boys had greater right amygdala involvement (Sch11). According to
Sch11, sex-dependent hemispheric lateralization in adolescents is a forerunner of
emotional memory in adulthood. The study also suggested that hemispheric
lateralization could have implications for the etiology of mental disorders (Sch11).

4 Amygdala’s involvement in depression
Depression is one of the most common psychopathologies, affecting around
280 million people worldwide, and it is about 50% more common in females than
in males (Org23). A mix of psychological, social, and biological factors can
contribute to it (Org23). Because of its involvement in emotion processing, the
amygdala is thought to play a key role in depression (Rub16; Ši21). “Feelings are
conscious, emotional experiences of these activations that contribute to neuronal
networks mediating thoughts, language, and behavior, thus enhancing the ability
to predict, learn, and reappraise stimuli and situations in the environment based on
previous experiences” (Ši21, p. 1). Consequently, feelings or emotional experiences
have a significant impact on people’s lives. Since the amygdala is the
key brain structure responsible for processing emotions and feelings, any disturbance
in the amygdala could impair cognitive reasoning, making us
vulnerable to psychopathologies like depression. Amygdala hyperactivation has
been identified in major depressive disorder (MDD) (Rub16; Zha21; Ši21). This
hyperactivation may influence the initial judgment of, as well
as the response to, incoming information, resulting in cognitive biases towards
unpleasant or emotionally salient information (Dav02). This mechanism is also
hypothesized to be caused by norepinephrine, which is found at abnormally ele-
vated levels in MDD (Dav02). Norepinephrine is involved in amygdala-mediated
learning and is impacted by glucocorticoid secretions, which are similarly ele-
vated in depression (Dav02).
A postmortem study by Rub16 of 13 people with MDD and 10 healthy
controls revealed differences in amygdala structure. Depressed participants
exhibited a larger lateral nucleus and more total BLA neurovascular
cells than controls (Rub16). This study has important implications for
how structural disturbances in the amygdala may contribute to depression.
Another study, Ram14, included 55 patients with MDD who met the DSM-IV
criteria and 19 healthy controls. The subjects underwent a 3-T fMRI scan, and
those with MDD exhibited impaired intrinsic connectivity between the amygdala
and other brain areas involved
in emotion processing and regulation (Ram14). According to the study, this
reduced intrinsic connectivity of the amygdala may be one of the causes of
decreased sensations and perceptions, which lead to cognitive disturbances in
MDD (Ram14). The studies reviewed above provide evidence for the
critical role of the amygdala in depression.

5 New Idea: The amygdala’s sexual dimorphism and its role in females’
susceptibility to depression
Women are more prone to depression than men, experiencing it about 50% more
often (Org23). This can be due to different causes, including hormonal changes
at critical life stages such as menstruation, pregnancy, and menopause (Kue17;
Li17). However, the neurological origins of this tendency have received little
attention. According to Gol01, the human brain is sexually dimorphic, with some
of its regions demonstrating anatomical and functional differences between males
and females. The difference is especially noticeable in brain areas with a high
concentration of sex hormone receptors, such as the amygdala (Gol01; Ham05).
The amygdala has shown sexual dimorphism in volume and function, and studies
reporting the difference have also argued that it has clinical implications
(Sch11; Uem12).
Males have a greater amygdala volume than females (Uem12; Kac19). Though
studies have not determined the functional or clinical importance of this
difference, it is worth noting that certain studies have found that larger or smaller
amygdala volumes are associated with psychopathologies (Xu20; Zha21). Given that
the amygdala is the main structure engaged in emotion processing and is involved
in depression (Dav02; Ši21), any volumetric variation between males and
females can have medical consequences. As an example, one of the factors driving
hyperactivation of the amygdala in females in response to
unpleasant stimuli (Dav02; Ham05) could be a smaller amygdala volume,
although further research is needed to test this. A smaller amygdala volume,
which implies fewer neuronal and glial cells, may disturb the amygdala’s
normal function in emotion processing. This may make women more prone to
faulty and disrupted cognitions, resulting in depression. Moreover, the difference
in the age of peak amygdala maturation between males and females is notable.
Since males reach this peak one and a half years later (Uem12),
their amygdala could have a higher potential to form efficient connections with
other brain regions. As a result, earlier amygdala maturation in females may
have a negative impact on emotion regulation and render them more susceptible
to depression.
Along with the volume difference, the functional difference between the amygdala
of males and females may increase female susceptibility to depression. The
amygdala’s response and hemispheric lateralization are manifestations of this
difference. First, the female amygdala exhibits a more persistent response to
fearful or unpleasant stimuli than the male amygdala (And14), and this response
could result in amygdala hyperactivation. Hyperactivation may in turn cause a
higher metabolism in the body, resulting in needless physiological reactions and
maladaptive cognitions that lead to depression. Furthermore, women may be
predisposed to depression due to their more vivid and powerful recall of emotionally
charged memories (Dav02). Women employ their left amygdala during emotional
memory processing (Cah04), and the left amygdala is generally involved
in the detailed processing of aversive stimuli (Mar99), which can lead to the
formation of strong memories in women. Emotional memories are more important
to individuals and tend to be recalled more frequently than other memories. These
memories may also bring back the pain and negative thoughts felt at the time and
disrupt females’ mental health. This may also result in physiological and behavioral
responses such as insomnia or hypersomnia and social isolation.

6 Discussion
6.1 Implications
This review paper explored the amygdala’s sexually dimorphic characteristics and
their role in female vulnerability to depression. It first described the amygdala’s
anatomy and function, along with its role in depression. It then presented some
of the amygdala’s sexually
dimorphic characteristics. The amygdala’s volume was the first attribute examined
in this review, which reported that females have a smaller amygdala volume.
It also suggested that the smaller volume could result in fewer neuronal and
glial cells, obstructing proper functioning and triggering amygdala hyperactivity
when exposed to aversive stimuli. The review also looked at the amygdala’s
earlier maturation in females. It suggested that this might render the
female amygdala less effective in forming connections within itself and with other
brain structures. This may jeopardize emotion regulation and predispose women to
depression. The function of the amygdala was the other sexually dimorphic feature
examined. The first functional difference discussed was the amygdala’s sensitivity
to unpleasant stimuli, with females exhibiting a more persistent response.
According to the review, this may cause unnecessary physiological reactions in
the body, resulting in maladaptive cognitions. In addition, this review looked
at females’ acute and vivid recall of emotional memories. It proposed that
women’s left-hemisphere lateralization while processing emotion may contribute
to this. In essence, emotional memories may be better retained in females, bringing
back the pain or negative feelings they had at the time. As a result, these
memories may have a negative impact on their mental health, rendering them
susceptible to depression.

6.2 Limitations
One limitation of this review was the scarcity of studies on the sexually dimorphic
features of the amygdala. This narrowed the scope of the review and limited the
explanations for sexual dimorphism to the effects of sex hormones. Furthermore,
the focus of this review is the amygdala’s sexual dimorphism as a neurological
etiology for female vulnerability to depression. However, since the amygdala has
a high density of sex hormone receptors, it is difficult to determine whether
its effects are neurological or driven by sex hormones. Another limitation is
that the ideas stated in this review are only suggestions based on the studies
done so far. As a result, further studies are needed to determine whether
these are plausible links between the amygdala’s sexual dimorphism and
female susceptibility to depression.

6.3 Future directions
According to the studies discussed in this review, the amygdala is a sexually
dimorphic brain structure. As neuroimaging techniques advance, more extensive
and thorough investigations of the amygdala’s sexual dimorphism will be possible.
Once researchers have an in-depth understanding of the similarities and differences
between the amygdala of males and females, they will be able to explore the
therapeutic ramifications of this dimorphism. Thus, the amygdala may
be of interest in researching the neurological origins of females’ susceptibility to
depression.

7 Conclusion
This review aimed to investigate the amygdala’s sexual dimorphism and examine
how it may affect women’s susceptibility to depression. As previously mentioned,
the amygdala exhibits sexually dimorphic characteristics in its volume and function.
Nevertheless, more research is needed to understand
the extent of these differences. Additionally, because the amygdala plays a role
in depression, it is imperative to have an extensive grasp of these distinctions
when examining the significant sex disparities in the prevalence of depression.

References
[Alb15] P. R. Albert. Why is depression more prevalent in women? Journal
of Psychiatry and Neuroscience, 2015.

[All21] H. N. Allen, H. J. Bobnar, and B. J. Kolber. Left and right hemispheric
lateralization of the amygdala in pain. Progress in Neurobiology, 2021.

[And14] J. M. Andreano, B. C. Dickerson, and L. F. Barrett. Sex differences in
the persistence of the amygdala response to negative material. Social
Cognitive and Affective Neuroscience, 2014.

[Blu17] S. R. Blume, M. Freedberg, J. E. Vantrease, R. Chan, M. Padival,
M. J. Record, M. R. DeJoseph, J. H. Urban, and J. A. Rosenkranz. Sex- and
estrus-dependent differences in rat basolateral amygdala. The Journal
of Neuroscience, 2017.

[Bon15] L. Bonnet, A. Comte, L. Tatu, J.-L. Millot, T. Moulin, and
E. Medeiros De Bustos. The role of the amygdala in the perception of positive
emotions: An “intensity detector.” Frontiers in Behavioral Neuroscience, 2015.

[Cah04] L. Cahill, M. Uncapher, L. Kilpatrick, M. T. Alkire, and J. Turner.
Sex-related hemispheric lateralization of amygdala function in emotionally
influenced memory: An fMRI investigation. Learning & Memory, 2004.

[Can02] T. Canli, J. E. Desmond, Z. Zhao, and J. D. E. Gabrieli. Sex differences in
the neural basis of emotional memories. Proceedings of the National
Academy of Sciences, 2002.

[Cli23] Cleveland Clinic. Amygdala. Journal of Medicine, 2023.

[Coo99] B. M. Cooke, G. Tabibnia, and S. M. Breedlove. A brain sexual dimorphism
controlled by adult circulating androgens. Proceedings of the National
Academy of Sciences, 1999.

[Coo05] B. M. Cooke and C. S. Woolley. Sexually dimorphic synaptic organization
of the medial amygdala. The Journal of Neuroscience, 2005.

[Dav02] R. J. Davidson, D. Pizzagalli, J. B. Nitschke, and K. Putnam. Depression:
Perspectives from affective neuroscience. Annual Review of Psychology,
2002.

[DO223] Dictionary. Unknown Journal, 2023.

[Fra00] M. S. Franklin, G. W. Kraemer, S. E. Shelton, E. Baker, N. H. Kalin, and
H. Uno. Gender differences in brain volume and size of corpus callosum
and amygdala of rhesus monkey measured from MRI images. Brain
Research, 2000.

[Fu09] C.-M. Fu and K. Parahoo. Causes of depression: Perceptions among people
recovering from depression. Journal of Advanced Nursing, 2009.

[Ham05] S. Hamann. Sex differences in the responses of the human amygdala.
The Neuroscientist, 2005.

[Har00] A. R. Hariri, S. Y. Bookheimer, and J. C. Mazziotta. Modulating emotional
responses: Effects of a neocortical network on the limbic system.
NeuroReport, 2000.

[Iso15] T. Isosaka, T. Matsuo, T. Yamaguchi, K. Funabiki, S. Nakanishi,
R. Kobayakawa, and K. Kobayakawa. Htr2a-expressing cells in the central
amygdala control the hierarchy between innate and learned fear. Cell, 2015.

[Jan15] P. H. Janak and K. M. Tye. From circuits to behavior in the amygdala.
Nature, 2015.

[Kac19] A. N. Kaczkurkin, A. Raznahan, and T. D. Satterthwaite. Sex differences
in the developing brain: Insights from multimodal neuroimaging.
Neuropsychopharmacology, 2019.

[Kal20] N. H. Kalin. The critical relationship between anxiety and depression.
American Journal of Psychiatry, 2020.

[Kue17] C. Kuehner. Why is depression more common among women than
among men? The Lancet Psychiatry, 2017.

[Li17] S. H. Li and B. M. Graham. Why are women so vulnerable to anxiety,
trauma-related and stress-related disorders? The potential role of sex
hormones. The Lancet Psychiatry, 2017.

[Lis12] A. Lischke, M. Gamer, C. Berger, A. Grossmann, K. Hauenstein,
M. Heinrichs, S. C. Herpertz, and G. Domes. Oxytocin increases amygdala
reactivity to threatening scenes in females. Psychoneuroendocrinology,
2012.

[LO21] W. López-Ojeda and R. A. Hurley. Sexual dimorphism in brain development:
Influence on affective disorders. The Journal of Neuropsychiatry
and Clinical Neurosciences, 2021.

[Mar99] H. J. Markowitsch. Differential contribution of right and left amygdala
to affective information processing. Behavioral Neurology, 1999.

[Mar17] Halari M. Eliot L. Marwha, D. Meta-analysis reveals a lack of sexual
dimorphism in human amygdala volume. NeuroImage, 2017.
[Nis81] Arai Y. Nishizuka, M. Sexual dimorphism in synaptic organization in
the amygdala and its dependence on neonatal hormone environment.
Brain Research, 1981.

[OB15] Fortes-Marco L. Otero-Garcı́a M. Lanuza E. Martı́nez-Garcı́a F.
Olucha-Bordonau, F. E. Amygdala. In The Rat Nervous System,
2015.

[oMN] National Library of Medicine (NLM). Neuroanatomy, amygdala.


[Org23] World Health Organization. Depression. Fact Sheets, 2023.

[Qin16] Gong G. Qing, Z. Size matters to function: Brain volume correlates
with intrinsic brain activity across healthy individuals. NeuroImage,
2016.
[RL04] G. Richter-Levin. The amygdala, the hippocampus, and emotional
modulation of memory. The Neuroscientist, 2004.
[Rub16] Mahajan G.-May W. Overholser-J. C. Jurjus G. J. Dieter-L. Herbst N.
Steffens D. C. Miguel-Hidalgo J. J. Rajkowska G. Stockmeier C. A.
Rubinow, M. J. Basolateral amygdala volume and cell numbers in
major depressive disorder: A postmortem stereological study. Brain
Structure and Function, 2016.

[Sch11] Peters J. Bromberg-U. Brassen S.-Menz M. M. Miedl S. F.-Loth E.
Banaschewski T. Barbot A. Barker-G. Conrod P. J. Dalley J. W. Flor
H. Gallinat J. Garavan H. Heinz A. Itterman B. Mallik C. Mann K. . . .
Büchel C. Schneider, S. Boys do it the right way: Sex-dependent
amygdala lateralization during face processing in adolescents. NeuroImage,
2011.
[Sim14] Moulton E. A. Linnman C.-Carpino E. Becerra L. Borsook-D. Simons,
L. E. The human amygdala and pain: Evidence from neuroimaging:
Human amygdala and pain. Human Brain Mapping, 2014.
[Uem12] Matsui M. Tanaka C. Takahashi T.-Noguchi K. Suzuki M. Nishijo-H.
Uematsu, A. Developmental trajectories of amygdala and hippocam-
pus from infancy to early adulthood in healthy individuals. PLoS ONE,
2012.

[Wan19] Xu Q. Luo J. Hu M.- Zuo C. Wang, Y. Effects of age and sex on
subcortical volumes. Frontiers in Aging Neuroscience, 2019.

[Wen17] Poh J. S. Ni S. N. Chong Y.-S. Chen H.-Kwek K. Shek L. P. Gluckman
P. D. Fortier M. V. Meaney M. J. Qiu A. Wen, D. J. Influences of
prenatal and postnatal maternal depression on amygdala volume and
microstructure in young children. Translational Psychiatry, 2017.

[Xu20] Zuo C. Liao S. Long Y. Wang Y. Xu, Q. Abnormal development
pattern of the amygdala and hippocampus from childhood to adulthood
with autism. Journal of Clinical Neuroscience, 2020.
[Zha21] Lu L. Bu X. Li H. Tang S. Gao-Y. Liang K.-Zhang S. Hu X. Wang
Y. Li L. Hu X. Lim K. O. Gong Q. Huang X. Zhang, L. Alterations
in hippocampal subfield and amygdala subregion volumes in posttrau-
matic subjects with and without posttraumatic stress disorder. Human
Brain Mapping, 2021.
[Ši21] Tkalčić M. Vukić V. Mulc D. Španić E. Šagud-M. Olucha-Bordonau-F.
E. Vukšić M. R. Hof P. Šimić, G. Understanding emotions: Origins
and roles of the amygdala. Biomolecules, 2021.

Black Sea Grain Initiative: a Game-Theoretic
Analysis
∗†
Paari Dhanasekaran
October 13, 2023

Abstract
In July 2023, the Black Sea Grain Deal expired just a year after its
inception. It has become a very popular topic due to its significant im-
plications for grain and staple food prices. There has been an abundance
of empirical analysis to understand the situation, but the recent develop-
ments in the Black Sea Grain Deal have not been examined using a game-
theoretic approach. This paper provides a game-theoretic viewpoint of the
Black Sea Grain Deal with the focus on the breakdown. Using principles
of game theory, I develop an infinitely repeated game with a defined set of
players, actions, and preferences expressed through payoffs. Analyzing
the game for subgame perfect Nash equilibria gives a clearer understanding
of the breakdown of the Black Sea Grain Deal and its future
implications. I finish by discussing possible extensions and variations of
the model along with what conditions need to be met for a game-theoretic
approach to be viable in general international relations settings.

1 Introduction
In February 2022, Russia began its full-scale invasion of Ukraine. Ukraine’s
ability to export has been severely hampered by Russia’s invasion [UNC22].
Before the war, 90% of Ukrainian crop exports went through ports on the Azov
and Black Seas, which became inaccessible due to Russian aggression [OEC22].
In July 2022, Turkey, Russia, Ukraine, and the U.N. signed the Black Sea Grain
Deal. The deal allowed Ukraine to safely export grain, other food, and fertil-
izer from three Black Sea ports: Chornomorsk, Odesa, and Yuzhny/Pivdennyi
[UNC22]. Along with the Black Sea Grain Initiative, the U.N. established an
agreement with Russia “to facilitate the unimpeded exports to world markets
of Russian food and fertilizer (including the raw materials required to produce
fertilizers) to world markets” [Ped23]. The UN brokered these two deals with
the aim of lowering food prices. To some extent, the deal was successful in its
goal to reduce food prices [UNC22]. However, the deal had a finite term limit
of 120 days.

∗ Advised by: Jack Adeney of the California Institute of Technology
† Barrington High School

The deal first reached the end of its term in November 2022. After consideration and
further discussion with the U.N., Russia agreed to continue the Black Sea Grain
Initiative for another 120 days. The FAO Price Index continued to decrease. In
March 2023, the U.N. met with Russia to discuss another extension. Moscow’s
agreement was contingent on the removal of Western sanctions. This led to a
stall in the deal in May 2023, which required further talks that led to another
60-day extension [Ped23]. In July 2023, Russia said it was suspending
cooperation with the deal once it reached its expiration date. Russia said that
the agreement concerning its food and fertilizer exports must be fulfilled
before it would return to the deal. To that end, it demanded that the Russian
Agricultural Bank be reconnected to the SWIFT payment system and that
restrictions hampering its agricultural exports (i.e. shipping, insurance) be
lifted [Nic23] [Bon23].
This breakdown has been primarily studied from an empirical standpoint,
with a particular focus on sanctions and restrictions, to explain why the Black
Sea Grain Deal broke down [HS23] [Bon23]. However, there are still lessons to
be learned from a game-theoretic approach. I therefore conduct a
game-theoretic analysis, where game theory is defined in [Rub94] as “a bag of
analytical tools designed to help us understand the phenomena that we observe
when decision-makers interact. The basic assumptions that underlie the
theory are that decision-makers pursue well-defined exogenous objectives (they
are rational) and take into account their knowledge or expectations of other
decision-makers’ behavior (they reason strategically)” (p.1). In response to
complex, real-world phenomena, game theory provides a simple and clean structure
that can be used to analyze well-defined equilibrium outcomes. Game theory
helps explain situations in which each decision-maker’s outcome depends
on others’ actions. Therefore, game theory is a powerful tool for
analyzing international relations. The Black Sea Grain Initiative has clear
actors who have certain actions and preferences that inform those very actions.
Structuring a game around these elements can give clear outcomes concerning the
agreement’s breakdown. The merits of game-theoretic analysis will be discussed
in more detail in the literature review.

2 Literature Review

The breakdown of the Black Sea Grain Deal is clearly an issue of
international relations. The literature has thoroughly explored the links between game
theory and international relations in general. [Cor01] explores the variety of
international relations scenarios in which a game-theoretic approach could be
utilized. With a focus on interaction between nation-states, the primary issues
are security and economics. Like most economic fields, game theory is founded
on the principle of individual rationality, meaning that players or
actors always act to maximize their own individual payoffs.
Applying this to an international relations context, countries will take the action
that most benefits themselves.
The area of [Cor01] that is of primary interest is the discussion of interna-
tional crises (p.193-195). International crises are characterized specifically “by
the events that take place when one or more nation-states perceive that their se-
curity is suddenly, immediately and seriously threatened by actions proposed or
performed by other nation-states or by events accidentally taking place in them”
(p.193). [Cor01] boils these crises down to two actions, confrontation and
cooperation, where the “threatening nation-state” attempts to force the “threatened
nation-state” to follow its demands. Simultaneously, the “threatened
nation-state” is trying to make the other nation drop its demands. Other papers
explore specific crises in detail.
[Zag14] analyzes several games with varying structures, actions and pay-
offs that attempt to model the Cuban Missile Crisis (CMC). Of the mod-
els which [Zag14] has examined, the one most similar to this paper’s aims is
Thomas Schelling’s 1966 Chicken Game (p.22-26) where the worst outcome is
mutual defection and so one player would yield to the other. Schelling believed
whatever side of the CMC pushed the issue first would force the other to ca-
pitulate and “swerve,” gaining the advantage. This led Schelling to attribute
the U.S.’s ‘victory’ to Kennedy threatening brinkmanship. [Zag14] explains how
Schelling’s model was later proven wrong using White House tapes. The tapes
showed Kennedy wanted to use blockades as a way to buy time for
renegotiation (p.24). [Zag14] shows the significance of the real-world context of the crises
being modelled: new discoveries about a crisis
can debunk a model that was previously supported. [Zag14] also demonstrates how
models have been developed to explain real-world events, including but not
limited to the Cuban Missile Crisis. This is directly applicable to modelling and
understanding the Black Sea Grain Deal’s breakdown.
A key component of the situation around the Black Sea Grain Deal was
the several expirations and subsequent renegotiations that occurred. Frequent
renegotiation in international agreements makes models of repeated games a
suitable tool of analysis for agreements and their breakdowns. [Pea91], [Sla04],
and [GT20] thoroughly explore the technical aspects of repeated games with
discounted payoffs. [GT20] covers several variations of repeated games
concerning monitoring and information, while [Pea91] primarily focuses on repeated
games with self-enforced agreements, referencing several proofs and folk
theorems and covering sufficient patience (discount factors) and the
definition of repeated games and their equilibria. Together, [GT20] and [Pea91] provide a
vital mathematical understanding of repeated games and their equilibria. [Sla04]
tackles specific strategies used in repeated games such as grim trigger, tit for
tat, limited retaliation, deviate once (DEV1L), Grim DEV1L, and Pavlov. He
takes it a step further and assesses whether combinations of some of the
aforementioned strategies could be supported as subgame perfect equilibria.
[Kan08] specifically focuses on how repeated games are a setting that
encourages mutual cooperation. [Kan08] highlights a key issue in international
agreements: oftentimes, there is no body powerful enough to enforce an
international agreement. Because there is no explicit commitment device within the
terms of the Black Sea Grain Deal, there is no external force mandating both
sides to cooperate. So, it is best to model the deal as a non-cooperative
game. [Kan08] adds that a long-term relationship with several interactions is
the environment most suitable for establishing mutual cooperation, especially when
formal contracts are too costly or impossible to enforce.

3 Model
3.1 Model Setup
Although the Black Sea Grain Initiative was signed by the U.N., Russia,
Ukraine and Turkey, the model incorporates two players: Russia and the U.N.
Turkey has similar interests to the U.N., so their payoffs are treated as identical.
Because the U.N. has more power and influence than Turkey, the U.N. is the primary
player of the two. As a result, their combined preferences are modeled
as a single actor’s, which is simply called the U.N. Although the deal concerns
Ukrainian exports, Ukraine is essentially a bystander in the Black Sea Grain
Deal. Unlike the U.N., it does not have the power to control the restrictions
on Russia, meaning it has no actions in this game. These
considerations also allow the use of more standardized games that would not
be viable with more than two players. Although in real terms, negotiations
of international agreements can be very complex, for the sake of modelling, I
believe it is appropriate to collapse each actor’s action space into two actions.
Both have essentially binary choices. The U.N. can either offer
concessions to Russia or not, and Russia can either return to the
Black Sea Grain Deal or not.
For the U.N., cooperating entails renegotiating the deal and giving some
concessions. Defecting would mean the U.N. ends negotiations over the Black
Sea Grain Initiative and provides no concessions to Russia. For Russia, cooper-
ating entails returning to the Black Sea Grain Initiative. Defecting would mean
Russia does not return to the Grain Deal.
The mutually beneficial outcome would be both sides cooperating and
renewing a renegotiated Black Sea Grain Deal. Both sides are worse off if there
is no cooperation. The U.N.’s payoff becomes -1 because no cooperation
leads to lower grain exports and higher prices, which hurts its efforts to
combat food insecurity through the World Food Programme, along with the
welfare of member nations.
The U.N. and Russia would each benefit the most by exploiting the other. For the
U.N., exploiting would mean it does not renegotiate but Russia still decides to
cooperate and return to the Black Sea Grain Deal. For Russia, exploiting would
mean not returning to the Deal when the U.N. makes concessions to renegotiate.
For both sides, being exploited leads to the worst payoff.

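UN \ Russia    C        D
C              3, 3     -2, 5
D              4, -1    -1, 0

Figure 1. Model of the Prisoner’s Dilemma stage game used to analyze the
Black Sea Grain Initiative. Rows are the U.N.’s actions, columns are Russia’s
actions, and each cell lists the U.N.’s payoff followed by Russia’s payoff.

UN \ Russia    C        D
C              3, 3     -2, 5
D              4, -1    -5, -5

Figure 2. Model of the Chicken Game Example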

3.2 Model Analysis


The prisoner’s dilemma stage game seems appropriate, for the reasons
outlined in the previous section, to model the developments surrounding the Black
Sea Grain Deal. It highlights how, while both players would benefit more
from mutual cooperation, the incentive to cheat and the fear of being
cheated lead both sides to decide not to cooperate.
A one-stage strategic game is defined as a game played with no repeated
interactions [Rub94]. The prisoner’s dilemma stage game from Figure 1 is
a one-stage game. A key tool for analyzing the prisoner’s dilemma and other
one-stage games is the pure strategy Nash equilibrium. A pure strategy Nash
equilibrium is an action profile from which no player can make a unilateral deviation that
provides a higher payoff. Pure strategy Nash equilibria are strong indicators of
the possible final outcomes because they represent situations in which both players
follow their respective optimal actions. For example, in Figure 1 there is just
one pure strategy Nash equilibrium under the assumption that the game is played for only
one round: (D,D). Although both sides could mutually cooperate, both choose
to defect. Both defect in hopes of exploiting the other player (the U.N. and
Russia get their highest payoffs from (D,C) and (C,D) respectively) and in fear of
being cheated (the U.N. and Russia receive their lowest payoffs from (C,D) and
(D,C) respectively). In the prisoner’s dilemma, defecting is the dominant strategy
for both the U.N. and Russia, meaning it is the action each will take regardless of the other
player’s action. Once both sides defect, neither player can unilaterally make
a profitable deviation, holding the other player’s strategy fixed. If the U.N.
switches from defection to cooperation, its payoff falls by 1 (-1 vs -2), and
if Russia is the one that switches from D to C, its payoff also falls by 1 (0 vs -1).
Therefore, (D,D) is a Nash equilibrium, which reflects the realistic outcome given
both the U.N.’s and Russia’s preferences.
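The equilibrium and dominance claims for Figure 1 can be checked mechanically. The following is a minimal Python sketch, assuming the Figure 1 payoffs; the names `FIGURE_1`, `pure_nash_equilibria`, and `is_dominant` are illustrative and not part of the original analysis.

```python
from itertools import product

ACTIONS = ("C", "D")

# Figure 1 payoffs, keyed by (U.N. action, Russia action) -> (U.N. payoff, Russia payoff).
FIGURE_1 = {
    ("C", "C"): (3, 3),
    ("C", "D"): (-2, 5),
    ("D", "C"): (4, -1),
    ("D", "D"): (-1, 0),
}


def pure_nash_equilibria(game):
    """Return every action profile from which no player gains by deviating alone."""
    equilibria = []
    for un, ru in product(ACTIONS, repeat=2):
        un_payoff, ru_payoff = game[(un, ru)]
        un_ok = all(game[(alt, ru)][0] <= un_payoff for alt in ACTIONS)
        ru_ok = all(game[(un, alt)][1] <= ru_payoff for alt in ACTIONS)
        if un_ok and ru_ok:
            equilibria.append((un, ru))
    return equilibria


def is_dominant(game, player, action):
    """True if `action` is strictly better for `player` (0 = U.N., 1 = Russia)
    than every alternative, whatever the other player does."""
    for other in ACTIONS:
        for alt in ACTIONS:
            if alt == action:
                continue
            own = (action, other) if player == 0 else (other, action)
            dev = (alt, other) if player == 0 else (other, alt)
            if game[own][player] <= game[dev][player]:
                return False
    return True


print(pure_nash_equilibria(FIGURE_1))
print(is_dominant(FIGURE_1, 0, "D"), is_dominant(FIGURE_1, 1, "D"))
```

Under these payoffs the only profile that survives the unilateral-deviation check is (D, D), and D passes the dominance check for both players.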
While the prisoner’s dilemma game is commonplace in modelling
international relations, it is vital to consider alternatives. As noted in [Cor01],
the chicken game is a popular alternative stage game for modelling international
crises (p.194). However, there is a key structural component of the chicken
game that does not apply to the Black Sea Grain Initiative. In a chicken game
(Figure 2), the worst payoffs for both players occur at mutual defection. The
pure strategy Nash equilibria become (C,D) and (D,C). The implication of this
is that the U.N. and Russia would each rather be exploited by the other side than
have both sides not cooperate. If Russia chooses to defect (drop out of the deal),
the U.N. would obviously rather not give concessions. However, the equilibria
of the chicken game support the very opposite. In contrast, the prisoner’s
dilemma stage game reflects the strategic realities for the U.N. and Russia. This
is especially clear when examining each stage game’s equilibria.
First looking at (D, C) in the chicken game, the U.N. would not want to switch to
cooperating and Russia would not want to switch to defecting, as the payoffs would be worse. The
same concept applies to (C, D). As mentioned previously, both pure strategy
equilibria of the chicken game support the idea that Russia and the United Nations would
rather be taken advantage of than defect as well, which is very unrealistic. So
after examining the pure strategy Nash equilibria, the prisoner’s dilemma is
a more accurate representation of how Russia and the U.N. would
behave than the chicken game.
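The contrast between the two stage games comes down to each side's best response when the other defects. The sketch below, assuming the payoffs of Figures 1 and 2 and using illustrative names (`PRISONERS_DILEMMA`, `CHICKEN`, `un_best_response`, `russia_best_response`), compares those best responses directly.

```python
ACTIONS = ("C", "D")

# Payoffs keyed by (U.N. action, Russia action) -> (U.N. payoff, Russia payoff).
PRISONERS_DILEMMA = {
    ("C", "C"): (3, 3), ("C", "D"): (-2, 5),
    ("D", "C"): (4, -1), ("D", "D"): (-1, 0),
}
CHICKEN = {
    ("C", "C"): (3, 3), ("C", "D"): (-2, 5),
    ("D", "C"): (4, -1), ("D", "D"): (-5, -5),
}


def un_best_response(game, russia_action):
    """U.N. action that maximizes the U.N.'s payoff against a fixed Russian action."""
    return max(ACTIONS, key=lambda un: game[(un, russia_action)][0])


def russia_best_response(game, un_action):
    """Russian action that maximizes Russia's payoff against a fixed U.N. action."""
    return max(ACTIONS, key=lambda ru: game[(un_action, ru)][1])


for name, game in (("prisoner's dilemma", PRISONERS_DILEMMA), ("chicken", CHICKEN)):
    print(name, un_best_response(game, "D"), russia_best_response(game, "D"))
```

In the prisoner's dilemma both sides answer defection with defection, whereas in the chicken game both would answer defection by cooperating, which is the behaviour the text argues is unrealistic here.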
Because the Black Sea Grain Deal has finite extension lengths, interactions
surrounding renegotiation have already occurred multiple times. With no
end to the Ukraine-Russia conflict in the foreseeable future, an infinitely
repeated discounted game, in which each subsequent round’s payoffs are discounted by
a factor of δ ∈ (0, 1), seems to be an appropriate way to analyze the recent
tension around the Black Sea Grain Deal.

3.3 Key Definitions


In preparation for my game-theoretic analysis, certain mathematical
concepts must be defined. The first is an infinitely repeated game,
defined as in [Rub94]:
Let $G = \langle N, (A_i), (\succsim_i) \rangle$ be a strategic game, where $A_i$ is the set of
actions available to player $i \in N$; let $A = \times_{i \in N} A_i$ be the set of action
profiles (outcomes); and let $\succsim_i$ be player $i$’s preference relation on $A$. Applying this to
the prisoner’s dilemma stage game, Russia and the U.N. are the players in
$N$, and the action set $A_i$ for both Russia and the U.N. consists of cooperate and
defect.
An infinitely repeated game of $G$ is an extensive game with perfect
information and simultaneous moves $\langle N, H, P, (\succsim_i^*) \rangle$. $H$ is the set of histories, which
stores the sequence of actions played by all players. This is a fundamental
difference between a one-stage game and a repeated game. With one round, there
is no prior history to be considered. However, in a repeated game, players will
consider all of their previous moves along with everyone else’s to inform the
action they decide to take, widening the possible number of strategies. Therefore,
a history is necessary. In repeated games, the player function $P$ assigns to each
nonterminal history $h \in H$ the players who move next [Rub94]; with simultaneous
moves, every player moves after every nonterminal history.
While there are variations of an infinitely repeated game involving imperfect
monitoring or incomplete information that were considered, I concluded that a
model with complete information is more applicable to the Black Sea
Grain Initiative. This is because the U.N. and Russia can clearly observe what
the other side is doing. The U.N. can tell if Russia decides to cooperate with or defect
from the Black Sea Grain Deal, and Russia can tell if the U.N. has decided to
make concessions or not. Thus, it is reasonable to assume that both sides have complete
information on the histories of play in the infinitely repeated stage game model.
Utilizing an infinitely repeated game has significant implications for
equilibria. In contrast to a game played for only one stage, any mutually beneficial
outcome can be supported as an equilibrium when sufficiently patient players interact repeatedly.
This fact is formally stated in folk theorems [Kan08]. Several folk theorems
explore the idea of equilibria in infinitely repeated discounted games. One of the
more prevalent theorems is that any individually rational payoff profile can
be supported as an equilibrium if δ is close to 1. However, more specific folk
theorems have been developed. [GT20] references Fudenberg and Maskin’s (1986)
folk theorem: “If the number of players is 2 or if the set of feasible payoff vectors has
non-empty interior, then any payoff vector that is feasible and strictly
individually rational is a subgame perfect equilibrium of the discounted repeated game,
provided that players are sufficiently patient” [GT20]. Essentially, if the players
are patient enough, any feasible and strictly individually rational payoff can be supported
in equilibrium. Strictly individually rational payoffs for any player i are
those that exceed player i’s min-max payoff [GT20]; the
min-max payoff is the payoff a player can guarantee themselves in any
equilibrium, as explained by [Kan08], and so acts as a worst-case benchmark.
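To make the min-max benchmark concrete for the stage game of Figure 1, the sketch below computes each player's min-max payoff and checks that mutual cooperation is strictly individually rational. It restricts attention to pure actions, which is an assumption of this illustration, and the names `FIGURE_1` and `minmax_payoff` are hypothetical.

```python
ACTIONS = ("C", "D")

# Figure 1 payoffs, keyed by (U.N. action, Russia action) -> (U.N. payoff, Russia payoff).
FIGURE_1 = {
    ("C", "C"): (3, 3),
    ("C", "D"): (-2, 5),
    ("D", "C"): (4, -1),
    ("D", "D"): (-1, 0),
}


def minmax_payoff(game, player):
    """Pure-action min-max payoff for `player` (0 = U.N., 1 = Russia): the opponent
    picks the action that minimizes the best payoff `player` can then secure."""
    def payoff(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return game[profile][player]

    return min(max(payoff(own, other) for own in ACTIONS) for other in ACTIONS)


minmax = (minmax_payoff(FIGURE_1, 0), minmax_payoff(FIGURE_1, 1))
cooperation = FIGURE_1[("C", "C")]
print(minmax)
print(all(c > m for c, m in zip(cooperation, minmax)))
```

With these payoffs the min-max profile coincides with the mutual-defection payoffs, and the cooperative payoff (3, 3) exceeds it for both players, which is the condition the folk theorems require.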
The folk theorem is very broad and lacks predictive power about specific
equilibria. It merely suggests that any individually rational outcome could be
supported in equilibrium if the players are patient. Over time, the literature has explored
and established more specific folk theorems. [Pea91] explored several of these
folk theorems, of which Friedman’s (1971) theorem is especially pertinent to the
focus of this paper: “Let $G = (A_1, \ldots, A_N; \Pi_1, \ldots, \Pi_N)$ have a Nash equilibrium
$e = (e_1, \ldots, e_n) \in A$, and let $q = (q_1, \ldots, q_n) \in A$ satisfy $\Pi_i(q) > \Pi_i(e)$ for each
$i \in N$. Then for $\delta$ sufficiently close to 1, there is a sub-game perfect equilibrium
of $G^\infty(\delta)$ in which $q$ is played every period on the equilibrium path” (Pearce,
1991). $\Pi_i$ denotes the payoff of player $i$. Overall, the theorem is very
significant as it supports repeated mutual cooperation as a potential subgame
perfect equilibrium, depending on the players’ patience. Similar to the Nash
equilibrium in a one-stage game, subgame perfect equilibria are strong indicators
of final outcomes for infinitely repeated games. So for the infinitely repeated
prisoner’s dilemma stage game model, subgame perfect equilibria are key to
analyze.
As defined in [Rub94]: “a subgame perfect equilibrium of an extensive game
with perfect information $\langle N, H, P, (\succsim_i^*) \rangle$ is a strategy profile $s^*$ such that for every
player $i \in N$ and every nonterminal history $h \in H \setminus Z$ for which $P(h) = i$ we
have
$$O_h(s^*_{-i}|_h, s^*_i|_h) \succsim_i|_h O_h(s^*_{-i}|_h, s_i|_h)$$
for every strategy $s_i$ of player $i$ in the subgame $\Gamma(h)$” (p.97). $O_h$ represents
the outcome (the payoff) of a certain strategy profile. What the definition
conveys is that, in every subgame, the outcome of player $i$ following the strategy $s^*_i$ is at
least as good for player $i$ as the outcome of deviating to some other strategy $s_i$, holding every other
player’s strategy $s^*_{-i}$ fixed. Essentially, a strategy profile is a subgame perfect
equilibrium if and only if no single player has a profitable unilateral deviation
in any subgame. This is why subgame perfect Nash equilibria are
often said to rest on ‘credible threats’. This strongly applies to the Black Sea
Grain Initiative because both Russia and the U.N. have a threat to defect, which
would severely punish the other player compared to both sides cooperating.
In an infinitely repeated game, there is a wide range of possible strategies
varying in complexity. By the aforementioned folk theorems, it is possible
for mutual cooperation in every round to be a subgame perfect equilibrium in the
infinitely repeated discounted stage game model. There are several strategies
that aim at mutual cooperation, such as naively cooperating every round or
playing tit-for-tat, where player i plays the move their opponent played the
round before. However, a common strategy for achieving mutual cooperation is the
grim trigger strategy. The aim of a grim trigger is to use the threat of permanent
defection to enforce cooperation. A grim trigger strategy entails always choosing
to cooperate until the opposing player defects. Following that, the player using
a grim trigger defects forever, never returning to cooperation. Based on
Friedman’s folk theorem, it is possible for mutual cooperation, which offers a
higher payoff than the one-stage Nash equilibrium, to be subgame perfect. [Sla04] outlines
the strategy as follows:


$$s_i(h^t) = \begin{cases}
C & \text{if } t = 0 \\
C & \text{if } a^\tau = (C, C) \text{ for } \tau = 0, 1, \ldots, t - 1 \\
D & \text{otherwise}
\end{cases}$$
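The grim trigger rule can be written as a function of the history of play. The following is a small Python sketch under the assumption that a history is a list of (U.N. action, Russia action) profiles; `grim_trigger`, `always_defect`, and `play` are illustrative names, not taken from [Sla04].

```python
def grim_trigger(history):
    """Cooperate in period 0 and after an unbroken run of (C, C); otherwise defect."""
    return "C" if all(profile == ("C", "C") for profile in history) else "D"


def always_defect(history):
    """Comparison strategy: defect regardless of the history."""
    return "D"


def play(un_strategy, russia_strategy, periods):
    """Simulate `periods` rounds in which both players see the full history."""
    history = []
    for _ in range(periods):
        history.append((un_strategy(history), russia_strategy(history)))
    return history


print(play(grim_trigger, grim_trigger, 4))
print(play(grim_trigger, always_defect, 4))
```

When both sides use the grim trigger, play stays at (C, C) in every period; a single defection by Russia is followed by the U.N. defecting in all later periods.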

4 Results
To find whether mutual cooperation with a grim trigger is a subgame
perfect equilibrium, the payoffs of the strategy and its deviations need to be
considered. By the one-shot deviation principle, as long as there is a single
profitable one-shot deviation, a strategy profile cannot be subgame perfect [Rub94].
The payoff of always cooperating for either player is $\sum_{t=0}^{\infty} 3\delta^t$, where
$t$ increases by 1 with each stage of the game. For any $\delta \in (0, 1)$, $\sum_{t=0}^{\infty} \delta^t$
yields the discounted sum $\frac{1}{1-\delta}$. The U.N.’s payoff for cheating is $4 + \delta \sum_{t=0}^{\infty} (-1)\delta^t$.
Under the grim trigger, the U.N. would get a payoff of 4 in the round it cheats because Russia would
still cooperate while the U.N. defects. For future rounds, however, Russia would
defect forever, which means the U.N.’s optimal response would be to also defect
forever (yielding a payoff of -1 each round, which is discounted by a further factor of
δ each round). Russia’s payoff for cheating is $5 + \delta \sum_{t=0}^{\infty} \delta^t \cdot 0$, which is
simply 5. Under the grim trigger strategy, Russia would get a payoff of 5
because the U.N. would still cooperate. For future rounds, the U.N. would defect
forever, and Russia would do the same because defecting yields a higher payoff
than cooperating (0 vs -1). Russia would therefore receive a payoff of 0
for all future rounds.
For the grim trigger strategy to be an equilibrium, the payoff of always cooperating has
to be at least as great as the payoff of deviating by cheating for one round. This can be represented by
the following inequalities:

$$\sum_{t=0}^{\infty} 3\delta^t \ge 4 + \delta \sum_{t=0}^{\infty} (-1)\delta^t$$

Figure 3. Inequality required to be met to support the U.N.’s grim trigger strategy

$$\sum_{t=0}^{\infty} 3\delta^t \ge 5 + \delta \sum_{t=0}^{\infty} \delta^t \cdot 0$$

Figure 4. Inequality required to be met to support Russia’s grim trigger strategy
Using the discounted sum, the U.N.’s inequality (Figure 3) becomes
$\frac{3}{1-\delta} \ge 4 - \frac{\delta}{1-\delta}$. Multiplying both sides by $(1 - \delta)$ yields $3 \ge (4 - 4\delta) - \delta$, which is
equivalent to $3 \ge 4 - 5\delta$. Rearranging the inequality yields $5\delta \ge 1$. The solution
to the inequality is $\delta \ge \frac{1}{5}$. This means that δ for the U.N. must be at least 1/5
for the grim trigger to be supported. Unless the U.N. is extremely impatient,
with little care for the future, it will follow a grim trigger strategy.
Applying the same process to Russia (Figure 4), the summation on the right
side of the inequality ($\sum_{t=0}^{\infty} \delta^t \cdot 0$) simply becomes 0, leaving $\frac{3}{1-\delta} \ge 5$. Multiplying both sides
by $(1 - \delta)$ gives $3 \ge 5 - 5\delta$. Rearranging the inequality yields $5\delta \ge 2$. The
solution to the inequality is $\delta \ge \frac{2}{5}$ for Russia. Compared to the U.N., Russia
requires a higher δ value for the grim trigger strategy to be supported.
Overall, for the grim trigger strategy to be a subgame perfect
equilibrium, it needs to be followed by both players. Since Russia has the higher
threshold at $\delta \ge \frac{2}{5}$, $\frac{2}{5}$ is the minimum δ value for the grim trigger to be a
supported equilibrium. If, however, δ goes below $\frac{2}{5}$, the grim trigger is not a
supported equilibrium, as cheating becomes a profitable deviation for Russia.
Based on the findings of the model analysis, the stability of mutual
cooperation under a grim trigger depends largely on Russia’s patience, which is
reflected by the discount factor. As noted above, the δ value could satisfy
the U.N.’s inequality while not satisfying Russia’s. The moment the discount
factor goes below $\frac{2}{5}$, it is in Russia’s best interest to cheat the U.N. and
permanently drop out of the deal. In my model, the key to understanding the recent
developments of the Black Sea Grain Initiative is dissecting how the value of δ
can change.
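Both thresholds follow from one rearrangement: writing R for the mutual-cooperation payoff, T for the one-round payoff from cheating, and P for the mutual-defection payoff, the condition R/(1 − δ) ≥ T + δP/(1 − δ) is equivalent to δ ≥ (T − R)/(T − P) whenever T > P. The Python sketch below checks this with exact fractions; the labels R, T, P and the function names are notation introduced here for illustration, not the paper's.

```python
from fractions import Fraction


def critical_discount_factor(reward, temptation, punishment):
    """Smallest delta satisfying reward/(1-delta) >= temptation + delta*punishment/(1-delta)."""
    return Fraction(temptation - reward, temptation - punishment)


def cooperation_value(reward, delta):
    """Discounted payoff from mutual cooperation in every period."""
    return reward / (1 - delta)


def deviation_value(temptation, punishment, delta):
    """Discounted payoff from cheating once and facing mutual defection forever after."""
    return temptation + delta * punishment / (1 - delta)


un_threshold = critical_discount_factor(reward=3, temptation=4, punishment=-1)
russia_threshold = critical_discount_factor(reward=3, temptation=5, punishment=0)
print(un_threshold, russia_threshold)

# A discount factor between the two thresholds: the U.N. still prefers to cooperate, Russia does not.
delta = Fraction(3, 10)
un_cooperates = cooperation_value(3, delta) >= deviation_value(4, -1, delta)
russia_cooperates = cooperation_value(3, delta) >= deviation_value(5, 0, delta)
print(un_cooperates, russia_cooperates)
```

The value δ = 3/10 is a hypothetical choice used only to show a case in which the U.N.'s inequality holds while Russia's fails.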

5 Practical Implications
First, it is important to understand what δ represents. δ is the factor
by which future payoffs are discounted. A higher δ means future payoffs are
more valuable relative to present payoffs. Extending this
idea, δ represents the value placed on the future relative to the present. If δ
were to equal 1, future payoffs would be as valuable as
those in the present. A δ of 0 implies the future has no value. Since the discount
factor represents the value of the future, one should consider the possibility of
it varying. This variation can be driven by real-world context. Russia has
been embroiled in a war with Ukraine for the last year and a half. As a war
drags on, a country may come to care more about the present than the future. The war is
causing Russia to divert more resources to the present, meaning it places less value on future
resources. This would explain a falling discount factor. This pattern
has been observed since the deal’s inception in July 2022. With each term limit,
Russia was more reluctant to extend. This was especially apparent between
March and May 2023, when Russia agreed only to a 60-day extension, half of
the original 120-day term. Russia permanently backing
out can be explained by its discount factor dropping below the
threshold, so that the grim trigger strategy no longer holds as an equilibrium.
Another factor that could change the δ thresholds that support equilibrium
is a change in payoffs. During repeated renegotiation and as time elapses,
payoffs can change [Jer88]. For example, if the payoff for cheating
the other player increased, both countries would have a stronger incentive to
deviate, thus requiring a higher δ to keep them cooperating. An example
would be Russia being more incentivized to cheat the U.N. and never return to
the Black Sea Grain Deal. If Russia’s payoff for defecting increased by some
value ϵ, where ϵ > 0, the new inequality for Russia to mutually cooperate
becomes

$$\sum_{t=0}^{\infty} 3\delta^t \ge (5 + \epsilon) + \delta \sum_{t=0}^{\infty} \delta^t (0 + \epsilon)$$

Using the discounted sum, the inequality becomes $\frac{3}{1-\delta} \ge (5 + \epsilon) + \delta\left(\frac{\epsilon}{1-\delta}\right)$.
Multiplying both sides by $(1 - \delta)$ yields $3 \ge 5 - 5\delta + \epsilon - \delta\epsilon + \delta\epsilon$. Simplifying,
the inequality becomes $5\delta \ge 2 + \epsilon$. The solution to the inequality is $\delta \ge \frac{2+\epsilon}{5}$.
The minimum δ for the grim trigger to be stable increases by $\frac{\epsilon}{5}$.
Since the U.N.’s goal is to establish lasting, mutual cooperation under the
Black Sea Grain Initiative, it may offer Russia more and more concessions over
time, increasing Russia’s payoff for cooperating by some ϵ > 0. This
changes the grim-trigger inequality for Russia to

\[
\sum_{t=0}^{\infty} (3+\epsilon)\,\delta^{t} \;\ge\; 5 + \delta \sum_{t=0}^{\infty} \delta^{t} \cdot 0
\]

Using the discounted sum, the inequality becomes
$\frac{3+\epsilon}{1-\delta} \ge 5$. Multiplying both sides by (1 − δ) yields
3 + ϵ ≥ 5 − 5δ. Rearranging yields the inequality 5δ ≥ 2 − ϵ. The solution to
the inequality is $\delta \ge \frac{2-\epsilon}{5}$. By increasing Russia’s
payoff for cooperation by ϵ, the minimum δ decreases by $\frac{\epsilon}{5}$.
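
Both threshold calculations above follow the same pattern, which can be checked
with a short Python sketch. The helper name min_delta and the sample value
ϵ = 1/2 are illustrative assumptions and not part of the original analysis; the
payoffs 3, 5, and 0 are the ones used in the inequalities above.

```python
from fractions import Fraction

def min_delta(cooperate, deviate, punish):
    """Smallest discount factor at which grim trigger sustains cooperation.

    Solves cooperate/(1-d) >= deviate + d*punish/(1-d) for d, which gives
    d >= (deviate - cooperate) / (deviate - punish).
    """
    return Fraction(deviate - cooperate) / Fraction(deviate - punish)

eps = Fraction(1, 2)  # hypothetical payoff change, chosen only for illustration

baseline = min_delta(3, 5, 0)                   # original stage-game payoffs
more_tempting = min_delta(3, 5 + eps, 0 + eps)  # deviation payoffs raised by eps
more_rewarding = min_delta(3 + eps, 5, 0)       # cooperation payoff raised by eps

print(baseline, more_tempting, more_rewarding)
```

Exact fractions are used so that the shifts of ϵ/5 in either direction can be
compared without rounding error.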

6 Limitations and Possible Extensions of the Game-
Theoretic Approach
When utilizing game theory as an analytical tool, great care and caution are
needed. Steven J. Brams addresses this in his 2000 paper. Of the common issues
[Bra00] highlights, two apply most to the model outlined in this paper:
misspecifying the rules and confusing goals with rational choice. [Bra00]
emphasizes that the rules of a game-theoretic model should reflect how the
players would realistically act in the situation being modeled (p. 222).
Another point [Bra00] highlights is that goals and rationality are not the
same. For example, a change in strategy from short-term to long-term is not a
change in rationality but rather a change in goals with the same underlying
rationality (p. 222). A one-stage game would not realistically capture the
decision-making of a large country or an international governing body (Russia
and the U.N.). Through the use of an infinite-stage game, the long-term lens
that Russia and the U.N. bring to decision making is reflected by the model.
The model setup and analysis assumed stable payoffs. While the payoffs of
the repeated stage game were discounted by δ, the stage game payoffs themselves
remained fixed throughout. As [Jer88] notes, preferences evolve over time. In
a constantly changing international environment, preferences, and subsequently
payoffs, are likely to change. While this was briefly explored in the Practical
Implications by increasing Russia’s payoffs to defect and cooperate by some
positive epsilon, more research needs to be done to exactly quantify the payoffs
and how they evolve over time.
While the infinitely repeated Prisoner’s dilemma provides a relatively com-
prehensive analysis of the current developments surrounding the Black Sea Grain
Initiative, future developments may require an adjustment or rethinking of the
model. Even now, there may be opportunities to expand and advance the current
model. While doing so, it is important to remember the core purpose of
game-theoretic models in international relations: to provide a structure,
aligned with players’ realistic thinking and actions, that can be analyzed and
studied. Adding to or reinventing the model should only be done after extensive
and thorough consideration.

7 Conclusion
This paper provided a game-theoretic analysis of the Black Sea Grain
Initiative using an infinitely repeated Prisoner’s Dilemma stage game. It is
possible to structure a model such that cooperation primarily depends on
Russia’s patience. The war in Ukraine, carrying on for over a year and a half,
has decreased Russia’s valuation of the future. Slight increases in the payoff
to defect can increase the minimum δ for Russia to cooperate. However, the U.N.
can offer more concessions to increase Russia’s payoff for continued
cooperation, thereby decreasing the minimum δ required. These findings outline
the cause of the breakdown: a lack of patience on Russia’s part. This lack of
patience (a low valuation of the future) makes Russia unwilling to extend the
Deal, as it is no longer as beneficial to Russia. The findings also point to a
way to preserve cooperation around the Black Sea Grain Deal: offering more
concessions to Russia to incentivize a return to it. For Russia to cooperate,
however, the U.N. has to offer enough concessions that renewing the Deal is
beneficial to Russia even at a lower level of patience.

References
[Bon23] Courtney Bonnell. Russia halts landmark deal that allowed Ukraine
to export grain at time of growing hunger. AP News, 2023.

[Bra00] Steven J. Brams. Game theory: Pitfalls and opportunities in applying
it to international relations. International Studies Perspectives, 2000.

[Cor01] Hector Correa. Game theory as an instrument for the analysis of
international relations. Ritsumeikan Annual Review of International
Studies, 2001.

[GT20] Olivier Gossner and Tristan Tomala. Repeated games with complete
information. Complex Social and Behavioral Systems: Game Theory
and Agent-Based Models, 2020.

[HS23] Nigel Hunt and Jonathan Saul. Black Sea grain deal: What’s next
now that Russia has pulled out?, 2023.

[Jer88] Robert Jervis. Realism, game theory, and cooperation. World Politics,
1988.

[Kan08] Michihiro Kandori. Repeated games. The New Palgrave Dictionary
of Economics, 2nd Edition, Palgrave Macmillan, 2008.

[Nic23] Michelle Nichols. Russia could be ready for Black Sea grain deal talks,
but no evidence yet, US says. Reuters, 2023.

[OEC22] OECD. The impacts and policy implications of Russia’s aggression
against Ukraine on agricultural markets. OECD, 2022.

[Pea91] David G. Pearce. Repeated games: Cooperation and rationality.
Cowles Foundation for Research in Economics, Yale University, 1991.

[Ped23] Raul (Pete) Pedrozo. The Black Sea Grain Initiative: Russia’s strategic
blunder or diplomatic coup? International Law Studies, 2023.

[Rub94] Martin J. Osborne and Ariel Rubinstein. A course in game theory. MIT
Press, 1994.

[Sla04] Branislav L. Slantchev. Game theory: Repeated games. Department
of Political Science, University of California-San Diego, 2004.

[UNC22] UNCTAD. The Black Sea Grain Initiative: What it is, and why it’s
important for the world, 2022.

[Zag14] Frank Zagare. A game-theoretic history of the Cuban missile crisis.
Economies, 2014.

Automated Pneumonia Detection From Chest
X-ray Images Using Machine Learning

Lurvïsh Polodoo
October 17, 2023

Abstract
In this data science project, pneumonia detection was addressed using
Convolutional Neural Networks (CNNs) applied to chest X-ray images.
With the advancement of deep learning techniques, CNNs have emerged
as a powerful tool for image classification tasks. By leveraging the capa-
bilities of CNNs, this research aims to develop a robust and automated ap-
proach to classifying pneumonia from chest X-ray images, enabling timely
and accurate diagnosis. The study includes comprehensive dataset de-
tails, explores supervised learning principles, and delves into binary clas-
sification techniques. Additionally, the research thoroughly examines the
impact of different image dimensions on the model’s performance, while
utilizing regularization to prevent overfitting. The developed CNN model
achieves high accuracy on both the training and validation datasets, show-
casing its potential in pneumonia detection. In addition to the technical
aspects, potential applications in medical imaging are highlighted, lim-
itations are addressed, and areas for improvement are proposed in this
research. While the CNN model shows promise, it is designed as a valu-
able aid to medical professionals, enhancing early detection and screening
processes.

1 Introduction
Pneumonia is a common and potentially life-threatening respiratory infection
that disproportionately affects young children, leading to a significant number
of deaths globally. In 2019, it claimed the lives of 740,180 children under the
age of 5, accounting for 14% of all deaths in this age group [Wor22].
Early detection and accurate diagnosis are crucial for effective treatment and
management of this disease. To combat this pressing public health challenge,
this research focuses on employing advanced machine-learning techniques to
improve the efficiency and reliability of pneumonia detection from chest X-ray
images.
∗ Advised by: Guillermo Goldsztein, Georgia Institute of Technology

With the rapid progress in deep learning and CNNs, it is believed that an
automated approach can assist medical professionals in the early detection of
pneumonia cases, thereby reducing the risk of complications and saving lives.
The indispensable role of healthcare experts in diagnosis is acknowledged, and
it is emphasized that the model serves as a valuable tool to complement their
expertise, rather than replace it. In section 2, a detailed account of the imple-
mentation of the Convolutional Neural Network (CNN) model for pneumonia
detection using chest X-ray images is provided. It covers aspects such as the
dataset description and source, supervised learning, binary classification, and
the architecture of neural networks. Additionally, it explores the concept of
image preprocessing, specifically investigating the impact of different image di-
mensions on the model’s performance.
Section 3 discusses model training, including the concepts of generalization
error and overfitting. It emphasizes the regularization techniques applied to
prevent overfitting and improve the model’s ability to generalize to unseen
data. Section 4 then analyzes the model’s performance, with a focus on the
accuracy of the pneumonia detection model.
Furthermore, this research explores the potential applications of the de-
veloped CNN model for pneumonia detection in medical imaging. Section 5
highlights the model’s significance in early pneumonia detection as well as its
limitations and potential areas for improvement. It emphasizes the importance
of conducting clinical validation studies to ensure real-world effectiveness and
safety. Additionally, the ethical implications of deploying AI models in health-
care are acknowledged, focusing on privacy, biases, and the ethical responsibility
of AI as a complementary tool to medical professionals’ expertise.
This project holds great promise in the field of medical imaging and has
the potential to significantly impact healthcare by improving the efficiency and
reliability of pneumonia detection.

2 Methodology
In this section, the implementation of the Convolutional Neural Network (CNN)
model for pneumonia detection is described. The full code implementation
is available on Kaggle [Pol23], and it includes the model architecture, data
preprocessing, and training process.

2.1 Dataset Description and Source


The dataset used in this research originates from Kermany, Zhang, and Goldbaum
and comprises 5,840 labeled chest X-ray images prepared for classification
[KZG18]. Of these, 5,216 images were used for training and 624 were reserved
for testing. The training set comprises 3,875 pneumonia images and 1,341 normal
images, while the testing set includes 390 pneumonia and 234 normal images.
Notably, the training dataset exhibits class imbalance, with more pneumonia
cases than normal cases. Class imbalance can impact the model’s performance,
leading to biased predictions. To address this, techniques like data
augmentation, resampling, or class weights can be explored [Jap01]. By
mitigating class imbalance and refining the approach, AI-assisted systems for
pneumonia detection can become more reliable and accurate, enhancing healthcare
diagnostics.
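
As a rough illustration of the class-weight option, the sketch below derives
weights from the training counts reported above. The "balanced" heuristic
(total / (number of classes × class count)) is an assumption chosen for
illustration; the paper lists class weights only as a technique that can be
explored and does not state that this weighting was used.

```python
# Training-set counts reported in the dataset description above.
counts = {"normal": 1341, "pneumonia": 3875}

def balanced_class_weights(class_counts):
    """Weight each class by total / (n_classes * class_count).

    A common "balanced" heuristic: the rarer class gets the larger weight.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {name: total / (n_classes * n) for name, n in class_counts.items()}

weights = balanced_class_weights(counts)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

With these weights, each class contributes the same total weight to the loss,
so the majority pneumonia class no longer dominates training.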

2.2 Supervised Learning and Binary Classification


Delving into the foundational concepts that form the basis of the pneumonia
detection project, this section offers an overview of supervised learning and
binary classification. These fundamental principles play a crucial role in the
development of a precise and automated model for pneumonia detection from
chest X-ray images.

2.2.1 Supervised Learning


Machine learning is a branch of artificial intelligence that focuses on enabling
computers to learn from data and make predictions or decisions without explicit
programming [Int]. Supervised learning is a type of machine learning where
the algorithm learns from labeled data, meaning the input data is paired with
corresponding output labels. In this case, for each chest X-ray image in the
dataset, a binary label is assigned: 1 for cases with pneumonia and 0 for normal
(non-pneumonia) cases.
The goal of supervised learning is to develop a model that can accurately
predict the correct label (in this case, pneumonia or normal) for new, unseen
data. During the training process, the model learns patterns and features from
the labeled examples, enabling it to make predictions on new, unlabeled data.

2.2.2 Binary Classification


Binary classification is a specific type of supervised learning where the algo-
rithm’s task is to categorize input data into one of two possible classes [Mar].
In this project, the goal is to classify chest X-ray images into two categories:
pneumonia and normal (non-pneumonia) conditions. This binary classification
task is particularly relevant for pneumonia detection as it determines whether
a patient’s X-ray indicates the presence or absence of pneumonia.
Several machine learning algorithms can be used for binary classification, in-
cluding Support Vector Machines (SVMs), Naive Bayes, Decision Trees, Logistic
Regression, and Neural Networks. In this research, neural networks, specifically
Convolutional Neural Networks (CNNs), are employed, which have shown ex-
ceptional performance in image classification tasks.
By utilizing binary classification with supervised learning, the aim is to
develop a powerful and automated model that can accurately detect pneumonia
from chest X-ray images, providing valuable support to medical professionals in
their diagnostic process.

2.3 Neural Networks


Neural networks, the core architecture behind Convolutional Neural Networks
(CNNs), can be likened to the interconnected network of neurons in the human
brain. Just as the human brain consists of billions of interconnected neurons
that work together to process and analyze information, neural networks are
composed of interconnected artificial neurons, known as nodes.

Figure 1: Structure of an artificial neural network [Sof]

In this analogy, each artificial neuron in a neural network can be seen as
a simplified version of a biological neuron. Similar to how biological neurons
transmit electrical signals and communicate with one another through synapses,
artificial neurons receive input signals, perform computations, and transmit
output signals to other neurons within the network [Kro08].
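
The computation performed by a single artificial neuron can be sketched as a
weighted sum of its inputs plus a bias, passed through an activation function.
The input values, weights, and bias below are hypothetical numbers chosen only
to make the arithmetic easy to follow; they are not taken from the trained
model.

```python
def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

def identity(z):
    """No-op activation, used here to expose the raw weighted sum."""
    return z

# Hypothetical values, for illustration only.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.25]
bias = 0.5

# 1.0*0.5 + 2.0*(-1.0) + 3.0*0.25 + 0.5 = -0.25
print(neuron_output(inputs, weights, bias, identity))
```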

2.4 Activation Functions


Activation functions are essential in artificial neural networks as they introduce
non-linearity to the model’s output [RAS20]. Without activation functions, the
neural network would behave like a linear model, severely limiting its ability to
learn complex patterns and relationships in the data.
By incorporating activation functions, the neural network can transform the
output of a neuron in a non-linear way, enabling it to handle sophisticated tasks
like image and speech recognition, natural language processing, and complex
pattern recognition [GWK+ 18]. In the model, ReLU (Rectified Linear Unit)
and Sigmoid activation functions were employed as the key components in the
neural network’s architecture.

2.4.1 ReLU Activation Function


The ReLU activation function returns 0 for any negative input (x < 0) and
returns the input value itself for any non-negative input (x ≥ 0).
Mathematically, it can be defined as:

\[
f(x) = \max(0, x)
\]

So, if the input x is negative, the ReLU function outputs 0, and if the
input x is positive (or equal to 0), it outputs x. This simple function
introduces non-linearity to the neural network, which is essential for enabling
the model to learn complex patterns and perform well in various tasks,
including image classification.
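
The definition translates directly into code. The following is a minimal
sketch for scalar inputs; deep learning libraries apply the same rule
element-wise to whole arrays.

```python
def relu(x):
    """Return x for non-negative inputs and 0 otherwise."""
    return max(0.0, x)

for value in (-2.0, 0.0, 3.5):
    print(value, "->", relu(value))
```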

Figure 2: Graph of the ReLU Activation Function [Liu17]

By incorporating the ReLU activation function in the hidden layers of the
neural network, essential features from the input data are captured, enabling the
model to improve its ability to detect pneumonia accurately from chest X-ray
images.

2.4.2 Sigmoid Activation Function


In addition to the ReLU activation function used in the hidden layers, the sig-
moid activation function was employed in the output layer of the neural network.
The sigmoid function is commonly used for binary classification tasks, where the
goal is to categorize data into one of two classes. It scales the output values to
a range between 0 and 1, which is suitable for representing probabilities.
Mathematically, the sigmoid activation function can be defined as:

\[
f(x) = \frac{1}{1 + e^{-x}}
\]

The sigmoid function is particularly well-suited for the pneumonia detection
task, as it allows the model to output a probability score indicating the likelihood
that a given chest X-ray image belongs to the pneumonia class. Values close to
0 indicate low probability, while values close to 1 indicate high probability.
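
The sigmoid mapping, and the step of turning its output into a class label,
can be sketched as follows. The 0.5 decision threshold and the function names
are assumptions made for illustration; the paper does not state the threshold
used.

```python
import math

def sigmoid(x):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_label(probability, threshold=0.5):
    """Convert a probability into a class: 1 = pneumonia, 0 = normal."""
    return 1 if probability >= threshold else 0

# Hypothetical raw outputs of the final layer, for illustration only.
for raw_output in (-4.0, 0.0, 4.0):
    p = sigmoid(raw_output)
    print(f"{raw_output:+.1f} -> p = {p:.3f}, label = {predict_label(p)}")
```

In a screening setting the threshold could be lowered to favor sensitivity,
at the cost of more false positives.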

Figure 3: Graph of the Sigmoid Function [Pan19]

By incorporating the sigmoid activation function in the output layer, the
model gains the ability to generate probability-based predictions for individual
chest X-ray images. This characteristic renders it particularly well-suited for
binary classification, where the objective is to categorize X-ray images as either
indicative of pneumonia (1) or displaying normal conditions (0).

2.5 Image Preprocessing


This research investigated the impact of image dimensions on the performance
of the pneumonia detection model. Experiments were conducted with three
different image dimensions: 1272 x 1592 pixels (original dimension), 250 x 250
pixels, and 50 x 50 pixels. The principal goal was to strike a balance between
accuracy and efficiency within the classification model.

Figure 4: Comparison of Experiment Images with Varying Dimensions

Upon resizing the images to a dimension of 50 x 50 pixels, a noticeable in-
crease in blurriness was observed. This blurriness adversely affected the quality
of the images and resulted in the omission of crucial pneumonia-related in-
formation. As a consequence, the model encountered challenges in accurately
detecting pneumonia cases with this excessively low dimension.
To address this concern, a higher image resolution was tried. Experiments with
an image dimension of 250 x 250 pixels showed improvements over the 50 x 50
version. Some information loss persisted relative to the original images, but
the differences between the original and 250 x 250 images were difficult to
discern with the naked eye, suggesting that this dimension strikes a practical
balance between image quality and computational efficiency.
It is essential to acknowledge that using excessively high dimensions may sig-
nificantly prolong the training time of our model, affecting the overall efficiency
of obtaining results. Conversely, choosing dimensions that are too low can lead
to the loss of vital information crucial for the accurate detection of pneumonia.
In light of the experimentation, an image dimension of 250 x 250 pixels
was ultimately chosen as it offered a favorable trade-off between accuracy and
efficiency. It is important to recognize that the process of image resizing involves
a delicate compromise, where the preservation of vital information is balanced
with minimizing the computational complexity of the model. The choice of
an appropriate image dimension was made with the intention of optimizing the
performance of the pneumonia detection model, all the while upholding practical
feasibility.
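
The computational side of this trade-off can be made concrete by counting the
input values per image at each of the three dimensions. The sketch assumes
single-channel (grayscale) images, which the paper does not state; with three
color channels every count would be three times larger.

```python
# The three image dimensions compared in the experiments above.
sizes = {"original": (1272, 1592), "medium": (250, 250), "small": (50, 50)}

def pixel_count(shape):
    """Number of pixel values in a single-channel image of the given shape."""
    first_dim, second_dim = shape
    return first_dim * second_dim

reference = pixel_count(sizes["medium"])
for name, shape in sizes.items():
    n = pixel_count(shape)
    print(f"{name}: {n:,} values ({n / reference:.2f}x the 250 x 250 input)")
```

The original images carry roughly 32 times as many input values as the
250 x 250 version, which is one reason training on them takes much longer.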

3 Model Training
During the training process of the pneumonia detection model, two critical
concepts were encountered: generalization error and overfitting. These concepts
are essential in machine learning as they directly impact the model’s ability to
perform well on new, unseen data.

3.1 Generalization Error


The generalization error, also known as the out-of-sample error, refers to the
difference between the model’s performance on the training data and its perfor-
mance on new, unseen data [LLQ19]. In other words, it measures how well the
trained model can make accurate predictions on data it has never seen before.
The ultimate goal of machine learning is to develop a model that generalizes
well, making reliable predictions on real-world data.
The significance of generalization error lies in its impact on the model’s
practical usability. A model with low generalization error is more likely to
perform well in real-world applications, providing valuable insights and
supporting decision-making processes. In contrast, a model with high
generalization error may yield unreliable and inaccurate predictions, limiting
its practicality and effectiveness.

3.2 Overfitting
Overfitting is a common issue encountered during the training of machine learn-
ing models. It occurs when a model performs exceptionally well on the training
data but fails to generalize effectively to new, unseen data. In essence, the model
becomes too complex and starts memorizing the noise and outliers present in
the training data, instead of learning the essential patterns.
When a model overfits, it loses its ability to generalize, leading to poor
performance on test data. Overfitting is particularly problematic in image clas-
sification tasks, as the model may learn to recognize specific features present
in the training images rather than capturing the essential characteristics of the
disease it is supposed to detect.

3.3 Regularization as a Technique to Prevent Overfitting


Regularization is a powerful technique used to prevent overfitting and improve
the generalization performance of machine learning models. It introduces ad-
ditional constraints on the model’s weights during training, discouraging the
model from becoming overly complex and overfitting to the training data.

3.4 L2 Regularization (Weight Decay)


L2 regularization, also known as weight decay, is a common form of regular-
ization used in neural networks. It involves adding a penalty term to the loss
function based on the magnitudes of the model’s weights.
Adding the L2 regularization term to the loss function incentivizes the model
to use smaller weight values, as larger weights would result in higher penalty and
loss. This helps prevent the model from relying too heavily on specific training
examples and encourages it to learn more general patterns from the data.
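
The effect of the penalty term can be sketched with the textbook form of L2
regularization: the strength multiplied by the sum of squared weights, added to
the data loss. The weights and loss value below are hypothetical, and the exact
form used in the paper’s code (for example, which layers are penalized) is not
given here, so this is only an illustration of the principle.

```python
def l2_penalty(weights, strength):
    """L2 penalty: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, strength):
    """Total training objective: data loss plus the L2 penalty."""
    return data_loss + l2_penalty(weights, strength)

# Hypothetical weights and loss value, for illustration only.
small_weights = [0.5, -0.5, 0.5]
large_weights = [2.0, -2.0, 2.0]
strength = 0.02

print(regularized_loss(0.3, small_weights, strength))
print(regularized_loss(0.3, large_weights, strength))
```

For the same data loss, the larger weights produce a larger total loss, so
gradient-based training is nudged toward smaller weights.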

(a) Without Regularization (b) With Regularization

Figure 5: Training and Validation Accuracy with and without Regularization

The figure illustrates the impact of regularization on the model’s perfor-
mance. Without regularization, the training accuracy rapidly reaches a perfect
score of 1.00, while the validation accuracy struggles to surpass 75%. This dis-
crepancy between training and validation accuracies is a strong indication of
overfitting, where the model becomes too specialized in fitting the training data
but fails to generalize well to new, unseen data.
However, by introducing regularization with a strength of 0.02, the model’s
ability to generalize improves significantly. The training accuracy remains high,
close to 95%, while the validation accuracy also experiences a substantial boost.

4 Model Evaluation and Accuracy


After implementing regularization techniques to address overfitting, the perfor-
mance of the pneumonia detection model was evaluated. The evaluation pro-
cess involved assessing the model’s accuracy on both the training and validation
datasets.

4.1 Training Accuracy


The training accuracy refers to the accuracy of the model on the training dataset
during the training process. It provides insights into how well the model has
learned from the labeled data and how effectively it can classify pneumonia and
normal (non-pneumonia) cases from chest X-ray images.
During the training process, the model’s weights are updated based on the
training data, and it attempts to minimize the loss function by making accurate
predictions. The training accuracy is calculated as the proportion of correctly
classified samples from the training dataset.

4.2 Validation Accuracy


The validation accuracy, on the other hand, measures the accuracy of the model
on a separate dataset called the validation dataset. This dataset is not used
during the training process but serves as an unseen set of examples for evaluating
the model’s generalization performance.
The validation accuracy is essential for assessing whether the model can
generalize well to new, unseen data. If the validation accuracy is significantly
lower than the training accuracy, it may indicate overfitting, where the model
is memorizing the training data without generalizing well to new instances.

4.3 Accuracy Results


After training the model with different L2 regularization strengths (0.01, 0.02,
and 0.05), the following validation accuracies were obtained:

L2 Regularization Strength    Validation Accuracy
0.01                          67.63%
0.02                          81.57%
0.05                          69.39%

Table 1: Effect of L2 Regularization Strength on Validation Accuracy

Among the three regularization settings, L2 = 0.02 clearly outperformed the
others with the highest validation accuracy of 81.57% [?]. Therefore, L2 = 0.02
was selected as the optimal regularization strength for our CNN model, as it
demonstrated superior generalization to new, unseen chest X-ray images.
To illustrate, if the model correctly predicted 8,157 out of 10,000 samples in
the validation dataset, the accuracy would be calculated as follows:

\[
\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of samples}} \times 100
= \frac{8157}{10000} \times 100 = 81.57\%
\]

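
The same calculation can be written as a small function that compares
predicted labels with true labels. The toy label lists are hypothetical and
serve only to show the mechanics.

```python
def accuracy_percent(predictions, labels):
    """Percentage of predictions that match the true labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)

# Hypothetical toy data: 1 = pneumonia, 0 = normal.
labels      = [1, 1, 0, 1, 0, 0, 1, 1]
predictions = [1, 0, 0, 1, 0, 1, 1, 1]

print(accuracy_percent(predictions, labels))  # 6 of 8 correct
```
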
With rigorous training and applying L2 regularization of 0.02, along with
optimized image dimensions of 250 x 250 pixels, the model achieved the following
accuracy:

Accuracy               Value
Training Accuracy      93.96%
Validation Accuracy    81.57%

Table 2: Model’s accuracy for training and validation sets

These results showcase the model’s effective learning from the labeled data,
as evidenced by its high accuracy on the training dataset. Additionally, the
relatively high validation accuracy further demonstrates the model’s capability
to generalize well to previously unseen chest X-ray images.
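
One simple way to read Table 2 is to look at the gap between the two
accuracies, since a large gap is the overfitting signal described in Section
4.2. The sketch below only computes that gap; what counts as a "large" gap is a
judgment call, and no threshold is given in this paper.

```python
# Accuracies (in percent) reported in Table 2.
train_accuracy = 93.96
validation_accuracy = 81.57

def accuracy_gap(train, validation):
    """Training accuracy minus validation accuracy, in percentage points."""
    return train - validation

gap = accuracy_gap(train_accuracy, validation_accuracy)
print(f"gap: {gap:.2f} percentage points")
```
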
With these promising outcomes, the Convolutional Neural Network (CNN)
model holds substantial potential for advancing pneumonia detection in medical
imaging, promising more accurate and reliable diagnoses in the field. These
results open new avenues for further research and application of the model in
real-world medical scenarios, bringing us one step closer to enhanced healthcare
outcomes.

5 Applications, Limitations, and Potential
Improvements
5.1 Applications of the Model
The developed Convolutional Neural Network (CNN) model for pneumonia clas-
sification using chest X-ray images has several potential applications in the field
of medical imaging. Some practical implications and potential uses of the model,
highlighting its significance in improving healthcare outcomes, are:
• Early Pneumonia Detection:
Timely and accurate diagnosis of pneumonia is crucial for effective treat-
ment and patient management. The CNN model can be utilized as a screen-
ing tool to assist radiologists and healthcare professionals in the early de-
tection of pneumonia [KC21]. By automating the classification process, the
model can expedite the identification of pneumonia cases, enabling prompt
intervention and reducing the risk of complications.
• Support for Clinical Decision-Making:
The CNN model can serve as an aid in clinical decision-making processes.
By providing an objective analysis of chest X-ray images, the model can
assist healthcare professionals in their diagnostic assessments [Sez23]. The
predictions made by the model can be used as a valuable reference, help-
ing physicians validate their initial interpretations and improve diagnostic
accuracy.
• Telemedicine and Remote Areas:
In remote areas or regions with limited access to healthcare facilities, the
CNN model can be employed as a diagnostic tool. By transmitting chest
X-ray images to a centralized location, the model can analyze and classify
the images remotely. This telemedicine application can bridge the gap
in healthcare services, providing access to expert opinions and facilitating
prompt diagnosis, even in underserved regions.
• Education and Training:
The CNN model can also be utilized as an educational tool for medical
students and healthcare professionals. By providing annotated predictions,
the model can aid in the learning process, allowing individuals to compare
their assessments with the model’s classifications. This interactive learning
approach can enhance the understanding of pneumonia patterns in chest
X-ray images and improve diagnostic skills.

5.2 Addressing the Model’s Limitations and Potential
Areas for Improvement
Although the Convolutional Neural Network (CNN) model shows promise in
pneumonia detection from chest X-ray images, there are important limitations
to consider for further improvement. Firstly, to enhance the model’s
generalizability across diverse patient populations and imaging conditions, it
is essential to augment the dataset’s size and diversity.
To build transparency and trust with medical professionals, integrating ex-
plainable AI methods is crucial [MKR21]. By providing interpretive insights
into the model’s decision-making process, clinicians can better understand and
trust the predictions.
In the context of deploying the model in real healthcare settings, conducting
clinical validation studies is vital. Collaborating with medical experts and con-
ducting prospective studies can validate the model’s effectiveness, safety, and
practicality for real-world use. Clinical validation is essential to ensure that the
model’s performance aligns with medical standards and guidelines.
In light of the rapid advancements in AI technology, addressing the ethical
implications of deploying AI models in healthcare becomes paramount. Several
concrete steps can be taken to ensure the responsible and ethical integration of
AI. Transparent algorithm development is essential, requiring clear documen-
tation of the model’s decision-making process to foster understanding among
medical professionals. To mitigate potential biases, robust strategies must be
implemented, accompanied by regular evaluation across diverse demographic
groups [SW22]. Incorporating these measures not only promotes the trustwor-
thy adoption of AI in healthcare but also fosters a collaborative environment
where AI augments and enhances the capabilities of medical professionals, ulti-
mately contributing to improved patient care.
By addressing these challenges and areas of improvement, a more powerful
and reliable AI-assisted tool for pneumonia detection can be developed, signifi-
cantly impacting healthcare outcomes and patient care.

5.3 Can such a model outperform medical professionals?


Despite the impressive potential of the CNN model for pneumonia classifica-
tion using chest X-ray images, it is essential to recognize that it cannot replace
the expertise of medical professionals. Doctors and radiologists bring exten-
sive knowledge, experience, and clinical judgment, considering various factors
to make accurate and comprehensive diagnoses, including patient history, symp-
toms, physical examination, and additional tests.
The CNN model serves as a valuable tool to support medical professionals
by providing an objective analysis of chest X-ray images. It can assist in early
detection and screening, potentially expediting the diagnostic process and re-
ducing complications. However, it should always complement and enhance the
expertise of medical professionals, rather than replace their knowledge and clin-
ical judgment. Ultimately, the integration of AI in healthcare aims to empower
medical professionals and improve patient care while respecting the central role
of human expertise in diagnosis and treatment.

6 Conclusion
In the context of this data science project, a Convolutional Neural Network
(CNN) model was developed for pneumonia detection using chest X-ray im-
ages. Capitalizing on the capabilities of deep learning and advanced image clas-
sification techniques, this model exhibits substantial potential to aid medical
professionals in promptly and accurately identifying pneumonia cases.
Throughout this research, fundamental machine learning concepts were ex-
plored, encompassing supervised learning and binary classification. Through the
application of binary classification within the framework of supervised learning,
a robust and automated model was crafted, capable of effectively discerning be-
tween pneumonia and normal (non-pneumonia) conditions within chest X-ray
images.
The methodology included evaluating the model’s accuracy on both the
training and validation datasets, demonstrating the effectiveness of my ap-
proach. The model achieves high accuracy on both datasets, indicating its
ability to learn from labeled data and generalize to new instances, instilling
confidence in its practical usability.
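The comparison of training and validation accuracy described above can be
sketched as follows. This is an illustrative example, not the code used in this
project: the helper names and the toy label lists are assumptions, and the gap
between the two accuracies is shown only as one simple indicator of how well a
model generalizes.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def generalization_gap(train_acc, val_acc):
    """Training accuracy minus validation accuracy; a large positive gap suggests overfitting."""
    return train_acc - val_acc


# Hypothetical labels: 1 = pneumonia, 0 = normal.
train_true, train_pred = [1, 1, 0, 0, 1], [1, 1, 0, 0, 1]
val_true, val_pred = [1, 0, 1, 0], [1, 0, 0, 0]

train_acc = accuracy(train_true, train_pred)
val_acc = accuracy(val_true, val_pred)
print(train_acc, val_acc, generalization_gap(train_acc, val_acc))
```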
While the CNN model showcases impressive performance, it was emphasized
that it is not intended to replace the expertise of medical professionals. The
expertise, experience, and clinical judgment of healthcare experts are irreplace-
able, and the model serves as an aid to complement their skills.
In conclusion, the CNN model for pneumonia detection represents a signifi-
cant advancement in the field of medical imaging. By fusing AI technology with
medical expertise, early and accurate pneumonia diagnosis can be achieved,
leading to improved healthcare outcomes and ultimately, saving lives.

7 Acknowledgements
I extend my sincere gratitude to Professor Guillermo Goldsztein for their exem-
plary mentorship and scholarly guidance throughout the course of this research.
Their insightful feedback and dedication to academic excellence have signifi-
cantly enriched the quality of this work.
I would also like to express my appreciation to Davida Kollmar and the Hori-
zon Academic Research Program for their invaluable support. The constructive
feedback and resources provided by this program have greatly contributed to
the refinement of this research.

References
[GWK+ 18] Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir
Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Gang Wang,
Jianfei Cai, and Tsuhan Chen. Recent advances in convolutional
neural networks. Pattern Recognition, 2018.

[Int] An introduction to machine learning.
https://monkeylearn.com/machine-learning/. Accessed on 30 July 2023.

[Jap01] Nathalie Japkowicz. Concept-learning in the presence of between-class
and within-class imbalances. pages 67–77, Berlin, Heidelberg,
2001. Springer Berlin Heidelberg.

[KC21] Lingzhi Kong and Jinyong Cheng. Based on improved deep convo-
lutional neural network model pneumonia image classification. PloS
one, 16(11):e0258804, 2021.

[Kro08] Anders Krogh. What are artificial neural networks? Nature biotech-
nology, 26(2):195–197, 2008.

[KZG18] Daniel Kermany, Kang Zhang, and Michael Goldbaum. Labeled
optical coherence tomography (oct) and chest x-ray images for
classification (2018). Mendeley Data, v2
https://doi.org/10.17632/rscbjbr9sj
https://nihcc.app.box.com/v/ChestXray-NIHCC, 2018.

[Liu17] Danqing Liu. A practical guide to relu. start using and understand-
ing relu. . . — by danqing liu. https://medium.com/@danqing/
a-practical-guide-to-relu-b83ca804f1f7, 2017. Accessed on
30 July 2023.

[LLQ19] Jian Li, Xuanyuan Luo, and Mingda Qiao. On generalization error
bounds of noisy gradient methods for non-convex learning. arXiv
preprint arXiv:1902.00621, 2019.

[Mar] Brendan Martin. Binary classification – learndatasci.
https://www.learndatasci.com/glossary/binary-classification/.
Accessed on 30 July 2023.

[MKR21] Aniek F Markus, Jan A Kors, and Peter R Rijnbeek. The role
of explainability in creating trustworthy artificial intelligence for
health care: a comprehensive survey of the terminology, design
choices, and evaluation strategies. Journal of biomedical informatics,
113:103655, 2021.

[Pan19] Ayush Pant. Introduction to logistic regression. Towards Data
Science, 22, 2019.

[Pol23] Lurvı̈sh Polodoo. Pneumonia detection using cnn.
https://www.kaggle.com/code/lurvish12/
pneumonia-detection-using-cnn, 2023. Accessed on 5 Au-
gust 2023.

[RAS20] Andrinandrasana David Rasamoelina, Fouzia Adjailia, and Peter
Sinčák. A review of activation function for artificial neural net-
work. In 2020 IEEE 18th World Symposium on Applied Machine
Intelligence and Informatics (SAMI), pages 281–286. IEEE, 2020.

[Sez23] Emre Sezgin. Artificial intelligence in healthcare: Complementing,
not replacing, doctors and healthcare providers. Digital Health,
9:20552076231186520, 2023.

[Sof] TIBCO Software. What is a neural network? Accessed on 30 July
2023.

[SW22] Haytham Siala and Yichuan Wang. Shifting artificial intelligence to
be responsible in healthcare: A systematic review. Social Science
& Medicine, 296:114782, 2022.

[Wor22] World Health Organization (WHO). Pneumonia in children, 2022.
Accessed on 30 July 2023.

The Effects of Classical Music Intervention on the
Neuropsychiatric and Cognitive Mechanisms of
Alzheimer’s Disease Patients

Nyneishia Janarthanan
October 17, 2023

Abstract
Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder that
poses a profound challenge to both neuropsychiatric and cognitive
well-being. AD is the sixth leading cause of death in the United States and
currently has no cure. This limitation draws attention to both the
pharmacological and nonpharmacological therapeutic interventions that
could be incorporated into an AD patient’s course of treatment. Among
these is the promise of classical music as a nonpharmacological
intervention for AD patients. Examining the relationship between classical
music and the neuropsychiatric and cognitive mechanisms of AD reveals
the effects of classical music on memory, spatial reasoning, depression,
sleep disorders, and other AD symptoms. Concepts such as the Mozart
effect offer a source of hope for improving the quality of life of individuals
diagnosed with AD. Moreover, the activation of the brain and the alteration
of various brain structures give rise to the diverse effects of classical
music in healthcare and neurological settings.

∗ Advised by: Professor Arij Daou of the University of Chicago

1 Introduction
Classical music traces its origins to the middle of the 18th century, serving
not only as an art form but also as a wordless language of its own. Its
purpose has since evolved in many ways: music as a pain reliever, the
beneficial role of music in exercise and sport, musical leisure activities in
aging rehabilitation, and more. However, one field that often remains
overlooked in this regard is the effect of classical music on the symptoms of
Alzheimer’s Disease (AD). Even though neuroscience remains a highly studied
and researched field, many unanswered questions surround music’s effect on the
brain and limbic system of an AD patient. Alzheimer’s – a neurological disease
resulting from neuronal degeneration – ranks among the leading causes of death
worldwide. Its hallmark symptoms include memory loss, cognitive decline,
disorientation, aggression, depression, and an inability to perform everyday
tasks. The number of people affected by this disorder is steadily increasing
and is projected to rise sharply in the coming decades.

Figure 1: Worldwide Projections of AD Prevalence, 2005-2050 [D.16].

In 2020, over 40 million individuals worldwide and nearly 6 million Americans
were living with AD [D.16]. These staggering figures demonstrate not just the
national threat but also the global scale and distressing impact of this
disease [C.18a]. Offering a glimmer of hope amid the bleak landscape of
cognitive decline, research indicates that classical music acts as a medium
that transcends the memory loss associated with AD. In fact, classical music
therapy functions as a treatment modality by improving learning,
communication, mobility, and other mental and physical functions [J.17]. These
abilities are all severely affected over the course of AD and other forms of
dementia. This research paper aims to highlight the benefits of applying
classical music interventions to improve an AD patient’s quality of life.
Moreover, such an intervention has the potential to produce meaningful
results during the progression and treatment of AD.

2 Alzheimer’s Disease
Alzheimer’s Disease (AD) – currently ranked as the sixth leading cause of death
in the United States – is a progressive neurodegenerative disorder. It lacks a
definitive cure and primarily impairs cognitive functions such as memory,
behavior, and thinking. AD is the most prevalent form of dementia, which is
characterized by a gradual decline in two or more domains of cognition such as
memory, language, behavior, and executive function [A.18]. Neuritic plaques and
neurofibrillary tangles are AD’s hallmark indications, the kind of markers that
are measured throughout the progression and regression of a disease [R.20].
The condition was first comprehensively described by Alois Alzheimer in 1906
as a “peculiar severe disease process of the cerebral cortex”. Healthcare costs
for AD are estimated at approximately $500 billion yearly, covering everything
from treatments to routine checkups.

2.1 Etiology of Alzheimer’s Disease


The etiological pathway – or the set of common causes of AD – divides into
genetic and environmental factors. The predominant genetic risk factors for AD
discussed here are age, presenilin mutations, Down Syndrome or Trisomy 21, and
sex. A genetic factor is defined as a component that increases the likelihood
of developing a particular disease depending on an individual’s genetic
makeup. Unquestionably, the greatest of these risk factors is advanced age,
typically after the age of 65 [J.15].

Figure 2: Projected Number of People Aged 65 or Older With Late-Onset
Alzheimer’s Disease, by Age Group, US, 2010-2050 [A18A 182A 18 ].

Late-onset or sporadic Alzheimer’s is the most common type of AD; signs
typically begin to appear in a person’s mid-60s or later. Figure 2 shows the
projected prevalence of late-onset AD after the age of 65. The condition is
relatively frequent and is projected to become more common in the near future.
On the other hand, early-onset or familial AD is relatively rare and is usually
caused by gene changes passed down from a parent to their child. Signs first
appear between an individual’s 30s and mid-60s. In familial AD, nearly half of
the cases are due to mutations in three genes: amyloid precursor protein (APP),
Presenilin-1 (PSEN1), and Presenilin-2 (PSEN2) [J.15]. Presenilin (PSEN)
mutations are discussed in more detail later in this section. Findings from
early-onset familial cases are expected to inform the understanding of
sporadic (no specific family link) late-onset AD.
Aging is the main risk factor for AD, yet it cannot be fully explained by the
popular amyloid hypothesis, which holds that amyloid-beta plaques are the
central feature of the disease. An alternate perspective links aging and AD
through the APOE ε4 allele, which remains the most robust genetic risk factor
for sporadic AD. The risk conferred by APOE ε4 is mostly observed in the 61-65
age group, which is consistent with the observation that symptoms of late-onset
AD first appear around 65 years of age [R.97]. One copy of APOE ε4 is carried
by approximately 25% of individuals, but inheriting this allele does not mean
that a person will necessarily develop AD [212 1702 1 ]. The APOE ε4, ε3, and
ε2 alleles all play a role in the onset and progression of AD, but APOE ε4
greatly increases the risk of the disease compared with its counterparts
[G.22]. Moreover, the pathogenesis of the APOE genotypes has been researched
well beyond amyloid-beta plaques and Tau neurofibrillary tangles, providing
potential answers to the age-related progression of AD [T.21]. First, APOE ε4
is associated not only with AD but also with other conditions such as
age-related cognitive decline and Lewy Body Dementia (LBD) [G.22]. Second,
the APOE Cascade Hypothesis connects aging to an increased risk of AD by
stating that the biochemical and biophysical characteristics of APOE ε4 at the
cellular level cause a multitude of downstream effects observed in AD [T.21].

Figure 3: APOE ε4 Cascade Hypothesis Demonstrated Through 4 Phases [G.22].

The cascade – or the successive progression of APOE ε4 effects – begins at the
biochemical and cellular phase, as shown in Figure 3. Properties of the allele
such as lipidation and receptor binding have harmful effects on some cell
processes, which can accumulate into cellular stress and eventually lead to the
onset of age-related cognitive decline and AD. Aging and AD are interconnected
but distinct in nature. The number of neurons does not change severely in
normal aging, whereas neuronal and synapse loss is a key indication of AD.
Nevertheless, the link between aging and the increased risk of developing AD
with the APOE ε4 allele is a heavily researched topic within the field and
appears to hold a major connection to sporadic AD.
Another widely discussed genetic factor for AD is mutation of the PSEN1 gene,
which encodes the Presenilin-1 (PS1) protein. In early-onset or familial
Alzheimer’s Disease (FAD), PSEN1 mutations account for nearly 90% of all
recorded mutations, illustrating the significance of this gene and its protein
product [rSJ17]. The presenilin hypothesis proposes that these deleterious
mutations reduce essential presenilin functions in the brain, triggering both
neurodegeneration and dementia in FAD [rSJ17]. A related component is the
γ-secretase enzyme, whose catalytic subunit is PS1. More specifically,
γ-secretase cleaves different types of transmembrane proteins in the plasma
membrane of a cell, including the amyloid precursor protein (APP), a central
element of AD. γ-Secretase produces two types of amyloid-beta (Aβ) peptides in
AD: Aβ42 and Aβ40. The only difference between the two is that Aβ42 has two
extra residues at its C-terminus [Z.13]. It has been proposed that Aβ42 and
Aβ40 are heavily responsible for AD since they accumulate into one of the
hallmark pathological indications of this disease: Aβ plaques. However, PSEN1
mutations did not increase the levels of both peptides; instead, both
decreased (especially Aβ40), which elevated the Aβ42/Aβ40 ratio [rSJ17]. This
Aβ42/Aβ40 ratio is a useful diagnostic marker of AD, since the ways in which
PSEN1 mutations affect APP and Aβ plaques are complex and not yet fully
understood [C.06]. γ-Secretase clearly produces the final Aβ peptides involved
in AD, but it also takes part in Notch signaling, which regulates cell
proliferation, cell fate, differentiation, and cell death [R.12]. Therefore,
pharmacological interventions such as drug therapy attempt to alter Aβ
production without interfering with γ-secretase’s role in Notch signaling
[S.12].
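A small worked example may help show why the ratio of the 42-residue to the
40-residue amyloid-beta peptide can rise even when the levels of both peptides
fall. The concentrations below are hypothetical values in arbitrary units,
chosen only for illustration and not taken from [rSJ17]; the function name is
likewise an assumption.

```python
def abeta_ratio(abeta42, abeta40):
    """Ratio of the 42-residue to the 40-residue amyloid-beta peptide level."""
    if abeta40 <= 0:
        raise ValueError("the 40-residue peptide level must be positive")
    return abeta42 / abeta40


# Hypothetical levels in arbitrary units (illustration only).
baseline_ratio = abeta_ratio(abeta42=10.0, abeta40=100.0)
# Both peptides decrease, but the 40-residue form decreases much more.
mutant_ratio = abeta_ratio(abeta42=8.0, abeta40=40.0)

print(baseline_ratio, mutant_ratio)
```

In this toy case the 42-residue peptide falls by 20% while the 40-residue
peptide falls by 60%, so the ratio doubles even though neither peptide
increased.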
The next genetic factor for AD is Down Syndrome (DS) or Trisomy 21: a genetic
disorder caused by the presence of an extra copy of Chromosome 21 or a part of
it. It is characterized by craniofacial abnormalities, heart defects, cognitive
impairments, and neurological alterations [M.20]. With over 200,000 cases in
the United States alone, DS is one of the leading genetic risk factors for FAD.
Furthermore, clinical and biomarker changes in DS-associated FAD demonstrate
that many of the same cortical regions, such as the hippocampus and the
prefrontal cortex, are affected in both conditions [M.21a]. As they progress,
both conditions share similar cellular dysfunctions such as impaired
autophagy, reduced and/or damaged lysosomal activity, and mitochondrial
dysfunction [M.20].

Figure 4: Pathological Indications of DS Common in AD [E.19].

By the age of 40, NFT and Aβ accumulation are present in the brains of
individuals with DS, which is sufficient to confirm a pathological diagnosis
of AD [C.18b]. Figure 4 shows that progressive brain inflammation can emerge
as early as the late teenage years in DS, based on recorded intracellular
accumulations of Aβ. The early appearance of AD’s hallmark indications in
individuals with DS can be explained by the presence of neuron-derived
exosomes, which are tiny extracellular vesicles that contain elevated levels
of both Aβ peptides and the hyperphosphorylated Tau protein [C.18b]. Since
exosomes can serve as blood biomarkers, their progression and development can
be monitored, which informs future AD diagnostics, prevention, and potential
treatments in the DS population as well as the general population.
Finally, sex is an important risk factor for AD, with almost two thirds of the
late-onset AD population being women [dLMJBRD18]. This difference cannot simply
be attributed to women living longer than men, because AD pathology starts
many years before most clinical symptoms appear [L.18]. Instead, there is
increasing evidence that the perimenopause to menopause transition (PTMT) – a
midlife neuroendocrine transition specific to women – is heavily responsible
for the sex differences in the pathophysiological mechanisms underlying AD
[dLMJBRD18]. PTMT is strongly neurological in nature; it disrupts and alters
the systems regulating estrogen and impacts thermoregulation, circadian
rhythm, sleep, depression, and even cognition [dLMJBRD18]. During PTMT,
estrogen, progesterone, pituitary, hypothalamic, and ovarian hormone levels
fluctuate and decrease. Estrogen, specifically, acts in many areas of the
brain that control memory and cognitive function, indicating its neurological
significance [E.18]. When the brain’s estrogen network disconnects from other
brain areas, the resulting hypometabolic state sets the stage for neurological
dysfunction [L.18]. In fact, perimenopausal (PERI) and postmenopausal (MENO)
women show major declines in estrogen-dependent memory tests compared to men,
which is the first indication that PTMT can trigger cognitive decline in the
female population [dLMJBRD18]. Second, the MENO and PERI groups showed greater
declines in the cerebral metabolic rate of glucose (CMRglc) than males and
premenopausal (PRE) women. Glucose is necessary to provide the precursors for
neurotransmitter synthesis and to fuel the production of adenosine
triphosphate (ATP), the main energy carrier at the cellular level [A.13]. With
a noticeable decrease in glucose metabolism, the neurological workings of the
body are severely disrupted in PERI and MENO individuals. In essence,
decreased estrogen levels and the deterioration of the pathway that affects
CMRglc help explain the higher percentage of women developing AD.
In addition to genetic risk factors, various environmental predisposing
factors contribute to AD, such as Type 2 diabetes (T2D), also called Type 2
diabetes mellitus (T2DM), obesity, and cerebrovascular disease.
First, the interplay between diabetes, obesity, and AD highlights the complex
relationship between lifestyle factors and the risk of cognitive decline.
Obesity is characterized by an excessive accumulation of body fat, which is
commonly assessed using the Body Mass Index (BMI). Obesity can, in turn,
trigger the development of T2D, and the risk of acquiring this disease grows
linearly with an increase in BMI [E.22b]. T2D – representing 90-95% of
diabetic cases – is a metabolic disease characterized by chronic hyperglycemia
due to pancreatic cell failure [C.21]. By itself, hyperglycemia, or high blood
glucose, can contribute to molecular, biochemical, and histopathological
lesions in AD [ML14]. Yet the main focus when researching the connection
between T2D and AD is insulin resistance: the body’s reduced responsiveness to
the hormone insulin, which results in increased blood sugar. The hyperglycemic
state of T2D patients due to insulin resistance disturbs neuronal homeostasis
and K-ATP channels, which increases Aβ peptide levels [C.21]. In addition,
elevated and dysregulated blood glucose drives unregulated non-enzymatic
reactions between sugars and the free amino groups (-NH2) of proteins, lipids,
and nucleic acids, which produce advanced glycation end-products (AGEs)
[C.21]. High levels of AGEs elicit inflammatory reactions in the brain and are
associated with poorer memory and higher hippocampal levels of insoluble Aβ42
[M.16b]. AGEs promote Aβ plaque and neurofibrillary tangle formation more in
AD patients with T2D than in non-diabetic AD patients [C.21]. The two main
hallmark indications of AD, Aβ plaques and neurofibrillary tangles, are
discussed in detail in the pathology section.
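For reference, BMI is computed as body weight in kilograms divided by the
square of height in meters. The sketch below illustrates the calculation; the
function names are hypothetical, and the obesity cutoff of 30 kg/m² follows the
commonly used WHO adult classification rather than anything stated in the
sources cited here.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi_value, threshold=30.0):
    """True if BMI meets the cutoff (30 kg/m^2 is an assumed WHO adult threshold)."""
    return bmi_value >= threshold


# Hypothetical adults of the same height (1.75 m) with different body weights.
heavier = bmi(95.0, 1.75)
lighter = bmi(70.0, 1.75)
print(round(heavier, 1), is_obese(heavier))
print(round(lighter, 1), is_obese(lighter))
```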
Another environmental risk factor for AD is cerebrovascular disease (CVD) – a
type of cardiovascular disease that harms the blood vessels supplying the
brain. CVD is the most frequent type of life-threatening injury to the brain
and is the fifth most common cause of death. CVD and AD share many of the same
risk factors, such as the APOE ε4 allele, T2DM, obesity, and age [S.16]. These
include several of the genetic and environmental risk factors of AD discussed
previously, which suggests that CVD’s origins are pathologically and
environmentally similar to those of AD. In AD, Aβ plaques accumulate in the
extracellular space around neurons; in CVD patients, Aβ builds up in the
cerebral arterioles and capillaries supplying the brain. Most AD patients have
Aβ angiopathy resulting from CVD, which predominantly affects the cerebral
leptomeninges, cortex, cerebellum, and brain stem [S.16]. Capillary Aβ
angiopathy is detected in roughly 35-45% of AD cases, which supports the
hypothesis that CVD can contribute to the symptoms of AD through the
synergistic relationship between the two diseases [G.21].

2.2 Pathology of Alzheimer’s Disease


The principal pathological indications of AD include amyloid-beta (Aβ) plaques,
neurofibrillary tangles containing aggregated Tau protein, neuroinflammation,
and oxidative stress [S.18]. All neurodegenerative diseases involve the
eventual degradation of neurons in the brain, and this trend is especially
evident in the progression of AD. The exact pathway by which neuronal death
occurs remains unclear, although many theories address this question.
Apoptosis, the process of programmed cell death that eliminates unwanted
cells, is the most extensively studied mechanism of neuronal loss in AD. Yet
the Aβ peptide thought to drive neuronal apoptosis is not recorded in many
post-mortem tissue specimens of AD patients [R.22]. Even though other
explanations for neuronal death have been researched, such as necrosis,
necroptosis, and pyroptosis, the pathological mechanisms underlying neuronal
death and dysfunction in AD continue to elude full comprehension. Nevertheless,
the loss of neurons undoubtedly constitutes the basis of progression for this
disease. Regarding neuronal loss, considerable attention has focused on
cholinergic neurons, which release the neurotransmitter acetylcholine (ACh), a
molecule with a crucial role in both the peripheral and central nervous
systems [M.16a].

Figure 5: The Cholinergic Hypothesis & Release of ACh [J.03]

The early AD neuron in Figure 5 exhibits alterations in choline uptake,
impaired ACh release, deficits in the nicotinic (nAChR) and muscarinic
(M2AChR and M1AChR) receptors and their functions, and deficits in axonal
transport. The decrease in the number of symbols and the reduced color
intensity in the legend illustrate that many proteins, enzymes, and essential
cell structures in AD turn defective or are missing altogether, which
ultimately kills the neuron. Nearly all brain regions are innervated by
cholinergic neurons, which are responsible for processes related to learning,
memory, and attention. The progressive degeneration of the basal forebrain
cholinergic neurons (BFCNs), for instance, is correlated with the harmful
symptoms and memory deficits that AD imposes on a patient [M.16a]. In fact,
BFCNs provide the main cholinergic input to the prefrontal cortices along with
other crucial structures such as the amygdala (involved in evoking emotions)
and the hippocampus, which plays a significant role in long-term memory
function and memory consolidation [M.22]. Pathologically, there are major
depletions of the cholinergic synthetic enzyme choline acetyltransferase
(ChAT) and the cholinergic hydrolytic enzyme acetylcholinesterase (AChE) in
and around BFCNs [MM21]. ChAT synthesizes ACh, whereas AChE breaks ACh down
into its component parts. Together, these enzymes and the ACh neurotransmitter
play an essential role in the nervous system by regulating cell signaling and
supporting effective communication among neighboring neurons. A dramatic loss
of ChAT and AChE activity in a considerable number of AD cases supports the
claim that BFCN degeneration is a foundation of the cholinergic theory of AD.
The more prominent hallmarks of AD, which contribute to the loss of neurons
and their synapses, are the extracellular neuritic plaques containing the
amyloid-beta (Aβ) protein and the intracellular neurofibrillary tangles (NFTs)
carrying the hyperphosphorylated Tau protein.

Figure 6: Aβ plaques and NFTs in Healthy vs. Affected Neuronal Cavity
[A15A 153A 15 ]

In a healthy neuron, there are almost no Aβ plaques visible outside the cell,
and no NFTs are present either. This allows the cell to communicate
effectively with surrounding neurons and to function normally internally. In a
brain with Alzheimer’s, by contrast, Aβ plaques surround the neurons, and NFTs
are present in the soma (cell body) of the neuron. As a result, neuronal
function is disrupted, which eventually contributes to the cell’s death.

Figure 7: A schema of amyloid precursor protein (APP) cleaved to form Aβ
plaques [O.14]

It is important to note that Aβ is a regular peptide produced in the body, but
Aβ plaques specifically are a neuropathological hallmark of AD [O.14]. The
amyloid precursor protein (APP), located in a cell’s plasma membrane,
undergoes cleavage by β-secretase and γ-secretase to produce insoluble Aβ
fibrils. All mutations in APP linked to AD are within the Aβ peptide,
β-cleavage, or γ-cleavage sites. As seen in Figure 7, APP is cleaved by
β-secretase to form C-terminal fragment β (β-CTF), which is then cleaved by
γ-secretase to produce Aβ, the prime component of the Aβ plaques seen in
Alzheimer’s. In the non-amyloidogenic pathway, which is considered less toxic
than the amyloidogenic pathway, α-secretase creates less-aggregated forms of
Aβ that are less likely to form extracellular plaques. Which enzyme performs
the cleavage, and where it occurs, determines whether the resulting Aβ protein
is toxic or non-toxic. In AD, toxic Aβ fibrils diffuse into the synaptic
clefts and interfere with the necessary cell signaling [M.19b]. Because these
fibrils are insoluble in water, they aggregate into the plaques that are
present in AD. Furthermore, the polymerization of the Aβ protein into long
strands activates a group of enzymes named kinases, which hyperphosphorylate
the microtubule-associated Tau protein. The gradual build-up of
hyperphosphorylated Tau eventually leads to the formation of intracellular
NFTs that disrupt neural communication. The link between Aβ plaques and the
aggregation of Tau has dominated AD research as the “Amyloid Cascade
Hypothesis”, which states that the abnormal build-up of Aβ is the primary
event in AD, triggering Tau pathology and subsequently neuronal death [O.18].
Another consequence of the accumulation of Aβ plaques and Tau-associated NFTs
is the increased production of reactive oxygen species (ROS). Since oxygen is
a highly electronegative element, it readily accepts free-flowing electrons
generated by mitochondrial oxidative metabolism, which produces ROS. These
species range from radical anions such as superoxide, which carry an unpaired
valence electron, to hydrogen peroxide. Radical species readily accept an
electron to reach a stable state with a full octet in their outer shell. Until
they reach stability, however, ROS at high concentrations react almost
immediately with the four major classes of macromolecules in the body: lipids,
proteins, carbohydrates, and nucleic acids [KH12]. On one hand, ROS are
crucial in physiological processes such as redox regulation and transcription
of DNA. On the other hand, they may induce undesirable and even irreversible
outcomes such as the aggravation of Alzheimer’s.

Figure 8: Levels of ROS and the Resulting Effect [KH12]

When ROS levels fall substantially below or rise substantially above their
optimal range, there can be dangerous consequences, such as a lack of
signaling when they increase or overshoot signaling (a signal that exceeds its
target) when they decrease. In the context of AD, increased production of ROS
through the interplay between Aβ plaques and NFTs raises the risk of oxidative
stress: a condition caused by an imbalance of ROS in cells and tissues that
overwhelms the body’s ability to detoxify these reactive substances [A.17].
Elevated levels of ROS leading to oxidative stress can exacerbate age- and
disease-dependent mitochondrial dysfunction, reduce antioxidant defences
around synaptic activity, and disrupt neuronal cell signaling, ultimately
leading to cognitive dysfunction [E.17]. Even in the absence of ROS, Aβ
plaques and high levels of Tau can independently worsen mitochondrial
dysfunction and interrupt cell communication, which contributes to a vicious
cycle that disrupts homeostatic control.
The intricate pathology of AD, in summary, is marked by the hallmark features
of neuronal and synaptic loss, Aβ plaques, NFTs, ROS, and oxidative stress.
Together, these interconnected factors contribute to the cognitive and
behavioral decline associated with AD. Ongoing research continues to shed
light on novel aspects of AD pathology, which holds promise for lessening the
impact of this neurological disorder.

2.3 Neuropsychiatric and Cognitive Symptoms of AD


As Alzheimer’s progresses, the burden of both neuropsychiatric and cognitive
symptoms can pose significant challenges for both AD patients and caregivers.
Neuropsychiatric symptoms (NPS) are non-cognitive disturbances that pertain
to mental health defects and psychiatric disorders. NPS are connected to both
neurological (brain-related) and psychiatric (mental-health-related) aspects.
Cognitive symptoms, on the other hand, are impairments in the brain processes
involving thinking, learning, problem-solving, understanding, and
decision-making. Memory, attention, language, executive functions,
perception, and reasoning are just a few concepts that fall under the umbrella
of cognitive processes. The most common forms of NPS in AD are apathy,
depression, aggression/agitation, and sleep disorders.
To begin, apathy is the most persistent and frequent NPS recorded across all
stages of AD [S.11]. This symptom is characterized by a lack of motivation for
goal-directed actions and cognitive activity [G.06]. In AD, the primary psychi-
atric correlate of apathy is depression, but depression is neither necessary nor
sufficient to produce apathy; 94% of AD patients with some form of depression
had no apathy [G.06]. When accounting for other NPS, apathy is difficult to
isolate from the various symptoms of dementia, but there are a few neuroimag-
ing techniques that could generate some answers. Many studies hint towards
the involvement of prefrontal dysfunction and deficits within frontostriatal cir-
cuits [M.18]. Areas within these circuits such as the anterior cingulate cortex
(ACC), prefrontal cortex (PFC), and sections of the basal ganglia play a pivotal
role in the progression of apathy in AD [M.18].
Depression trails behind apathy as the second most commonly encountered
NPS in AD patients. Depression can be identified as an affective disorder
encompassing elements of despair, apathy, insomnia, unwillingness, anxiety,
incompetence, fear, gloominess, and sadness [C.19]. The additional stress
imposed by depression can severely diminish an AD patient’s quality of life.
In fact, depression occurs in roughly 20-30% of AD patients. Women with AD
are expected to experience a much higher rate of depression (almost twice that
of men), indicating that gender plays a role in the depressive state
of a patient [C.19]. It is important to note that cognitive dysfunction may
resemble depression, and the two are quite commonly confused with each other.
Even though it is difficult to reach a definitive diagnosis of depression in AD,
the two disorders influence each other through several overlapping pathological
features. For instance, aberrations in cholinergic transmission are present
in both AD and depression, which creates a pathophysiological intersection
[C.19]. As previously discussed, the cholinergic hypothesis of AD proposes
that the deterioration of cholinergic neurons, which release the neurotransmitter
acetylcholine (ACh), is primarily responsible for memory loss and learning
deficits. In addition, a role for the cholinergic system in triggering depression
was suggested in clinical studies nearly 50 years ago [S.19a]. With respect
to changes in the cholinergic system, the hippocampal region is the crossroads
where cognitive deficits meet depressive symptoms [E.22a].

Figure 9: Cholinergic Alterations in Depression and AD [C.19].

Decreases in the cholinergic innervations of the brain reduce hippocampal
neurogenesis and function, which can result in depression. The chances of this
occurring, however, are much higher for an AD patient than for the general
population.
Aggression/agitation in AD often stems from communication difficulties,
which can be addressed with appropriate interventions. Aggression exists in a
wide range of forms, including defensive (fear-induced), predatory, dominance,
inter-male, maternal, isolation-induced, and irritability-associated aggression
[EI.17]. In order to alleviate these types of symptoms in an ethical manner,
doctors and various healthcare providers advise medication use and physical
restraint. However, antipsychotic medications are associated with adverse side
effects such as an increased stroke risk, and physically restraining an AD
patient induces negative psychological and physical effects [S.19b]. Even
though the molecular mechanisms of functional and pathological agitation in AD
remain incompletely understood, promising research is ongoing. One example is
the characterization of the broad spectrum of brain-abundant neurotransmitters
that appear to have both triggering and preventing effects on aggressiveness
[EI.17].
Lastly, sleep disorders are typical symptoms of AD that appear early on in
the disease, which is quite distinct from other symptoms such as depression and
apathy that appear later on. There is a wealth of evidence available to support
the claim that sleep quality and duration are crucial to consolidate memory
for future retrieval and to remove the build-up of AB and hyperphosphorylated
Tau in AD patients’ brains [A.20]. Individuals with AD who struggle to keep a
normal sleep schedule exhibit a major alteration in the sleep/wake cycle, with
a growth in the number of nighttime awakenings and an increase in disturbances
of nocturnal sleep [A.20]. Sleep disorders such as sleep breathing disorders
and restless leg syndrome negatively alter circadian fluctuations of AB in the
interstitial brain fluid and cerebrospinal fluid (CSF) related to the production
of AB plaques [G.18]. Such sleep abnormalities evoke the increased production
of the pathological Tau protein and AB plaques. To help AD patients adopt
healthy sleep patterns, specific procedures to improve sleep structure and
quality are under continual development. There is a plethora of NPS associated
with AD, and extensive, promising research is being done on alleviating these
symptoms.
In contrast to NPS, the major cognitive symptom in Alzheimer’s is memory loss.
If asked to name a disease that affects memory, most doctors would probably
choose Alzheimer’s. The six major memory systems are episodic memory, semantic
memory, simple classical conditioning, procedural memory, working memory, and
priming. Of these, the deterioration of episodic memory is the most clinically
prominent cognitive symptom in AD [E.08]. Episodic memory is utilized when
consciously recalling a particular episode in one’s life, such as watching a movie
with a family member. Dangers arise from a loss in episodic memory when
AD patients forget whether certain medications have been taken or whether the
stove has been turned off [E.08]. Working memory and long-term, explicit memory
are impacted early in the course of this disease [H.13]. The first brain lesions
unique to AD appear in the poorly myelinated limbic neurons in areas affecting
memory, such as the hippocampus. For instance, hippocampal volume falls
from 2.5 mL to almost 1.6 mL in the brain of an AD patient as the
disease progresses [H.13]. Even with the inclusion of various AD criteria and
hippocampal biomarkers, there remain several barriers in neurological testing
for memory loss in AD.
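
To put the hippocampal volumes quoted from [H.13] in relative terms, the short
Python sketch below computes the percentage reduction. The function name
`percent_reduction` is an illustrative assumption and does not come from the
cited work; the two inputs are simply the approximate volumes given above.

```python
def percent_reduction(baseline_ml: float, reduced_ml: float) -> float:
    """Return the relative loss from baseline, expressed as a percentage."""
    if baseline_ml <= 0:
        raise ValueError("baseline volume must be positive")
    return (baseline_ml - reduced_ml) / baseline_ml * 100.0


# Approximate volumes quoted in the text from [H.13]: 2.5 mL falling to 1.6 mL.
loss = percent_reduction(2.5, 1.6)
print(f"Approximate hippocampal volume loss: {loss:.0f}%")
```

With the quoted values this works out to a loss of roughly 36% of the baseline
volume. Because the text gives "almost 1.6 mL", the figure should be read as an
approximation.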
Other cognitive symptoms of AD are impaired problem solving and language
abilities. Something as simple as following a recipe or even paying the bills can
become difficult as AD worsens. Language impairments reflect a decline in
sociolinguistic aspects such as grasping the meaning of words, fitting a word
or phrase to a situation, and word comprehension [K.15]. These two prevalent
cognitive symptoms appear early on in AD, so they are occasionally used
to help diagnose a patient with AD.

3 Music and the Brain


The exploration of non-pharmacological interventions for AD has garnered sig-
nificant attention in recent times; however, promising avenues have been un-
deremphasized or overlooked. Cognitive decline, NPS, and memory impairment
present a pressing necessity for a quick and efficient way to mitigate AD’s symp-
toms and risks. Amidst this scholarly discussion, the interplay between music
and AD has emerged as a potential candidate to enhance the quality of life of
an Alzheimer’s patient. This section solely focuses on the relationship between
music and the brain.

3.1 Observed Effects of Music on the Brain
The ability of humans to perceive and enjoy music is a universal trait that
originated centuries ago and is still carried with us. Music is one of the most
powerful and diverse sensory, cognitive, and emotional experiences [T.17]. Our
brains light up when interpreting and perceiving music, and our bodies respond
in several ways, reflecting the powerful connection between music and our well-
being. Not only does music reduce feelings of separation and loneliness, it
also evokes cherished memories and maintains self-esteem, competence, and
independence [T.17]. From a cognitive standpoint, music boosts communicative
abilities, memory, self and environmental presence, and verbal and non-verbal
expressions [M.21b]. The improved projection of all these skills originates from
the organ responsible for formulating the very essence of what a human being
is: the brain. Beyond encoding music, different parts of the brain are
engaged depending on the type of music relayed as nerve signals from the
auditory cortex. This concept is demonstrated below in Figure 10.

Figure 10: Brain Areas Engaged Based on Emotion Category of Music [P.19].

Figure 10 color-codes the brain areas engaged by each emotion category of
music: joyous music (red), tense music (yellow), and sad music (blue). Based
on statistically significant data, the type of musical input is allocated to
different parts of the brain connected by a bilateral fronto-parietal network
[P.19]. From a structural, cross-section view of the cerebral cortex, it is
clear that music can alter our perception and emotional response based on the
area it activates. For example, music with a fast tempo and a major mode tends
to evoke a positive/happy response, whereas a slow tempo induces a
negative/sad mood. From a functional standpoint, listening to music over time
increases the brain’s alpha waves, which are associated with relaxation and a
calm state of mind [V.18]. The brain’s alpha waves are also strongly linked to
human cognition and emotions, generating distinct physiological and
psychological effects on the body [V.18].
Additionally, music triggers the release of certain neurotransmitters, which in
turn evoke important emotions, memories, and feelings. For instance, dopamine
is released in the mesolimbic reward system while listening to music, which
increases the body’s natural reward sensation. Serotonin, a neurotransmitter
involved in mood regulation and learning, also increases in the presence of
auditory stimuli. Higher concentrations of dopamine and serotonin in the
caudate-putamen and nucleus accumbens (areas linked to reward and motor control)
signify that music has a direct impact on the synaptic activity of these brain
areas and on the amount of neurotransmitters released in a healthy manner.

3.2 Musical Activation of Various Brain Areas


Many structures are engaged in the process of musical activation, which is the
stimulation of the brain’s cerebral cortex into a state of alertness/attention.

Figure 11: Regression Analysis Correlation (rCBF) of Neuroanatomical Regions
Scanned After Exposure to Music [J.01].

With the help of Magnetic Resonance Imaging (MRI) scans, Figure 11
depicts various brain parts engaged due to musical activation. These include
the left dorsomedial midbrain (Mb), bilateral cerebellum (Cb), right thalamus
(Th), left ventral striatum (VStr), and the hippocampus/amygdala (H/Am).
The amygdala, especially, plays an important role in the body’s perception of
music. The amygdala is a small, almond-shaped structure in the brain’s limbic
system where the processing of emotions, fear, and aggression originates.
The amygdala’s gray matter volume (GMV) is directly correlated with an
individual’s melodic interval perception, which is the time taken to differentiate
two successive events [J.14]. In music, interval perception refers to how we
perceive the pitch gap between two notes played successively. This suggests that
an optimal amygdala GMV is integral to both perceiving and processing the
emotions generated by music. To further support this claim, people with a
decreased amygdala GMV displayed impaired emotional responses to music, such
as not recognizing sad or fearful music and not demonstrating a
sense of pleasure when listening to pleasant music [J.14].
Another structure of the limbic system is the hippocampus, which is also
activated by the presentation of music. This structure is fundamental in the
consolidation of both short-term and long-term memory and is involved in state
regulation, motivation, defensive behavior, and anxiety in response to specific
stimuli [D.06]. Typically, the hippocampus is associated with the processing
of unpleasant (permanently dissonant) music rather than pleasant (consonant)
music [D.06]. Just as some brain structures perceive components like
rhythm and pitch, the hippocampus recognizes harmonious sounds and
separates them from unstable and tense sounds.
In addition, further neuroimaging studies indicate that an overlap in musical
activation occurs in the superior temporal gyrus (STG), middle temporal gyrus,
middle frontal gyrus, parietal lobe, supplementary motor area, and premotor
cortex [X.19]. During music listening, both the left and right brain hemispheres
are activated, and the right temporal cortex is even involved in the perception
of pitch patterns [X.19]. Clearly, music processing is a neurologically
widespread process that involves many different structures in an attentive
brain.

3.3 The Difference Between Classical and Non-classical Music
There has been widespread interest in music’s effect on the brain, and there
are many reasons why music is considered nature’s own “medicinal
treatment”. Nevertheless, is this due to music as a whole or, more exactly, to a
particular type of music? This section will analyze the differences between
classical music and non-classical music and determine which one reaps the most
advantages for the brain and the body.
One concept that delves into classical music’s influence on the brain is the
Mozart effect. This phenomenon is observed as an improvement in certain
brain functions and abilities from repeatedly listening to classical pieces
composed by the popular 18th-century composer Wolfgang Amadeus Mozart. In a
study where subjects listened to Mozart’s Sonata for 2 Pianos, it was concluded
that participants demonstrated significantly greater spatial-reasoning skills
compared to periods of listening to relaxation instructions designed to decrease
blood pressure [S.01]. Spatial reasoning is the ability to comprehend spatial
relationships among objects and draw conclusions from them. Sonata for 2 Pianos
has also been reported to markedly increase relative alpha band power, a measure
of the brain’s alpha wave activity [R.18]. Once again, an increase in alpha
wave patterns is associated with an alert and relaxed state of mind.
Nevertheless, Mozart’s music is not the only form of classical music that
demonstrates these improvements in brain functions. In one study, participants
were exposed to either the Adagietto from Gustav Mahler’s Symphony No. 5,
white noise, or no music; semantic memory recollection was higher when listening
to the classical piece than with white noise or no music [E.14]. Semantic
memory is a type of long-term memory involved in remembering words, concepts,
and permanent knowledge such as languages. The mean values for the
cognitive task associated with semantic memory were 39.90 for the Mahler group
but 36.39 and 38.34 for the white noise and no music groups, respectively [E.14].
The higher value for the Mahler group suggests that classical music offers
benefits that white noise and silence do not. Furthermore,
music with a long-term periodicity, whether by Mozart or other classical
composers, resonates within the brain to enhance spatial-temporal performance and
even decrease seizure activity [S.01]. For instance, the compositions of
Greek-American musician Yanni, which resemble Mozart’s sonatas in tempo, melody,
harmony, and structure, were also effective and reproduced the same
improvements in cognitive abilities like reasoning and memory [S.01]. However,
the effects of music may not be dependent on a specific piece. Even though
robust evidence demonstrates classical music’s ability to improve cognitive skills
such as memory and learning, music that is personally liked by subjects also
turns out to enhance alpha wave and beta wave frequencies in the temporal brain
regions [R.18]. Therefore, non-classical music such as rock, pop, hip-hop,
rhythm and blues, and even jazz could potentially generate the same results,
although this would depend greatly on the preferences of the individual.
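
As a small worked example, the semantic-memory means reported for [E.14]
earlier in this section can be compared directly. The Python sketch below
computes how far the Mahler group's mean sits above each comparison group. The
helper name `gaps_vs` and the dictionary layout are illustrative assumptions;
only the three mean values come from the text.

```python
# Mean semantic-memory scores as quoted in the text from [E.14].
means = {"Mahler": 39.90, "white noise": 36.39, "no music": 38.34}


def gaps_vs(reference: str, scores: dict) -> dict:
    """Return the reference mean minus each other group's mean (2 decimals)."""
    ref = scores[reference]
    return {
        group: round(ref - mean, 2)
        for group, mean in scores.items()
        if group != reference
    }


gaps = gaps_vs("Mahler", means)
for group, gap in gaps.items():
    print(f"Mahler vs {group}: {gap:+.2f} points")
```

These are raw differences between group means only. The text does not report
standard deviations or sample sizes, so the sketch says nothing about
statistical significance.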

3.4 The Purpose of Music Therapy


Music Therapy (MT) is an art-based intervention which utilizes music
experiences in a therapeutic manner to address patients’ physical, cognitive,
emotional, and social needs [M.19a]. Participants and music therapists
interact within a structured framework, which culminates in conversations about
the patient’s emotions and/or experiences [C.17]. MT is known to alleviate a
patient’s symptoms and improve their quality of life, especially in fields of
health care such as neonatology, neurodegeneration, and pediatric oncology
[C.17]. Regarding the brain, MT improves mood, enhances memory-related
cognitive functions, and provides a sense of connection for patients
who may feel alone [L.23]. Moreover, MT is a gradual process, as results are not
observed in just a few days. After a few weeks of prolonged exposure, though,
a significant increase in the brain’s alpha waves indicates that participants
habituate and adapt to this novel form of art-based therapy [V.18].
As weeks go by, music-listening groups undergoing MT even experience more
joyous and relaxed emotions [V.18].
Even though effective treatments have not yet been developed for
neurodegenerative diseases such as AD, MT and other music-related
non-pharmacological therapies have garnered more attention as a method to boost
cognitive and behavioral functions. For one, MT induces plastic changes in a few
brain networks, a process known as neuroplasticity [L.23]. Neuroplasticity
involves adaptive structural and functional alterations to the brain.
This concept is of extreme importance since it allows the brain to reorganize
itself and change its activity in response to distinct stimuli, including after
injuries such as a stroke.
To summarize, the impact of music, especially classical music, on the brain is
remarkably pervasive. It engages numerous brain areas, triggers the release of a
greater quantity of neurotransmitters, and has the potential to reshape the brain
itself. Given music’s proven effectiveness in improving cognitive functions like
spatial reasoning and memory, it is imperative to explore and research whether
classical music, as a non-pharmacological intervention for AD patients, could
ameliorate the neuropsychiatric and cognitive symptoms of AD.

4 Alzheimer’s Disease and the Influence of Classical Music
AD is a formidable and deeply puzzling neurological disease that continues to
challenge medical research and potential therapeutic interventions. As
healthcare professionals and scientists strive to unravel the complexities of this
disorder, unconventional and non-pharmacological avenues for exploration have
emerged. Among these, classical music intervention in AD has captivated
the interest of many. Not only does this convergence prompt questions
regarding curative advantages, but it also invites study of the profound effect
classical music may have on NPS, cognitive symptoms, and the overall quality of
life for AD patients. In this section, the multifaceted interrelation between AD
and classical music will be discussed, seeking to uncover the ways in which
art may offer hope to AD patients.

4.1 Classical Music’s Effect on Neuropsychiatric Mechanisms of Alzheimer’s
To reiterate, NPS are defined as non-cognitive disturbances pertaining to men-
tal health defects and psychiatric disorders. Some of the NPS that were detailed
in this paper include apathy, depression, aggression/agitation, and sleep disor-
ders. In this section, classical music’s effect – especially Mozart’s effect – on the
neuropsychiatric mechanisms of AD will be discussed.
To begin, it is well known that episodes of aggression, confusion, and agitation
in elderly individuals with AD are significant problems for both the patients and
their caregivers. The soothing melodies and captivating rhythms of classical
music have demonstrated promise in lessening outbursts of anger and
frustration. In patients with Alzheimer’s Disease and other related dementias
(ADRD), cognitive impairment plays a significant role in triggering aggression.

Figure 12: Gerdner’s Mid-Range Theory of Music Intervention for Agitation
[L.05].

Figure 12 illustrates the concepts underlying Gerdner’s Mid-Range Theory,
which asserts that cognitive decline is key to elevating agitation. Cognitive
decline, the precursor to aggression, impairs the ability to perceive and
process sensory information, resulting in a lowered stress threshold and an
elevated anxiety potential [L.05]. Put simply, as AD progresses,
fewer stressors are necessary to meet the stress threshold, resulting in agitation.
When testing this popular theory, one research group evaluated the agitation
levels of AD patients after they listened and actively paid attention to
classical “relaxation” music for almost six weeks. The overall change in
agitation measures, words, and actions was recorded using the Modified
Cohen-Mansfield Agitation Inventory. The study concluded that classical music
intervention supports Gerdner’s theory, since a significant reduction in
agitation occurred, a reduction that was far smaller with other music types
[L.05].
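
The threshold idea described above, that fewer stressors are needed to trigger
agitation once the stress threshold has dropped, can be illustrated with a toy
calculation. The Python sketch below is not taken from [L.05]: it assumes, for
illustration only, that stressors add up cumulatively, and the stressor values
and thresholds are hypothetical, unitless numbers.

```python
from itertools import accumulate


def first_agitation_index(stressors, threshold):
    """Return the index at which cumulative stress first reaches the
    threshold, or None if the threshold is never reached."""
    for i, total in enumerate(accumulate(stressors)):
        if total >= threshold:
            return i
    return None


# Hypothetical, unitless values chosen only to illustrate the idea.
daily_stressors = [2, 1, 3, 2, 4]
for stage, threshold in [("earlier stage", 10), ("later stage", 5)]:
    idx = first_agitation_index(daily_stressors, threshold)
    print(f"{stage}: threshold {threshold} reached after {idx + 1} stressors")
```

With the same sequence of stressors, the lower (later-stage) threshold is
reached after fewer stressors than the higher one, which is the qualitative
pattern the theory describes.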
Another significant NPS of AD is depression, characterized by a deterioration
in emotional, behavioral, and social functions. When an arts-based
intervention such as concert classical music is applied, a wide range of
previously dormant feelings are induced in AD patients, who become more capable
of battling the hopelessness and sadness present in depressive states. For
instance, classical compositions such as Chopin’s Nocturnes are linked to
improved mood, thereby reducing depressive symptoms. With just five sessions of
classical-music therapy, depression and behavioral problems greatly lessened in
people with ADRD [dRLSBGCGA22].
As previously addressed, apathy, or the lack of interest/motivation, is another
challenging condition frequently recorded in AD. Classical music possesses the
ability to stimulate emotional responses by releasing certain neurotransmitters
such as dopamine and serotonin, which can rekindle attention, engagement,
and a general interest in daily activities. After a 12-week classical music
therapy intervention on apathy, AD patients demonstrated a marked improvement
in not just interest levels but also depression, orientation, anxiety,
and aggression [dRLSBGCGA22].
Finally, sleep disruption is also a common problem among older adults, es-
pecially individuals with AD. Quality sleep and a controlled circadian rhythm
serve important restorative functions and indirectly influence our core body
temperature and even the body’s melatonin and cortisol levels [A.21]. Calming,
tailored classical music improves sleep quality in the elderly population with
ADRD because of reduced stress levels and modulated arousal levels [A.21].
Classical music’s therapeutic effects are attributed to a range of
neurological mechanisms in AD. Music can actively engage many brain regions
such as the amygdala and the hippocampus of the limbic system even in the more
advanced stages of this deadly disease. From alleviating depression, apathy,
and agitation to improving sleep patterns, classical music therapy should be
encouraged and conducted by trained professionals in AD senior homes in order
to offer a measure of comfort in the midst of uncertainty.

4.2 The Cognitive Benefits of Classical Music in Alzheimer’s


AD presents a series of challenges to both neuropsychiatric and cognitive
function, gradually stripping away an individual’s memories and reasoning abilities.
Cognitive functions refer to anything related to the processes of thinking,
learning, problem-solving, etc. While no cure for AD exists, ongoing research
sheds light on the therapeutic influence of classical music on the cognitive
symptoms of AD, especially through the Mozart effect.
When we recognize a familiar tune or common melody in public, we are quick
to connect that song to a moment from our past. In cases of
AD, musical memory is usually the last to erode compared to semantic and
episodic memory. For instance, a 92-year-old woman with dementia was able
to recall the three movements of Beethoven’s Moonlight Sonata, which runs
14-16 minutes in length. This case is consistent with the finding that brain
regions associated with musical memory are the last to be disrupted in
Alzheimer’s.
There exists a functional neuroanatomical foundation for the vulnerability of
all memories in AD including musical memory. Therefore, musical memory
serves as an informative concept of the neural networks harmed in AD. Patients
with AD perform better on tasks involved in recognition memory and semantic
memory when classical music is accompanied by a spoken recording [A.10]. As
previously mentioned, music processing involves a matrix of neural networks,
recruiting from many brain areas such as the basal ganglia, hypothalamus, and
amygdala [A.10]. By stimulating memory circuits in the brain, classical music
underscores the importance of music therapy for AD patients.
Finally, the deterioration of reasoning skills such as spatial reasoning is
another major cognitive symptom of AD. Spatial reasoning is defined as the ability
to understand and manipulate visual information and stimuli, and it plays a sig-
nificant role in problem solving and other daily tasks. Even though AD presents
hurdles for an AD patient’s reasoning ability, the science behind classical music’s
intricate mechanisms can engage various brain areas involved in reasoning and
comprehension. After listening to Mozart’s Sonata for Two Pianos for just 10
minutes, AD patients demonstrated significantly better spatial reasoning, which
is a common outcome of the Mozart effect [S.01]. Furthermore, several
physiological pathways are activated in response to classical music stimuli,
which modulate body responses such as improved reasoning [M.14]. These results,
however, are not demonstrated through the Mozart effect alone. The Schubert
effect is another widely discussed classical music phenomenon similar to the
Mozart effect, which also results in better performance on spatial tasks after a
period of time [M.14].
In the realm of AD, where cognitive decline lowers a patient’s quality of
life, classical music emerges as a promising means to enhance memory and
reasoning skills. Even though this intervention is not pharmacological in
nature, the symphonies of classical music have been shown to provide solace and
relief for both the neuropsychiatric and cognitive symptoms of AD.

5 Conclusion
Alzheimer’s is one of the great medical mysteries in the healthcare field, with
unanswered questions regarding its etiology, pathology, and diagnosis. Nonethe-
less, it is clear that various genetic and environmental risk factors such as age,
PSEN mutations, gender, obesity, T2D, and CVD are somewhat responsible for
the progression and development of AD. In terms of AD’s pathology, several
distinctive signs such as AB plaques, NFTs, elevated ROS levels, and oxidative
stress are extremely prominent.
After a thorough examination of the effects of classical music on the
progression of Alzheimer’s, it is evident that implementing classical music
therapy in senior homes and assisted living care centers is of utmost
importance. The researched, observed effects of classical music intervention
include enhanced memory, reduced NPS, and improved cognitive abilities.
Classical music therapy’s ability to alleviate NPS such as depression and
agitation as well as cognitive symptoms, including learning deficits and reduced
spatial reasoning, underscores the value of music as a non-pharmacological
intervention that effectively improves an AD patient’s quality of life alongside
existing treatments.

References
[2A 18] About alzheimer’s disease: Promoting health and indepen-
dence for an aging population. Centers for Disease Control
and Prevention, 2018.

[3A 15] Amyloid plaques and neurofibrillary tangles. Alzheimer’s
Disease Research at BrightFocus Foundation, 2015.

[702 1] Study reveals how apoee4 gene may increase risk for demen-
tia. National Institute on Aging, 2021.

[A.10] Simmons-Stern N. R. Budson A. E. Ally B. A. Music as a
memory enhancer in patients with alzheimer’s disease.
Neuropsychologia, 2010.

[A.13] Mergenthaler P. Lindauer U. Dienel G. A. Meisel A. Sugar
for the brain: the role of glucose in physiological and
pathological brain function. Trends in neurosciences, 2013.
[A.17] Pizzino G. Irrera N. Cucinotta M. Pallio G. Mannino F. Ar-
coraci V. Squadrito F. Altavilla D. Bitto A. Oxidative stress:
Harms and benefits for human health. Oxidative medicine
and cellular longevity, 2017.

[A.18] Weller J. Budson A. Current understanding of alzheimer’s
disease diagnosis and treatment. F1000Research, 2018.

[A.20] Lloret M. A. Cervera-Ferri A. Nepomuceno M. Monllor P.
Esteve D. Lloret A. Is sleep disruption a cause or
consequence of alzheimer’s disease? reviewing its possible role
as a biomarker. International journal of molecular sciences,
2020.

[A.21] Petrovsky D. V. Ramesh P. McPhillips M. V. Hodgson N.
A. Effects of music interventions on sleep in older adults: A
systematic review. Geriatric nursing, 2021.
[C.06] Kumar-Singh S. Theuns J. Van Broeck B. Pirici D.
Vennekens K. Corsmit E. Cruts M. Dermaut B. Wang R.
Van Broeckhoven C. Mean age-of-onset of familial alzheimer
disease caused by presenilin mutations correlates with both
increased Aβ42 and decreased Aβ40. Human mutation, 2006.

[C.17] Aalbers S. Fusar-Poli L. Freeman E. R. Spreen M. Ket CF J.
Vink C. A. Maratos A. Crawford M. Chen X. Gold C. Music
therapy for depression. Cochrane Database of Systematic
Reviews, 2017.

[C.18a] Gonzalez C. Armijo E. Bravo-Alegria J. Becerra-Calixto A.
Mays C. E. Soto C. Modeling amyloid beta and tau pathol-
ogy in human cerebral organoids. Mol Psychiatry, 2018.

[C.18b] Hamlett E. D. Ledreux A. Potter H. Chial H. J. Patterson D.
Espinosa J. M. Bettcher B. M. Granholm A. C. Exosomal
biomarkers in down syndrome and alzheimer’s disease. Free
radical biology medicine, 2018.
[C.19] Demir E. A. Tutuk O. Dogan H. Tumer C. Depression in
alzheimer’s disease: The roles of cholinergic and serotonergic
systems. National Library of Medicine, 2019.

[C.21] Burillo J. Marqués P. Jiménez B. Gonzalez Blanco C. Benito
M. Guillen C. Insulin resistance and diabetes mellitus in
alzheimer’s disease. Cells, 2021.

[D.06] Koelsch S. Fritz T. V Cramon D. Y. Maller K. Friederici A.
D. Investigating emotion with music: an fmri study. Human
brain mapping, 2006.
[D.16] Gordon D. Sounding the alarm on a future epidemic:
Alzheimers disease. UCLA Newsroom, 2016.
[dLMJBRD18] Mosconi L. Rahman A. Diaz I. Wu X. Scheyer O. Hristov H.
W. Vallabhajosula S. Isaacson R. S. de Leon M. J. Brinton
R. D. Increased alzheimer’s risk during the menopause tran-
sition: A 3-year longitudinal brain imaging study. PloS one,
2018.
[dRLSBGCGA22] da Rocha L.A. Siqueira B.F. Grella C.E. Gratao A.C.M. Ef-
fects of concert music on cognitive, physiological, and psy-
chological parameters in the elderly with dementia: a quasi-
experimental study. Dementia neuropsychologia, 2022.

[E.08] Gold C. A. Budson A. E. Memory loss in alzheimer’s disease:
implications for development of therapeutics. Expert review
of neurotherapeutics, 2008.

[E.14] Bottiroli S. Rosi A. Russo R. Vecchi T. Cavallini E. The
cognitive effects of listening to background music on older
adults: processing speed improves with upbeat music, while
memory seems to benefit from both upbeat and downbeat
music. Frontiers in aging neuroscience, 2014.

[E.17] Tonnies E. Trushina E. Oxidative stress synaptic dysfunction
and alzheimer’s disease. Journal of Alzheimer’s disease, 2017.

[E.18] Morgan K. N. Derby C. A. Gleason C. E. Cognitive changes
with reproductive aging perimenopause and menopause. Ob-
stetrics and gynecology clinics of North America, 2018.

[E.19] Lott I.T. Head E. Dementia in down syndrome: unique insights
for alzheimer disease research. Nat Rev Neurol, 2019.

[E.22a] Carotenuto A. Fasanaro A. M. Manzo V. Amenta F. Traini
E. Association between the cholinesterase inhibitor donepezil
and the cholinergic precursor choline alphoscerate in the
treatment of depression in patients with alzheimer’s disease.
Journal of Alzheimer’s disease reports, 2022.

[E.22b] Klein S. Gastaldelli A. Yki-Järvinen H. Scherer P. E. Why
does obesity cause diabetes? Cell metabolism, 2022.

[EI.17] Lukiw WJ. Rogaev EI. Genetics of aggression in alzheimer’s
disease (ad). Front Aging Neuroscience, 2017.

[G.06] Starkstein S. E. Jorge R. Mizrahi R. Robinson R. G. A
prospective longitudinal study of apathy in alzheimer’s dis-
ease. Journal of neurology neurosurgery and psychiatry,
2006.
[G.18] Brzecka A. Leszek J. Ashraf G. M. Ejma M. Avila-Rodriguez
M. F. Yarla N. S. Tarasov V. V. Chubarev V. N. Samsonova
A. N. Barreto G. E. Aliev G. Sleep disorders associated with
alzheimer’s disease a perspective. Frontiers in neuroscience,
2018.
[G.21] Leszek J. Mikhaylenko E. V. Belousov D. M. Koutsouraki E.
Szczechowiak K. Kobusiak-Prokopowicz M. Mysiak A. Diniz
B. S. Somasundaram S. G. Kirkland C. E. Aliev G. The
links between cardiovascular diseases and alzheimer’s disease.
Current neuropharmacology, 2021.
[G.22] Martens Y. A. Zhao N. Liu C. C. Kanekiyo T. Yang A. J.
Goate A. M. Holtzman D. M. Bu G. Apoe cascade hypoth-
esis in the pathogenesis of alzheimer’s disease and related
dementias. Neuron, 2022.

[H.13] Jahn H. Memory loss in alzheimer’s disease. Dialogues in
clinical neuroscience, 2013.
[J.01] Blood A. J. Zatorre R. J. Intensely pleasurable responses to
music correlate with activity in brain regions implicated in
reward and emotion. Proceedings of the National Academy
of Sciences of the United States of America, 2001.

[J.03] Terry A. V. Buccafusco J. J. The cholinergic hypothesis of
age and alzheimer’s disease-related cognitive deficits: Recent
challenges and their implications for novel drug development.
The Journal of Pharmacology and Experimental Therapeu-
tics, 2003.
[J.14] Li X. De Beuckelaer A. Guo J. Ma F. Xu M. Liu J. The
gray matter volume of the amygdala is correlated with the
perception of melodic intervals: a voxel-based morphometry
study. PloS one, 2014.
[J.15] Guerreiro R. Bras J. The age factor in alzheimer’s disease.
Genome medicine, 2015.
[J.17] Gomez Gallego M. Gomez Garcia J. Music therapy
and alzheimer’s disease: Cognitive, psychological, and be-
havioural effects. Neurology, 2017.
[K.15] Klimova B. Maresova P. Valis M. Hort J. Kuca K.
Alzheimer’s disease and language impairments: social inter-
vention and medical treatment. Clinical interventions in ag-
ing, 2015.
[KH12] Brieger K. Schiavone S. Miller Jr. J.F. Krause KH. Reac-
tive oxygen species: from health to disease. Swiss Medical
Weekly, 2012.
[L.05] Gerdner L. Effects of individualized versus classical relax-
ation music on the frequency of agitation in elderly persons
with alzheimer’s disease and related disorders. Cambridge
University Press, 2005.
[L.18] Scheyer O. Rahman A. Hristov H. Berkowitz C. Isaacson R.
S. Diaz Brinton R. Mosconi L. Female sex and alzheimer’s
risk: The menopause connection. The journal of prevention
of Alzheimer’s disease, 2018.
[L.23] Bleibel M. El Cheikh A. Sadier N. S. Abou-Abbas L. The ef-
fect of music therapy on cognitive functions in patients with
alzheimer’s disease: a systematic review of randomized con-
trolled trials. Alzheimer’s Research and Therapy, 2023.
[M.14] Pauwels E. K. Volterrani D. Mariani G. Kostkiewics M.
Mozart music and medicine. Medical principles and prac-
tice : international journal of the Kuwait University Health
Science Centre, 2014.
[M.16a] Ferreira-Vieira T. H. Guimaraes I. M. Silva F. R. Ribeiro F.
M. Alzheimer’s disease: Targeting the cholinergic system.
Current neuropharmacology, 2016.

[M.16b] Lubitz I. Ricny J. Atrakchi-Baranes D. Shemesh C. Kravitz
E. Liraz-Zaltsman S. Maksin-Matveev A. Cooper I. Lei-
bowitz A. Uribarri J. Schmeidler J. Cai W. Kristofikova
Z. Ripova D. LeRoith D. Schnaider-Beeri M. High di-
etary advanced glycation end products are associated with
poorer spatial learning and accelerated Aβ deposition in an
alzheimer mouse model. Aging cell, 2016.

[M.18] Nobis L. Husain M. Apathy in alzheimer’s disease. Current
opinion in behavioral sciences, 2018.

[M.19a] Stegemann T. Geretsegger M. Phan Quoc E. Riedl H.
Smetana M. Music therapy and other music-based inter-
ventions in pediatric health care: An overview. Medicines
Basel Switzerland, 2019.

[M.19b] Tiwari S. Atluri V. Kaushik A. Yndart A. Nair M.
Alzheimer’s disease: pathogenesis, diagnostics, and thera-
peutics. Dovepress, 2019.

[M.20] Gomez W. Morales R. Maracaja-Coutinho V. Parra V. Nassif M.
Down syndrome and alzheimer’s disease: common molecular traits
beyond the amyloid precursor protein. Aging, 2020.
[M.21a] Fortea J. Zaman S. H. Hartley S. Rafii M. S. Head E.
Carmona-Iragui M. Alzheimer’s disease associated with
down syndrome: a genetic form of dementia. The Lancet.
Neurology, 2021.

[M.21b] Soufineyestani M. Khan A. Sufineyestani M. Impacts of music
intervention on dementia: A review using meta-narrative method and
agenda for future research. Neurology international, 2021.
[M.22] Eickhoff S. Franzen L. Korda A. Rogg H. Trulley V. N. Borg-
wardt S. Avram M. The basal forebrain cholinergic nuclei
and their relevance to schizophrenia and other psychotic dis-
orders. Frontiers in psychiatry, 2022.

[ML14] Barbagallo M. and Dominquez J. L. Type 2 diabetes mellitus
and alzheimer’s disease. Baishideng Publishing Group, 2014.

[MM21] Geula C. Dunlop S. R. Kawles A. S. Flanagan M. E. Gefen
T. Mesulam M.-M. Basal forebrain cholinergic system in the
dementias: Vulnerability, resilience, and resistance. Journal
of Neurochemistry, 2021.

[O.14] Gouras G. K. Olsson T. T. Hansson O. β-amyloid peptides
and amyloid plaques in alzheimer’s disease. Neurotherapeu-
tics, 2014.

[O.18] Gulisano W. Maugeri D. Baltrons M. A. Fà M. Amato A.
Palmeri A. D’Adamio L. Grassi C. Devanand D. P. Honig L.
S. Puzzo D. Arancio O. Role of amyloid-β and tau proteins
in alzheimer’s disease: Confuting the amyloid cascade. 2018.
[P.19] Fernandez N. B. Trost W. J. Vuilleumier P. Brain networks
mediating the influence of background music on selective at-
tention. Social cognitive and affective neuroscience, 2019.

[R.97] Blacker D. Haines J. L. Rodes L. Terwedow H. Go R. C. Harrell L. E.
Perry R. T. Bassett S. S. Chase G. Meyers D. Albert M. S. Tanzi R.
Apoe-e4 and age at onset of alzheimer’s disease. American Academy of
Neurology Journal, 1997.
[R.12] Kopan R. Notch signaling. Cold Spring Harbor perspectives
in biology, 2012.
[R.18] Kučikienė D. Praninskienė R. The impact of music
on the bioelectrical oscillations of the brain. Acta medica
Lituanica, 2018.

[R.20] Breijyeh Z. Karaman R. Comprehensive review on
alzheimer’s disease: Causes and treatment. Molecules, 2020.

[R.22] Mangalmurti A. Lukens J. R. How neurons die in alzheimer’s
disease: Implications for neuroinflammation. Current opin-
ion in neurobiology, 2022.

[rSJ17] Kelleher R. J. 3rd Shen J. Presenilin-1 mutations and
alzheimer’s disease. Proceedings of the National Academy
of Sciences of the United States of America, 2017.

[S.01] Jenkins J. S. The mozart effect. Journal of the Royal Society
of Medicine, 2001.

[S.11] Lyketsos C. G. Carrillo M. C. Ryan J. M. Khachaturian
A. S. Trzepacz P. Amatniek J. Cedarbaum J. Brashear R.
Miller D. S. Neuropsychiatric symptoms in alzheimer’s dis-
ease. Alzheimer’s dementia : the journal of the Alzheimer’s
Association, 2011.
[S.12] De Strooper B. Iwatsubo T. Wolfe M. S. Presenilins and
γ-secretase: structure, function, and role in alzheimer disease.
Cold Spring Harbor perspectives in medicine, 2012.

[S.16] Love S. Miners J. S. Cerebrovascular disease in aging and
alzheimer’s disease. Acta neuropathologica, 2016.

[S.18] Hampel H. Mesulam M. M. Cuello A. C. Farlow M. R. Giacobini E.
Grossberg G. T. Khachaturian A. S. Vergallo A. Cavedo E. Snyder P. J.
Khachaturian Z. S. The cholinergic system in the pathophysiology and
treatment of alzheimer’s disease. Brain : a journal of neurology, 2018.
[S.19a] Dulawa S. C. Janowsky D. S. Cholinergic regulation of
mood: from basic and clinical studies to emerging therapeu-
tics. Molecular psychiatry, 2019.

[S.19b] Yu R. Topiwala A. Jacoby R. Fazel S. Aggressive behaviors
in alzheimer disease and mild cognitive impairment: Sys-
tematic review and meta-analysis. The American journal of
geriatric psychiatry : official journal of the American Asso-
ciation for Geriatric Psychiatry, 2019.

[T.17] Sarkamo T. Cognitive, emotional, and neural benefits of
musical leisure activities in aging and neurological rehabili-
tation: A critical review. Science Direct, 2017.

[T.21] Serrano-Pozo A. Das S. Hyman B. T. Apoe and alzheimer’s
disease: advances in genetics pathophysiology and therapeu-
tic approaches. The Lancet. Neurology, 2021.

[V.18] Nawaz R. Nisar H. Voon Y. V. The effect of music on human
brain; frequency domain and time series analysis using
electroencephalogram. IEEE Xplore, 2018.

[X.19] Ding Y. Zhang Y. Zhou W. Lin Z. Huang J. Hong B. Wang
X. Neural correlates of music listening and recall in the hu-
man brain. The Journal of neuroscience : the official journal
of the Society for Neuroscience, 2019.
[Z.13] Gu L. Guo Z. Alzheimer’s Aβ42 and Aβ40 peptides form
interlaced amyloid fibrils. Journal of neurochemistry, 2013.

The Effects of Classical Music Intervention on the
Neuropsychiatric and Cognitive Mechanisms of
Alzheimer’s Disease Patients

Nyneishia Janarthanan
October 17, 2023

Abstract
Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder,
presenting a profound challenge to both neuropsychiatric and cognitive
well-being. AD is the sixth leading cause of death in the United States and
currently has no cure. This gap draws attention to both the pharmacological
and nonpharmacological therapeutic interventions that could be incorporated
into an AD patient’s course of treatment. Among these is classical music, a
promising nonpharmacological intervention for AD patients. Examining the
relationship between classical music and the neuropsychiatric and cognitive
mechanisms of AD reveals the effects of classical music on memory, spatial
reasoning, depression, sleep disorders, and other AD symptoms. Concepts such
as the Mozart effect suggest ways to improve the quality of life of
individuals diagnosed with AD. Moreover, the activation of the brain and the
alteration of various brain structures give rise to the diverse effects of
classical music in healthcare and neurological settings.

∗ Advised by: Professor Arij Daou of the University of Chicago

1 Introduction
Classical music traces its origins to the middle of the 18th century, serving
not only as an art form but also as a wordless language of its own. Its
purpose has since evolved in many ways: music as a pain reliever, the
beneficial role of music in exercise and sport, musical leisure activities in
aging rehabilitation, and more. One application that often remains overlooked
is the effect of classical music on the symptoms of Alzheimer’s Disease (AD).
Even though neuroscience is a highly studied and researched field, many
questions remain unanswered about music’s effect on the brain and limbic
system of an AD patient. Alzheimer’s – a neurological disease resulting from
neuronal degeneration – ranks among the leading causes of death worldwide.
Its hallmark symptoms include memory loss, cognitive decline, disorientation,
aggression, depression, and a common inability to perform everyday tasks. The
prevalence of this disorder is steadily increasing and is expected to rise
sharply in the coming decades.

Figure 1: Worldwide Projections of AD Prevalence, 2005-2050 [D.16].

In 2020, over 40 million individuals worldwide and nearly 6 million Americans
were living with AD [D.16]. These staggering figures demonstrate not just the
national threat but also the global importance and distressing impact of this
disease [C.18a]. Offering a glimmer of hope amid the bleak landscape of
cognitive decline, research indicates that classical music acts as a medium
that transcends the memory loss associated with AD. In fact, classical music
therapy functions as a treatment modality by improving learning,
communication, mobility, and other mental and physical functions [J.17]. All
of these skills are severely affected over the course of AD and other forms
of dementia. This research paper aims to highlight the benefits of applying
classical music interventions to improve an AD patient’s quality of life.
Moreover, such an intervention has the potential to produce meaningful
results during the pathogenesis and treatment of AD.

2 Alzheimer’s Disease
Alzheimer’s Disease (AD) – currently ranked as the sixth leading cause of death
in the United States – is a progressive neurodegenerative disorder. It lacks a
definitive cure and primarily impairs cognitive functions such as memory,
behavior, and thinking. AD is the most prevalent form of dementia, which is
characterized by a gradual decline in two or more domains of cognition such as
memory, language, behavior, and executive function [A.18]. Neuritic plaques
and neurofibrillary tangles are AD’s hallmark indications, that is, features
measured throughout the progression or regression of a disease [R.20]. The
condition was first comprehensively described by Alois Alzheimer in 1906 as a
“peculiar severe disease process of the cerebral cortex”. Healthcare costs for
AD are estimated at approximately $500 billion yearly, covering everything
from treatments to routine checkups.

2.1 Etiology of Alzheimer’s Disease


The etiological pathway – or the set of common causes for AD – divides into
genetic and environmental factors. The predominant genetic risk factors for
AD discussed here are age, presenilin mutations, Down Syndrome (Trisomy 21),
and sex. A genetic factor is defined as a component that increases the
likelihood of developing a particular disease depending on an individual’s
genetic makeup. The greatest genetic risk factor for AD is advanced age,
typically after the age of 65 [J.15].

Figure 2: Projected Number of People Aged 65 or Older With Late-Onset
Alzheimer’s Disease, by Age Group, US, 2010-2050 [A18A 182A 18 ].

Late-onset or sporadic Alzheimer’s is the most common type of AD; signs begin
to appear after a person’s mid-60s. Figure 2 shows the projected prevalence of
late-onset AD after the age of 65. The condition is already common and is
projected to become more so in the near future. Early-onset or familial AD, by
contrast, is relatively rare and is usually caused by gene changes passed down
from a parent to their child. Signs first appear between an individual’s 30s
and mid-60s. In familial AD, nearly half of the cases are due to mutations in
three genes: amyloid precursor protein (APP), Presenilin-1 (PSEN1), and
Presenilin-2 (PSEN2) [J.15]. Presenilin (PSEN) mutations are discussed in more
detail below. It is important to note that findings in early-onset familial
cases are expected to translate to sporadic (no specific family link)
late-onset AD.
Aging is the main risk factor for AD, yet it cannot be explained by the
popular amyloid hypothesis, which asserts that amyloid-beta plaques are the
central feature of the disease. An alternative perspective links aging and AD
through the APOE ε4 allele, which remains the most robust genetic risk factor
for sporadic AD. The risk conferred by APOE ε4 is mostly observed in the 61-65
age group, which supports the statement that symptoms of late-onset AD first
appear around 65 years of age [R.97]. One copy of APOE ε4 is carried by
approximately 25% of individuals, but inheriting this allele does not mean
that a person will certainly develop AD [212 1702 1 ]. The APOE ε4, APOE ε3,
and APOE ε2 alleles all play a significant role in the onset and progression
of AD, but APOE ε4 greatly increases the risk of the disease compared to its
counterparts [G.22]. Moreover, the pathogenic effects of the APOE genotypes
have been researched well beyond amyloid-beta plaques and Tau neurofibrillary
tangles, providing potential answers to the age-related progression of AD
[T.21]. First, APOE ε4 is associated not only with AD but also with other
conditions such as age-related cognitive decline and Lewy Body Dementia (LBD)
[G.22]. Second, the APOE Cascade Hypothesis connects aging to an increased
risk of AD by stating that the biochemical and biophysical characteristics of
APOE ε4 at the cellular level cause a multitude of downstream effects observed
in AD [T.21].

Figure 3: APOE ε4 Cascade Hypothesis Demonstrated Through 4 Phases [G.22].

The cascade – or the successive progression of APOE ε4 effects – begins at the
biochemical and cellular phase, as shown in Figure 3. Properties of the allele
such as lipidation and receptor binding have harmful impacts on some cell
processes, which could accumulate into cellular stress and eventually lead to
the onset of age-related cognitive decline and AD. Aging and AD are
interconnected but distinct. The number of neurons does not change
substantially in normal aging, whereas neuronal and synapse loss is a key
indication of AD. Nevertheless, the increased risk of AD with age in carriers
of the APOE ε4 allele is a heavily researched topic within the field and
appears to hold a major connection to sporadic AD.
Another widely discussed genetic factor for AD is mutation of the PSEN1 gene,
which encodes the Presenilin-1 (PS1) protein. In early-onset or familial
Alzheimer’s Disease (FAD), PSEN1 mutations account for nearly 90% of all
recorded mutations, illustrating the significance of this gene and its protein
product [rSJ17]. The presenilin hypothesis proposes that these deleterious
mutations reduce essential presenilin functions in the brain, triggering both
neurodegeneration and dementia in FAD [rSJ17]. Another component to discuss
with regard to PSEN1 and PS1 is the γ-secretase enzyme, whose catalytic
subunit is PS1. More specifically, γ-secretase cleaves different types of
transmembrane proteins in the plasma membrane of a cell, including the amyloid
precursor protein (APP) – a central element of AD. γ-Secretase produces two
types of amyloid-beta peptides in AD: Aβ42 and Aβ40. The only difference
between the two is that Aβ42 has two extra residues at its C-terminus [Z.13].
It has been proposed that Aβ42 and Aβ40 are heavily responsible for AD since
they accumulate into one of the hallmark pathological indications of this
disease – Aβ plaques. However, PSEN1 mutations did not increase both peptides;
instead, both decreased (especially Aβ40), which elevated the Aβ42/Aβ40 ratio
[rSJ17]. This ratio is a useful diagnostic marker of AD, particularly because
the ways in which PSEN1 mutations affect APP and Aβ plaques are complex and
not yet fully understood [C.06]. γ-Secretase clearly produces the final Aβ
peptides involved in AD, but it also regulates Notch signaling, which governs
cell proliferation, cell fate, differentiation, and cell death [R.12].
Therefore, pharmacological interventions such as drug therapy attempt to alter
Aβ production without interfering with γ-secretase’s role in Notch signaling
[S.12].
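
The direction of the ratio shift described above follows from simple
arithmetic: when both amyloid-beta peptides decrease but the 40-residue form
decreases proportionally more, the ratio of the 42-residue to the 40-residue
form rises. The short Python sketch below illustrates this. The function name
`ab_ratio` and every concentration value are hypothetical and chosen only for
illustration; they are not data from [rSJ17] or [C.06].

```python
def ab_ratio(ab42: float, ab40: float) -> float:
    """Return the 42-residue / 40-residue amyloid-beta ratio.

    Both inputs must be concentrations in the same (arbitrary) units.
    """
    if ab40 <= 0:
        raise ValueError("ab40 must be positive")
    return ab42 / ab40


# Hypothetical, illustrative concentrations in arbitrary units (not measured data).
baseline = ab_ratio(ab42=10.0, ab40=100.0)  # no PSEN1 mutation
mutant = ab_ratio(ab42=8.0, ab40=40.0)      # both peptides lower, 40-residue form much lower

print(f"baseline ratio: {baseline:.2f}")
print(f"mutant ratio:   {mutant:.2f}")
```

In this made-up example both absolute levels fall, yet the ratio doubles, which
is why a ratio can carry diagnostic information that either level alone does
not.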
The next genetic factor for AD is Down Syndrome (DS) or Trisomy 21: a genetic
disorder caused by the presence of an extra copy of all or part of
Chromosome 21. It is characterized by craniofacial abnormalities, heart
defects, cognitive impairments, and neurological alterations [M.20]. With over
200,000 cases in the United States alone, DS is one of the leading genetic
risk factors for FAD. Furthermore, clinical and biomarker changes in
DS-associated FAD demonstrate that many of the same cortical regions, such as
the hippocampus and the prefrontal cortex, are affected in both conditions
[M.21a]. As they progress, both conditions share similar cellular dysfunctions
such as impaired autophagy, reduced and/or damaged lysosomal activity, and
mitochondrial dysfunction [M.20].

Figure 4: Pathological Indications of DS Common in AD [E.19].

By the age of 40, neurofibrillary tangle (NFT) and Aβ accumulation is present
in the brains of individuals with DS, which is sufficient to confirm a
pathological diagnosis of AD [C.18b]. Figure 4 shows that progressive brain
inflammation can emerge as early as the late teenage years in DS, based on
recorded intracellular accumulations of Aβ. The early appearance of AD’s
hallmark indications in individuals with DS can be explained by the presence
of neuron-derived exosomes, which are tiny extracellular vesicles that contain
elevated levels of both Aβ peptides and the hyperphosphorylated Tau protein
[C.18b]. Since exosomes can be measured in blood as biomarkers, their
development can be monitored, which informs future AD diagnostics, prevention,
and potential treatments in the DS population as well as the general
population.
Finally, sex is an important genetic risk factor for AD, with almost two
thirds of the late-onset AD population being women [dLMJBRD18]. This
difference cannot be attributed simply to women’s greater longevity compared
to men, because AD pathology starts many years before the appearance of most
clinical symptoms [L.18]. Instead, there is increasing evidence that the
perimenopause to menopause transition (PTMT) – a midlife neuroendocrine
transition specific to women – is heavily responsible for the sex-specific
pathophysiological mechanisms underlying AD [dLMJBRD18]. PTMT is strongly
neurological in nature; it disrupts the systems regulating estrogen and
affects thermoregulation, circadian rhythm, sleep, depression, and even
cognition [dLMJBRD18]. During PTMT, estrogen, progesterone, pituitary,
hypothalamic, and ovarian hormone levels fluctuate and decrease. Estrogen,
specifically, is active in many areas of the brain controlling memory and
cognitive function, indicating its neurological significance [E.18]. When the
brain’s estrogen network disconnects from other brain areas, the resulting
hypometabolic state creates conditions for neurological dysfunction [L.18]. In
fact, perimenopausal (PERI) and postmenopausal (MENO) women show major declines
in estrogen-dependent memory tests compared to men, which is the first
indication that PTMT can trigger cognitive decline in the female population
[dLMJBRD18]. Secondly, the MENO and PERI groups showed higher rates of decline
in the cerebral metabolic rate of glucose (CMRglc) compared to males and
premenopausal (PRE) women. Glucose is necessary to provide the precursors for
neurotransmitter synthesis and to fuel the production of adenosine
triphosphate (ATP), the main energy carrier at the cellular level [A.13]. With
a noticeable decrease in glucose metabolism, neurological function is severely
disrupted in PERI and MENO individuals. In essence, decreased estrogen levels
and the deterioration of the pathway that affects CMRglc help explain the
higher percentage of women developing AD.
In addition to genetic risk factors, various environmental predisposing
factors contribute to AD, such as Type 2 diabetes (T2D), also called Type 2
diabetes mellitus (T2DM), obesity, and cerebrovascular disease.
Firstly, the interplay between diabetes, obesity, and AD highlights the
complex relationship between lifestyle factors and the risk of cognitive
decline. Obesity is characterized by an excessive accumulation of body fat and
is commonly assessed using the Body Mass Index (BMI). Obesity can, in turn,
trigger the development of T2D, and the risk of acquiring this disease grows
linearly with BMI [E.22b]. T2D – representing 90-95% of diabetic cases – is a
metabolic disease characterized by chronic hyperglycemia due to pancreatic
cell failure [C.21]. By itself, hyperglycemia, or high blood glucose, can
contribute to molecular, biochemical, and histopathological lesions in AD
[ML14]. Yet the main focus when researching the connection between T2D and AD
is insulin resistance – the body’s reduced responsiveness to the hormone
insulin, which results in increased blood sugar. The hyperglycemic status of
T2D patients due to insulin resistance disturbs neuronal homeostasis and
affects K-ATP channels, which increases Aβ peptide levels [C.21]. In addition,
elevated blood glucose and the dysregulation of glucose drive an unregulated
non-enzymatic reaction between carbohydrates (such as sugars) and the free
amino groups (-NH2) of several proteins, lipids, and nucleic acids, which
produces advanced glycation end-products (AGEs) [C.21]. High levels of AGEs
elicit inflammatory reactions in the brain and are associated with poorer
memory and higher hippocampal levels of insoluble Aβ42 [M.16b]. AGEs promote
Aβ plaque and neurofibrillary tangle formation more in AD patients with T2D
than in non-diabetic AD patients [C.21]. The two main hallmark indications of
AD – Aβ plaques and neurofibrillary tangles – are discussed in detail in the
pathology section.
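
As a brief illustration of how BMI is obtained, the Python sketch below
applies the standard definition: body weight in kilograms divided by the
square of height in meters. The function name `bmi`, the example person, and
the use of the common adult cutoff of 30 kg/m^2 for obesity are assumptions
added here for illustration and do not come from [E.22b].

```python
OBESITY_CUTOFF = 30.0  # assumed adult obesity cutoff in kg/m^2 (common convention)


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical adult used only as an example: 95 kg, 1.75 m tall.
example_bmi = bmi(95.0, 1.75)
is_obese = example_bmi >= OBESITY_CUTOFF

print(f"BMI = {example_bmi:.1f} kg/m^2, above obesity cutoff: {is_obese}")
```

BMI is a screening measure based only on weight and height, so it approximates
rather than directly measures body fat.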
Another environmental risk factor for AD is cerebrovascular disease (CVD) – a
type of cardiovascular disease that harms the blood vessels supplying the
brain. CVD is the most frequent type of life-threatening injury to the brain
and is the fifth most common cause of death. CVD and AD share many of the same
risk factors such as the APOE ε4 allele, T2DM, obesity, and age [S.16]. These
include some of the genetic and environmental risk factors of AD discussed
above, which suggests that the origins of CVD are pathologically and
environmentally similar to those of AD. In AD, Aβ plaques accumulate in the
extracellular space around neurons; in CVD patients, Aβ builds up in the
cerebral arterioles and capillaries that supply the brain. Most AD patients
have Aβ angiopathy resulting from CVD, which predominantly affects the cerebral
leptomeninges, cortex, cerebellum, and brain stem [S.16]. Capillary Aβ
angiopathy is detected in 35-45% of AD cases, which provides robust evidence
supporting the hypothesis that CVD can contribute to the symptoms of AD
through a synergistic relationship between the two diseases [G.21].

2.2 Pathology of Alzheimer’s Disease


The principal pathological indications of AD include amyloid beta (Aβ)
plaques, neurofibrillary tangles containing aggregates of the Tau protein,
neuroinflammation, and oxidative stress [S.18]. All neurodegenerative diseases
involve the eventual degradation of neurons in the brain, and this trend is
especially evident in the progression of AD. The exact path by which neuronal
death occurs is unclear, and many theories address this question. Apoptosis,
the process of programmed cell death that eliminates unwanted cells, is the
most extensively studied mechanism of neuronal loss in AD. Yet the Aβ peptide
thought to drive neuronal apoptosis is not detected in many post-mortem tissue
specimens of AD patients [R.22]. Even though other explanations for neuronal
death have been researched, such as necrosis, necroptosis, and pyroptosis, the
pathological mechanisms underlying neuronal death and dysfunction in AD are
not yet fully understood. Nevertheless, the loss of neurons undoubtedly
constitutes the basis of progression for this disease. Regarding neuronal
loss, considerable attention has focused on cholinergic neurons, which release
the neurotransmitter acetylcholine (ACh), a molecule with a crucial role in
both the peripheral and central nervous systems [M.16a].

Figure 5: The Cholinergic Hypothesis & Release of ACh [J.03]

The early AD neuron in Figure 5 exhibits alterations in choline uptake,
impaired ACh release, deficits in the nicotinic (nAChR) and muscarinic (M2AChR
and M1AChR) receptors and their functions, and deficits in axonal transport.
The decrease in the number of symbols and the reduced color intensity in the
legend illustrate that many proteins, enzymes, and essential cell structures
in AD become defective or are missing altogether, which eventually kills the
neuron. Nearly all brain regions are innervated by cholinergic neurons, which
are responsible for processes related to learning, memory, and attention. The
progressive degeneration of the basal forebrain cholinergic neurons (BFCNs),
for instance, is correlated with the harmful symptoms and memory deficits that
AD imposes on a patient [M.16a]. In fact, BFCNs provide the main cholinergic
input to the prefrontal cortices along with other crucial structures such as
the amygdala (responsible for evoking emotions) and the hippocampus, which
plays a significant role in long-term memory function and memory consolidation
[M.22]. Pathologically, there are major depletions of the cholinergic
synthetic enzyme choline acetyltransferase (ChAT) and the cholinergic
hydrolytic enzyme acetylcholinesterase (AChE) in and around BFCNs [MM21]. ChAT
synthesizes ACh from choline and acetyl-CoA, whereas AChE breaks ACh down into
choline and acetate. Together, these enzymes and the ACh neurotransmitter play
an essential role in the nervous system by regulating cell signaling and
enabling effective communication among neighboring neurons. The dramatic loss
of ChAT and AChE activity in a considerable number of AD cases strongly
supports the claim that BFCN degeneration is a foundation of the cholinergic
theory of AD.
The more prominent hallmarks of AD, which contribute to the loss of neurons
and their synapses, are the extracellular neuritic plaques containing the
amyloid-beta (AB) protein and the intracellular neurofibrillary tangles (NFTs)
carrying the hyperphosphorylated Tau protein.

Figure 6: AB plaques and NFTs in Healthy vs. Affected Neuronal Cavity
[A15A 153A 15 ]

In a healthy neuron, there are almost no visible AB plaques on the exterior of
the cell, and there are no NFTs present either. This allows the cell to
communicate effectively with surrounding neurons and to operate intracellularly.
On the other hand, the neurons in a brain with Alzheimer’s have AB plaques
surrounding the cell and NFTs present in the soma (cell body) of the neuron.
As a result, neuronal function is disrupted, which eventually contributes to the
cell’s death.

Figure 7: A schema of amyloid precursor protein (APP) cleaved to form AB
plaques [O.14]

It is important to note that AB is a regular peptide produced in the body, but
AB plaques specifically are a neuropathological hallmark of AD [O.14].
The amyloid precursor protein (APP), located on a cell’s plasma membrane,
undergoes cleavage by β-secretases and γ-secretases to produce insoluble AB
fibrils. All mutations in APP linked to AD are within the AB peptide,
β-cleavage, or γ-cleavage sites. As seen in Figure 7, APP is cleaved by the
β-secretase to form C-terminal fragment β (β-CTF) and then cleaved once more by
γ-secretase to produce AB, the prime component of the AB plaques seen in
Alzheimer’s. In the non-amyloidogenic pathway, which is considered less toxic
than the amyloidogenic pathway, the α-secretase cleaves APP at a different site
and yields fragments that are less likely to form extracellular plaques. The
major difference lies in where the cleavage occurs and which enzyme carries it
out, and this determines whether the resulting product is toxic or non-toxic.
In AD, toxic AB fibrils diffuse into the synaptic clefts and interfere with
the necessary cell signaling [M.19b]. Because these AB fibrils cannot dissolve
in water, they aggregate into the plaques that are present in AD. Furthermore,
the polymerization of the AB protein into long strands of polypeptides leads to
the activation of a group of enzymes named kinases, which hyperphosphorylate
the microtubule-associated Tau protein. The gradual build-up of
hyperphosphorylated Tau eventually leads to the formation of intracellular NFTs
that disrupt neural communication. The proposed link between AB plaques and the
aggregation of Tau has dominated AD research as the “Amyloid Cascade
Hypothesis”. This hypothesis states that the abnormal build-up of AB is the
primary event in AD, triggering Tau pathology and subsequently neuronal
death [O.18].
Another consequence of the accumulation of AB plaques and the Tau-associated
NFTs is the increased production of reactive oxygen species (ROS). Since
oxygen is an extremely electronegative element, it readily accepts free-flowing
electrons generated by mitochondrial oxidative metabolism, which produces
ROS. These species range from radical anions, which have unpaired valence
electrons, to non-radical compounds such as hydrogen peroxide. The radical
species will readily accept an electron to transition into a stable state with
a full octet in their outer shell. However, until they reach stability, ROS at
high concentrations will react almost immediately with the four major
macromolecules in the body: lipids, proteins, carbohydrates, and nucleic
acids [KH12]. On one hand, ROS are crucial in physiological processes such as
redox regulation and transcription of DNA. On the other hand, they may also
induce undesirable effects and even irreversible outcomes such as the
aggravation of Alzheimer’s.

Figure 8: Levels of ROS and the Resulting Effect [KH12]

When ROS levels substantially decrease or increase past their optimal level,
there can be dangerous consequences such as a lack of signaling when they
decrease or overshoot signaling (a signal that exceeds its target) when they
increase. In the context of AD, increased production of ROS through the
interplay between AB plaques and NFTs raises the risk of oxidative stress:
a condition in which the imbalance of ROS in cells and tissues overwhelms the
body’s ability to detoxify these reactive substances [A.17]. Elevated levels of
ROS leading to oxidative stress can exacerbate age- and disease-dependent
mitochondrial dysfunction, reduce antioxidant defences around synaptic activity,
and disrupt neuronal cell signaling, ultimately leading to cognitive
dysfunction [E.17]. Even in the absence of ROS, AB plaques and high levels of
Tau can independently worsen mitochondrial dysfunction and interrupt cell
communication, which contributes to a perilous cycle of disrupted homeostatic
control in the body.
The intricate pathology of AD, in summary, is marked by the hallmark features
of neuronal and synaptic loss, AB plaques, NFTs, ROS, and oxidative stress.
Together, these interconnected factors contribute to the cognitive
and behavioral decline associated with AD. Ongoing research continues to shed
light on novel aspects of AD pathology, which holds a degree of promise for
lessening the impact of this neurological disorder.

2.3 Neuropsychiatric and Cognitive Symptoms of AD


As Alzheimer’s progresses, the burden of neuropsychiatric and cognitive
symptoms can pose significant challenges for AD patients and caregivers alike.
Neuropsychiatric symptoms (NPS) are non-cognitive disturbances that pertain
to mental health defects and psychiatric disorders. NPS have both neurological
(brain-related) and psychiatric (mental-health-related) aspects. Cognitive
symptoms, on the other hand, refer to impairments in the brain processes
involving thinking, learning, problem-solving, understanding, and
decision-making. Memory, attention, language, executive functions,
perception, and reasoning are just a few concepts that fall under the umbrella
of cognitive processes. The most common forms of NPS in AD are apathy,
depression, aggression/agitation, and sleep disorders.
To begin, apathy is the most persistent and frequent NPS recorded across all
AD stages [S.11]. This symptom is characterized by a lack of motivation for
goal-directed actions and cognitive activity [G.06]. In AD, the primary
psychiatric correlate of apathy is depression, but depression is neither
necessary nor sufficient to produce apathy; 94% of AD patients with some form
of depression had no apathy [G.06]. Apathy is difficult to isolate from the
other NPS and the various symptoms of dementia, but a few neuroimaging
techniques could generate some answers. Many studies point to the involvement
of prefrontal dysfunction and deficits within frontostriatal circuits [M.18].
Areas within these circuits, such as the anterior cingulate cortex (ACC),
prefrontal cortex (PFC), and sections of the basal ganglia, play a pivotal
role in the progression of apathy in AD [M.18].
Depression trails behind apathy as the second most commonly encountered
NPS in AD patients. Depression can be described as an affective disorder
combining elements of despair, apathy, insomnia, unwillingness, anxiety,
incompetence, fear, gloominess, and sadness [C.19]. The additional stress
imposed by depression can severely decrease an AD patient’s quality of life.
In fact, depression occurs in roughly 20-30% of AD patients. Women with AD are
expected to experience a much higher depression rate (almost twice that of
men), indicating that gender plays a role in the depressive state
of a patient [C.19]. It is important to note that cognitive dysfunction and
depression are quite commonly confused with each other.
Even though it is difficult to reach a definitive diagnosis of depression in
AD, the two disorders influence each other through several overlapping
pathological features. For instance, aberrations in cholinergic transmission
are present in both AD and depression, which creates a pathophysiological
intersection [C.19]. As previously discussed, the cholinergic hypothesis of AD
proposes that the deterioration of cholinergic neurons, which release the ACh
neurotransmitter, is primarily responsible for memory loss and learning
deficits. In addition, the ability of the cholinergic system to trigger
depression was suggested in clinical studies nearly 50 years ago [S.19a]. With
respect to changes in the cholinergic system, the hippocampal region is the
crossroads where cognitive deficits meet depressive symptoms [E.22a].

Figure 9: Cholinergic Alterations in Depression and AD [C.19].

Decreases in the cholinergic innervation of the brain reduce hippocampal
neurogenesis and function, which can result in depression. The chances of this
occurring are much higher for an AD patient than for the general population.
Aggression/agitation in AD often stems from communication difficulties,
which can be addressed with appropriate interventions. Aggression exists in a
wide range of forms, such as defensive (fear-induced), predatory, dominance,
inter-male, maternal, isolation-induced, and irritability-associated
aggression [EI.17]. In order to alleviate these types of symptoms in an ethical
manner, doctors and various healthcare providers advise medication use and
physical restraint. However, antipsychotic medications are associated with
adverse side effects such as an increased stroke risk, and physically
restraining an AD patient induces negative psychological and physical
effects [S.19b]. Even though the molecular mechanisms of functional and
pathological agitation in AD remain incompletely understood, promising research
is ongoing. One example is the characterization of the broad spectrum of
brain-abundant neurotransmitters that appear to have both triggering and
preventing effects on aggressiveness [EI.17].
Lastly, sleep disorders are typical symptoms of AD that appear early on in
the disease, which sets them apart from other symptoms such as depression and
apathy that appear later on. There is a wealth of evidence that sleep quality
and duration are crucial for consolidating memory for future retrieval and for
clearing the build-up of AB and hyperphosphorylated Tau in AD patients’
brains [A.20]. Individuals with AD who struggle to keep a normal sleep schedule
exhibit a major alteration in the sleep/wake cycle, with more nighttime
awakenings and more disturbances of nocturnal sleep [A.20]. Sleep disorders
such as sleep breathing disorders and restless leg syndrome negatively alter
the circadian fluctuations of AB in the interstitial brain fluid and
cerebrospinal fluid (CSF) related to the production of AB plaques [G.18]. Such
sleep abnormalities promote the increased production of the pathological Tau
protein and AB plaques. In order to help AD patients adopt healthy sleep
patterns, specific procedures to improve sleep structure and quality are under
continual development. There are a plethora of NPS associated with AD, and
extensive, promising research is being done on alleviating these symptoms.
Turning to cognition, the major cognitive symptom in Alzheimer’s is memory loss.
If asked to name a disease that affects memory, most doctors would probably
choose Alzheimer’s. The six major memory systems are episodic, semantic,
simple classical conditioning, procedural, working, and priming memory. Of
these categories, the deterioration of episodic memory is the most clinically
prominent cognitive symptom in AD [E.08]. Episodic memory is utilized when
consciously recalling a particular episode in one’s life, such as watching a
movie with a family member. Dangers arise from a loss of episodic memory when
AD patients forget whether certain medications have been taken or whether the
stove has been turned off [E.08]. Working memory and long-term, explicit memory
are impacted early in the course of this disease [H.13]. The first brain lesions
unique to AD appear in the poorly myelinated limbic neurons in areas affecting
memory, such as the hippocampus. For instance, hippocampal volume falls
from 2.5 mL to almost 1.6 mL in the brain of an AD patient as the
disease progresses [H.13]. Even with the inclusion of various AD criteria and
hippocampal biomarkers, there remain several barriers in neurological testing
for memory loss in AD.
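To put the hippocampal figures in proportion, the short sketch below converts the drop from 2.5 mL to roughly 1.6 mL into a relative loss. The function name and variable names are illustrative assumptions, and the two volumes are simply the values quoted above from [H.13], not new measurements.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative decrease from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("baseline volume must be positive")
    return (before - after) / before * 100.0


# Volumes quoted in the text (mL); treated here as point values for illustration.
baseline_ml = 2.5
ad_ml = 1.6

loss_pct = percent_reduction(baseline_ml, ad_ml)
print(f"Approximate hippocampal volume loss: {loss_pct:.0f}%")
```

Under these values the hippocampus loses roughly 36% of its volume, that is, more than a third.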
Other cognitive symptoms of AD are impaired problem solving and language
skills. Something as simple as following a recipe or even paying the bills can
become difficult as AD worsens. Language impairments involve a decline in
abilities such as recalling the meaning of words, fitting a word or phrase to
a situation, and word comprehension [K.15]. These two prevalent cognitive
symptoms appear early on in AD, so they are occasionally used
to help diagnose a patient with AD.

3 Music and the Brain


The exploration of non-pharmacological interventions for AD has garnered
significant attention in recent times; however, some promising avenues have been
underemphasized or overlooked. Cognitive decline, NPS, and memory impairment
create a pressing need for a quick and efficient way to mitigate AD’s symptoms
and risks. Amidst this scholarly discussion, music has emerged as a potential
candidate to enhance the quality of life of an Alzheimer’s patient. This
section focuses solely on the relationship between music and the brain.

3.1 Observed Effects of Music on the Brain
The ability of humans to perceive and enjoy music is a universal trait that
originated long ago and is still carried with us. Music is one of the most
powerful and diverse sensory, cognitive, and emotional experiences [T.17]. Our
brains light up when interpreting and perceiving music, and our bodies respond
in several ways, reflecting the powerful connection between music and our
well-being. Not only does music reduce feelings of separation and loneliness,
it also evokes cherished memories and maintains self-esteem, competence, and
independence [T.17]. From a cognitive standpoint, music boosts communicative
abilities, memory, awareness of self and environment, and verbal and non-verbal
expression [M.21b]. The improvement in all these skills originates from
the organ responsible for formulating the very essence of what a human being
is: the brain. Beyond encoding music, different parts of the brain are
engaged depending on the type of music relayed from the auditory cortex.
This concept is demonstrated below in Figure 10.

Figure 10: Brain Areas Engaged Based on Emotion Category of Music [P.19].

Different brain areas are engaged according to the emotion category of music,
shown in different colors: joyous music (red), tense music (yellow), and sad
music (blue). Based on statistically significant results, the type
of musical input is allocated to different parts of the brain connected by a
bilateral fronto-parietal network [P.19]. From a structural, cross-sectional
view of the cerebral cortex, it is clear that music can alter our perception
and emotional response based on the area it activates. For example, music with
a fast tempo and a major mode tends to evoke a positive/happy response, whereas
a slow tempo induces a negative/sad mood. From a functional standpoint,
listening to music over time increases the brain’s alpha waves, which are
associated with relaxation and a calm state of mind [V.18]. The brain’s alpha
waves are also closely linked to human cognition and emotions, which generates
distinct physiological and psychological effects on the body [V.18].
Additionally, music causes the release of certain neurotransmitters, which
evoke important emotions, memories, and feelings. For instance, dopamine
is released in the mesolimbic reward system while listening to music, which
heightens the body’s natural reward sensation. Serotonin, a neurotransmitter
involved in mood regulation and learning, also increases in the presence of
auditory stimuli. Higher concentrations of dopamine and serotonin in the
caudate-putamen and nucleus accumbens (areas linked to reward and motor control)
signify that music has a direct impact on the synaptic activity of these brain
areas and on the amount of neurotransmitters released in a healthy manner.

3.2 Musical Activation of Various Brain Areas


Many structures are engaged in the process of musical activation, which is the
stimulation of the brain’s cerebral cortex into a state of alertness/attention.

Figure 11: Regression Analysis Correlation (rCBF) of Neuroanatomical Regions
Scanned After Exposure to Music [J.01].

With the help of Magnetic Resonance Imaging (MRI) scans, Figure 11
depicts various brain parts engaged during musical activation. These include
the left dorsomedial midbrain (Mb), bilateral cerebellum (Cb), right thalamus
(Th), left ventral striatum (VStr), and the hippocampus/amygdala (H/Am).
The amygdala, especially, plays an important role in the body’s perception of
music. As a part of the brain’s limbic system, the amygdala, a small,
almond-shaped structure, is where the processing of emotions, fear, and
aggression originates. The amygdala’s gray matter volume (GMV) is directly
correlated with an individual’s melodic interval perception, which is the time
taken to differentiate two successive events [J.14]. In music, interval
perception refers to how we perceive the pitch gap between two notes played
successively. This suggests that an optimal amygdala GMV is integral to both
perceiving and processing the emotions generated by music. To further support
this claim, people with a decreased amygdala GMV displayed impaired emotional
responses to music, such as not recognizing sad or fearful music and not
demonstrating a sense of pleasure when listening to pleasant music [J.14].
Another structure of the limbic system is the hippocampus, which is also
activated by the presentation of music. This structure is fundamental in the
consolidation of both short-term and long-term memory and is involved in state
regulation, motivation, defensive behavior, and anxiety in response to specific
stimuli [D.06]. Typically, the hippocampus is more strongly associated with the
processing of unpleasant (permanently dissonant) music than pleasant (consonant)
music [D.06]. Just as some brain structures perceive components like
rhythm and pitch, the hippocampus recognizes harmonious sounds and separates
them from unstable and tense sounds.
In addition, further neuroimaging studies indicate that an overlap in musical
activation occurs in the superior temporal gyrus (STG), middle temporal gyrus,
middle frontal gyrus, parietal lobe, supplementary motor area, and premotor
cortex [X.19]. During music listening, both the left and right brain hemispheres
are activated, and the right temporal cortex is even involved in the perception
of pitch patterns [X.19]. Clearly, music processing is a neurologically
widespread process with differing structures involved in an attentive
brain.

3.3 The Difference Between Classical and Non-classical Music
There has been widespread interest in music’s effect on the brain, and there
exists a plethora of reasons why music is considered nature’s own “medicinal
treatment”. Nevertheless, is this due to music as a whole or, more exactly, to a
particular type of music? This section will analyze the differences between
classical and non-classical music and determine which one offers the most
advantages for the brain and the body.
One concept that addresses classical music’s influence on the brain is the
Mozart effect. This phenomenon is observed as an improvement in certain
brain functions and abilities from repeatedly listening to classical pieces
composed by the popular 18th-century composer, Wolfgang Amadeus Mozart. In a
study where subjects listened to Mozart’s Sonata for Two Pianos, participants
demonstrated significantly greater spatial-reasoning skills compared to periods
of listening to relaxation instructions designed to decrease
blood pressure [S.01]. Spatial reasoning is simply the ability to comprehend
spatial relationships and effectively draw conclusions from them. Sonata for
Two Pianos has also been reported to markedly increase relative alpha band
power, which reflects stronger alpha wave patterns in the brain [R.18]. Once
again, an increase in alpha wave patterns is associated with
an alert and relaxed state of mind.
Nevertheless, Mozart’s music is not the only form of classical music that
demonstrates these improvements in brain functions. In one study, participants
listened to either the Adagietto from Gustav Mahler’s Symphony No. 5, white
noise, or no music, and their semantic memory recollection was higher with
the classical piece than with white noise or no music [E.14]. Semantic
memory is a type of long-term memory involved in remembering words, concepts,
and permanent knowledge such as languages. The mean values for the
cognitive task associated with semantic memory were 39.90 for the Mahler group
but 36.39 and 38.34 for the white noise and no music groups, respectively [E.14].
The higher value for the Mahler group suggests that classical music offers
benefits that white noise and silence do not. Furthermore,
music with a long-term periodicity, whether by Mozart or other classical
composers, resonates within the brain to enhance spatial-temporal performance
and even decrease seizure activity [S.01]. For instance, Greek-American musician
Yanni’s compositions, which are similar to Mozart’s sonatas in tempo, melody,
harmony, and structure, were also effective and reproduced the results
that improved cognitive abilities like reasoning and memory [S.01]. However,
the effects of music may not depend on a specific piece. Even though
evidence demonstrates classical music’s ability to improve cognitive skills
such as memory and learning, music that is personally liked by subjects also
enhances alpha wave and beta wave frequencies in the temporal brain
regions [R.18]. Therefore, non-classical music such as rock, pop, hip-hop,
rhythm and blues, and even jazz could potentially generate the same results,
though this would greatly depend on the preferences of the individual.
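As a small illustration of the group comparison reported from [E.14], the sketch below computes how far the Mahler group’s mean semantic-memory score sits above the white noise and no music groups. The dictionary and function names are hypothetical, the score units are not specified in the text, and a raw difference in means says nothing about statistical significance.

```python
# Mean semantic-memory task scores as quoted in the text from [E.14].
group_means = {"Mahler": 39.90, "white noise": 36.39, "no music": 38.34}


def differences_from(reference: str, means: dict[str, float]) -> dict[str, float]:
    """Return reference mean minus each other group's mean, rounded to 2 places."""
    ref_value = means[reference]
    return {
        name: round(ref_value - value, 2)
        for name, value in means.items()
        if name != reference
    }


gaps = differences_from("Mahler", group_means)
for name, gap in gaps.items():
    print(f"Mahler vs. {name}: +{gap:.2f} points")
```

With the quoted means, the Mahler group scores 3.51 points above the white noise group and 1.56 points above the no music group.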

3.4 The Purpose of Music Therapy


Music Therapy (MT) is an art-based intervention that uses music experiences
in a therapeutic setting to address patients’ physical, cognitive,
emotional, and social needs [M.19a]. Participants and music therapists
interact within a structured framework, which concludes in conversations about
the patient’s emotions and/or experiences [C.17]. MT is known to alleviate a
patient’s symptoms and improve their quality of life, especially in fields of
health care such as neonatology, neurodegeneration, and pediatric
oncology [C.17]. Regarding the brain, MT improves mood, enhances
cognitive functions such as memory, and provides a sense of connection for
patients who may feel alone [L.23]. Moreover, MT is a gradual process, as
results are not observed in just a few days. After a few weeks of prolonged
exposure, though, a significant increase in the brain’s alpha waves indicates
that participants habituate and adapt to this novel form of art-based
therapy [V.18]. As weeks go by, music-listening groups undergoing MT even
experience more joyous and relaxed emotions [V.18].
Even though no cure has been developed for neurodegenerative diseases such as
AD, MT and other music-related non-pharmacological therapies have garnered more
attention as methods to boost cognitive and behavioral functions. For one, MT
induces plastic changes in a few brain networks; this is a process known as
neuroplasticity [L.23]. Neuroplasticity involves adaptive structural and
functional alterations to the brain. This concept is of extreme importance
since it allows the brain to reorganize itself and change its activity in
response to distinct stimuli or after injuries such as a stroke.
To summarize, the impact of music, especially classical music, on the brain is
remarkably pervasive. It engages numerous brain areas, triggers the release of a
greater quantity of neurotransmitters, and has the potential to reshape the brain
itself. Given music’s reported effectiveness in improving cognitive functions
like spatial reasoning and memory, it is imperative to explore whether
classical music, as a non-pharmacological intervention for AD patients, could
ameliorate the neuropsychiatric and cognitive symptoms of AD.

4 Alzheimer’s Disease and the Influence of Classical Music
AD is a formidable and deeply puzzling neurological disease that continues to
challenge medical research and potential therapeutic interventions. As
healthcare professionals and scientists strive to unravel the complexities of
this disorder, unconventional and non-pharmacological avenues for exploration
have emerged. Among these, classical music intervention in AD has captivated
the interest of many. Not only does this convergence prompt questions
regarding curative advantages, but it also raises the profound effect classical
music may have on NPS, cognitive symptoms, and the overall quality of life for
AD patients. This section discusses the multifaceted interrelation between AD
and classical music, seeking to uncover the ways in which
art may offer hope to AD patients.

4.1 Classical Music’s Effect on Neuropsychiatric Mechanisms of Alzheimer’s
To reiterate, NPS are defined as non-cognitive disturbances pertaining to
mental health defects and psychiatric disorders. The NPS detailed
in this paper include apathy, depression, aggression/agitation, and sleep
disorders. This section discusses the effect of classical music, especially
the Mozart effect, on the neuropsychiatric mechanisms of AD.
To begin, it is well known that episodes of aggression, confusion, and agitation
in elderly individuals with AD are significant problems for both the patients and
their caregivers. The soothing melodies and captivating rhythms of classical
music have demonstrated promise in lessening aggressive outbursts of anger and
frustration. In patients with Alzheimer’s Disease and other related dementias
(ADRD), cognitive impairment plays a significant role in triggering aggression.

Figure 12: Gerdner’s Mid-Range Theory of Music Intervention for Agitation
[L.05].

Figure 12 illustrates the concepts underlying Gerdner’s Mid-Range Theory,
which asserts that cognitive decline is key to elevating agitation. Cognitive
impairment, the precursor to aggression, diminishes the ability to perceive and
process sensory information, resulting in a lowered stress threshold and an
elevated anxiety potential [L.05]. Put simply, as AD progresses,
fewer stressors are necessary to reach the stress threshold, resulting in
agitation. To test this theory, one research group evaluated the agitation
levels of AD patients after they listened and actively paid attention to
classical “relaxation” music for almost six weeks. The overall change in
agitated behaviors, words, and actions was recorded using the Modified
Cohen-Mansfield Agitation Inventory. The study concluded that classical music
intervention supports Gerdner’s theory, since a significant reduction in
agitation occurred, and this reduction was far smaller with other music
types [L.05].
Another significant NPS of AD is depression, characterized by a deterioration
in emotional, behavioral, and social functions. When an arts-based
intervention such as classical concert music is applied, a wide range of
previously dormant feelings are induced in AD patients, who become better able
to battle the hopelessness and sadness present in depressive states. For
instance, classical compositions such as Chopin’s Nocturnes are linked to
improved mood, thereby reducing depressive symptoms. With just five sessions of
classical-music therapy, depression and behavioral problems lessened
considerably in people with ADRD [dRLSBGCGA22].
As previously addressed, apathy, the lack of interest or motivation, is another
challenging condition frequently recorded in AD. Classical music possesses the
ability to stimulate emotional responses by releasing certain neurotransmitters
such as dopamine and serotonin, which can rekindle attention, engagement,
and a general interest in daily activities. After a 12-week classical music
therapy intervention targeting apathy, AD patients demonstrated a marked
improvement in not just interest levels but also depression, orientation,
anxiety, and aggression [dRLSBGCGA22].
Finally, sleep disruption is also a common problem among older adults,
especially individuals with AD. Quality sleep and a controlled circadian rhythm
serve important restorative functions and indirectly influence our core body
temperature and even the body’s melatonin and cortisol levels [A.21]. Calming,
tailored classical music improves sleep quality in the elderly population with
ADRD because of reduced stress levels and modulated arousal levels [A.21].
Classical music’s therapeutic effects are attributed to a range of
neurological mechanisms in AD. Music can actively engage many brain regions,
such as the amygdala and the hippocampus of the limbic system, even in the more
advanced stages of this deadly disease. From alleviating depression, apathy,
and agitation to improving sleep patterns, classical music therapy should be
encouraged and conducted by trained professionals in AD senior homes in order
to offer a measure of comfort in the midst of uncertainty.

4.2 The Cognitive Benefits of Classical Music in Alzheimer’s


AD presents a series of challenges to both neuropsychiatric and cognitive
function, gradually stripping away an individual’s memories and reasoning
abilities. Cognitive functions refer to anything related to the processes of
thinking, learning, problem-solving, etc. While no cure for AD exists, ongoing
research sheds light on the therapeutic influence of classical music on the
cognitive symptoms of AD, especially through the Mozart effect.
When we recognize a familiar tune or common melody in public, we are quick
to connect that song to a moment from our past. In cases of
AD, musical memory is usually the last to erode compared to semantic and
episodic memory. For instance, a 92-year-old woman with dementia was able
to recall the three movements of Beethoven’s Moonlight Sonata, which runs
14-16 minutes in length. This is consistent with the finding that brain regions
associated with musical memory are the last to be disrupted in Alzheimer’s.
There exists a functional neuroanatomical foundation for the vulnerability of
all memories in AD, including musical memory. Therefore, musical memory
serves as an informative window into the neural networks harmed in AD. Patients
with AD perform better on tasks involved in recognition memory and semantic
memory when classical music is accompanied by a spoken recording [A.10]. As
previously mentioned, music processing involves a matrix of neural networks,
recruiting from many brain areas such as the basal ganglia, hypothalamus, and
amygdala [A.10]. By stimulating memory circuits in the brain, classical music
underscores the importance of music therapy for AD patients.
Finally, the deterioration of reasoning skills such as spatial reasoning is another
major cognitive symptom of AD. Spatial reasoning is defined as the ability
to understand and manipulate visual information and stimuli, and it plays a
significant role in problem solving and other daily tasks. Even though AD
impairs a patient’s reasoning ability, classical music
can engage various brain areas involved in reasoning and
comprehension. Just after listening to Mozart’s Sonata for Two Pianos for 10
minutes, AD patients demonstrated significantly better spatial reasoning, which
is a common outcome of the Mozart effect [S.01]. Furthermore, several physiological
pathways are activated in response to classical music stimuli, which
modulate body responses and may improve reasoning [M.14]. These results, however,
are not limited to the Mozart effect. The Schubert effect
is a similar, widely discussed phenomenon, which also
results in better performance on spatial tasks after a period of time [M.14].
In the realm of AD, where cognitive decline lowers a patient’s quality of
life, classical music emerges as a promising means of enhancing memory and reasoning
skills. Even though this intervention is not pharmacological in nature, the
symphonies of classical music have been shown to provide solace and relief for both
the neuropsychiatric and cognitive symptoms of AD.

5 Conclusion
Alzheimer’s is one of the great medical mysteries in the healthcare field, with
unanswered questions regarding its etiology, pathology, and diagnosis. Nonetheless,
it is clear that various genetic and environmental risk factors such as age,
PSEN mutations, gender, obesity, T2D, and CVD are partly responsible for
the development and progression of AD. In terms of AD’s pathology, several
distinctive signs such as Aβ plaques, NFTs, elevated ROS levels, and oxidative
stress are extremely prominent.
After a thorough examination of the effects of classical music on the pro-
gression of Alzheimer’s, the implementation of classical music therapy in senior
homes and assisted living care centers is of utmost importance. The observed
effects of classical music intervention include enhanced memory, reduced
NPS, and improved cognitive abilities. Classical music therapy’s ability
to alleviate NPS such as depression and agitation, as well as cognitive symptoms
including learning deficits and reduced spatial reasoning, underscores the value
of music as a non-pharmacological intervention that effectively improves an AD
patient’s quality of life alongside existing treatments.

References
[2A 18] About alzheimer’s disease: Promoting health and indepen-
dence for an aging population. Centers for Disease Control
and Prevention, 2018.

[3A 15] Amyloid plaques and neurofibrillary tangles. Alzheimer’s
Disease Research at BrightFocus Foundation, 2015.

[702 1] Study reveals how apoee4 gene may increase risk for demen-
tia. National Institute on Aging, 2021.

[A.10] Simmons-Stern N. R. Budson A. E. Ally B. A. Music as a
memory enhancer in patients with alzheimer’s disease. Neu-
ropsychologia, 2010.

[A.13] Mergenthaler P. Lindauer U. Dienel G. A. Meisel A. Sugar
for the brain: the role of glucose in physiological and patho-
logical brain function. Trends in neurosciences, 2013.
[A.17] Pizzino G. Irrera N. Cucinotta M. Pallio G. Mannino F. Ar-
coraci V. Squadrito F. Altavilla D. Bitto A. Oxidative stress:
Harms and benefits for human health. Oxidative medicine
and cellular longevity, 2017.

[A.18] Weller J. Budson A. Current understanding of alzheimer’s
disease diagnosis and treatment. F1000Research, 2018.

[A.20] Lloret M. A. Cervera-Ferri A. Nepomuceno M. Monllor P.
Esteve D. Lloret A. Is sleep disruption a cause or con-
sequence of alzheimer’s disease? reviewing its possible role
as a biomarker. International journal of molecular sciences,
2020.

[A.21] Petrovsky D. V. Ramesh P. McPhillips M. V. Hodgson N.
A. Effects of music interventions on sleep in older adults: A
systematic review. Geriatric nursing, 2021.
[C.06] Kumar-Singh S. Theuns J. Van Broeck B. Pirici D. Ven-
nekens K. Corsmit E. Cruts M. Dermaut B. Wang R.
Van Broeckhoven C. Mean age-of-onset of familial alzheimer
disease caused by presenilin mutations correlates with both
increased Aβ42 and decreased Aβ40. Human mutation, 2006.

[C.17] Aalbers S. Fusar-Poli L. Freeman E. R. Spreen M. Ket CF J.
Vink C. A. Maratos A. Crawford M. Chen X. Gold C. Music
therapy for depression. Cochrane Database of Systematic
Reviews, 2017.

[C.18a] Gonzalez C. Armijo E. Bravo-Alegria J. Becerra-Calixto A.
Mays C. E. Soto C. Modeling amyloid beta and tau pathol-
ogy in human cerebral organoids. Mol Psychiatry, 2018.

[C.18b] Hamlett E. D. Ledreux A. Potter H. Chial H. J. Patterson D.
Espinosa J. M. Bettcher B. M. Granholm A. C. Exosomal
biomarkers in down syndrome and alzheimer’s disease. Free
radical biology medicine, 2018.
[C.19] Demir E. A. Tutuk O. Dogan H. Tumer C. Depression in
alzheimer’s disease: The roles of cholinergic and serotonergic
systems. National Library of Medicine, 2019.

[C.21] Burillo J. Marqués P. Jiménez B. Gonzalez Blanco C. Benito
M. Guillen C. Insulin resistance and diabetes mellitus in
alzheimer’s disease. Cells, 2021.

[D.06] Koelsch S. Fritz T. V Cramon D. Y. Maller K. Friederici A.
D. Investigating emotion with music: an fmri study. Human
brain mapping, 2006.
[D.16] Gordon D. Sounding the alarm on a future epidemic:
Alzheimers disease. UCLA Newsroom, 2016.
[dLMJBRD18] Mosconi L. Rahman A. Diaz I. Wu X. Scheyer O. Hristov H.
W. Vallabhajosula S. Isaacson R. S. de Leon M. J. Brinton
R. D. Increased alzheimer’s risk during the menopause tran-
sition: A 3-year longitudinal brain imaging study. PloS one,
2018.
[dRLSBGCGA22] da Rocha L.A. Siqueira B.F. Grella C.E. Gratao A.C.M. Ef-
fects of concert music on cognitive, physiological, and psy-
chological parameters in the elderly with dementia: a quasi-
experimental study. Dementia neuropsychologia, 2022.

[E.08] Gold C. A. Budson A. E. Memory loss in alzheimer’s disease:
implications for development of therapeutics. Expert review
of neurotherapeutics, 2008.

[E.14] Bottiroli S. Rosi A. Russo R. Vecchi T. Cavallini E. The
cognitive effects of listening to background music on older
adults: processing speed improves with upbeat music, while
memory seems to benefit from both upbeat and downbeat
music. Frontiers in aging neuroscience, 2014.

[E.17] Tonnies E. Trushina E. Oxidative stress synaptic
dysfunction and alzheimer’s disease. Journal of Alzheimer’s disease,
2017.

[E.18] Morgan K. N. Derby C. A. Gleason C. E. Cognitive changes
with reproductive aging perimenopause and menopause. Ob-
stetrics and gynecology clinics of North America, 2018.

[E.19] Lott I.T. Head E. Dementia in down syndrome: unique
insights for alzheimer disease research. Nat Rev Neurol, 2019.

[E.22a] Carotenuto A. Fasanaro A. M. Manzo V. Amenta F. Traini
E. Association between the cholinesterase inhibitor donepezil
and the cholinergic precursor choline alphoscerate in the
treatment of depression in patients with alzheimer’s disease.
Journal of Alzheimer’s disease reports, 2022.

[E.22b] Klein S. Gastaldelli A. Yki-Järvinen H. Scherer P. E. Why
does obesity cause diabetes? Cell metabolism, 2022.

[EI.17] Lukiw WJ. Rogaev EI. Genetics of aggression in alzheimer’s
disease (ad). Front Aging Neuroscience, 2017.

[G.06] Starkstein S. E. Jorge R. Mizrahi R. Robinson R. G. A
prospective longitudinal study of apathy in alzheimer’s dis-
ease. Journal of neurology neurosurgery and psychiatry,
2006.
[G.18] Brzecka A. Leszek J. Ashraf G. M. Ejma M. Avila-Rodriguez
M. F. Yarla N. S. Tarasov V. V. Chubarev V. N. Samsonova
A. N. Barreto G. E. Aliev G. Sleep disorders associated with
alzheimer’s disease a perspective. Frontiers in neuroscience,
2018.
[G.21] Leszek J. Mikhaylenko E. V. Belousov D. M. Koutsouraki E.
Szczechowiak K. Kobusiak-Prokopowicz M. Mysiak A. Diniz
B. S. Somasundaram S. G. Kirkland C. E. Aliev G. The
links between cardiovascular diseases and alzheimer’s disease.
Current neuropharmacology, 2021.
[G.22] Martens Y. A. Zhao N. Liu C. C. Kanekiyo T. Yang A. J.
Goate A. M. Holtzman D. M. Bu G. Apoe cascade hypoth-
esis in the pathogenesis of alzheimer’s disease and related
dementias. Neuron, 2022.

[H.13] Jahn H. Memory loss in alzheimer’s disease. Dialogues in
clinical neuroscience, 2013.
[J.01] Blood A. J. Zatorre R. J. Intensely pleasurable responses to
music correlate with activity in brain regions implicated in
reward and emotion. Proceedings of the National Academy
of Sciences of the United States of America, 2001.

[J.03] Terry A. V. Buccafusco J. J. The cholinergic hypothesis of
age and alzheimer’s disease-related cognitive deficits: Recent
challenges and their implications for novel drug development.
The Journal of Pharmacology and Experimental Therapeu-
tics, 2003.
[J.14] Li X. De Beuckelaer A. Guo J. Ma F. Xu M. Liu J. The
gray matter volume of the amygdala is correlated with the
perception of melodic intervals: a voxel-based morphometry
study. PloS one, 2014.
[J.15] Guerreiro R. Bras J. The age factor in alzheimer’s disease.
Genome medicine, 2015.
[J.17] Gomez Gallego M. Gomez Garcia J. Music therapy
and alzheimer’s disease: Cognitive, psychological, and be-
havioural effects. Neurology, 2017.
[K.15] Klimova B. Maresova P. Valis M. Hort J. Kuca K.
Alzheimer’s disease and language impairments: social inter-
vention and medical treatment. Clinical interventions in ag-
ing, 2015.
[KH12] Brieger K. Schiavone S. Miller Jr. J.F. Krause KH. Reac-
tive oxygen species: from health to disease. Swiss Medical
Weekly, 2012.
[L.05] Gerdner L. Effects of individualized versus classical relax-
ation music on the frequency of agitation in elderly persons
with alzheimer’s disease and related disorders. Cambridge
University Press, 2005.
[L.18] Scheyer O. Rahman A. Hristov H. Berkowitz C. Isaacson R.
S. Diaz Brinton R. Mosconi L. Female sex and alzheimer’s
risk: The menopause connection. The journal of prevention
of Alzheimer’s disease, 2018.
[L.23] Bleibel M. El Cheikh A. Sadier N. S. Abou-Abbas L. The ef-
fect of music therapy on cognitive functions in patients with
alzheimer’s disease: a systematic review of randomized con-
trolled trials. Alzheimer’s Research and Therapy, 2023.
[M.14] Pauwels E. K. Volterrani D. Mariani G. Kostkiewics M.
Mozart music and medicine. Medical principles and prac-
tice : international journal of the Kuwait University Health
Science Centre, 2014.
[M.16a] Ferreira-Vieira T. H. Guimaraes I. M. Silva F. R. Ribeiro F.
M. Alzheimer’s disease: Targeting the cholinergic system.
Current neuropharmacology, 2016.

[M.16b] Lubitz I. Ricny J. Atrakchi-Baranes D. Shemesh C. Kravitz
E. Liraz-Zaltsman S. Maksin-Matveev A. Cooper I. Lei-
bowitz A. Uribarri J. Schmeidler J. Cai W. Kristofikova
Z. Ripova D. LeRoith D. Schnaider-Beeri M. High di-
etary advanced glycation end products are associated with
poorer spatial learning and accelerated Aβ deposition in an
alzheimer mouse model. Aging cell, 2016.

[M.18] Nobis L. Husain M. Apathy in alzheimer’s disease. Current
opinion in behavioral sciences, 2018.

[M.19a] Stegemann T. Geretsegger M. Phan Quoc E. Riedl H.
Smetana M. Music therapy and other music-based inter-
ventions in pediatric health care: An overview. Medicines
Basel Switzerland, 2019.

[M.19b] Tiwari S. Atluri V. Kaushik A. Yndart A. Nair M.
Alzheimer’s disease: pathogenesis, diagnostics, and thera-
peutics. Dovepress, 2019.

[M.20] Gomez W. Morales R. Maracaja-Coutinho V. Parra V.
Nassif M. Down syndrome and alzheimer’s disease: common
molecular traits beyond the amyloid precursor protein. Ag-
ing, 2020.
[M.21a] Fortea J. Zaman S. H. Hartley S. Rafii M. S. Head E.
Carmona-Iragui M. Alzheimer’s disease associated with
down syndrome: a genetic form of dementia. The Lancet.
Neurology, 2021.

[M.21b] Soufineyestani M. Khan A. Sufineyestani M. Impacts of
music intervention on dementia: A review using meta-narrative
method and agenda for future research. Neurology interna-
tional, 2021.
[M.22] Eickhoff S. Franzen L. Korda A. Rogg H. Trulley V. N. Borg-
wardt S. Avram M. The basal forebrain cholinergic nuclei
and their relevance to schizophrenia and other psychotic dis-
orders. Frontiers in psychiatry, 2022.

[ML14] Barbagallo M. and Dominquez J. L. Type 2 diabetes mellitus
and alzheimer’s disease. Baishideng Publishing Group, 2014.

[MM21] Geula C. Dunlop S. R. Kawles A. S. Flanagan M. E. Gefen
T. Mesulam M.-M. Basal forebrain cholinergic system in the
dementias: Vulnerability, resilience, and resistance. Journal
of Neurochemistry, 2021.

[O.14] Gouras G. K. Olsson T. T. Hansson O. β-amyloid peptides
and amyloid plaques in alzheimer’s disease. Neurotherapeu-
tics, 2014.

[O.18] Gulisano W. Maugeri D. Baltrons M. A. Fà M. Amato A.
Palmeri A. D’Adamio L. Grassi C. Devanand D. P. Honig L.
S. Puzzo D. Arancio O. Role of amyloid-β and tau proteins
in alzheimer’s disease: Confuting the amyloid cascade. 2018.
[P.19] Fernandez N. B. Trost W. J. Vuilleumier P. Brain networks
mediating the influence of background music on selective at-
tention. Social cognitive and affective neuroscience, 2019.

[R.97] Blacker D. Haines J. L. Rodes L. Terwedow H. Go R. C.
Harrell L. E. Perry R. T. Bassett S. S. Chase G. Meyers D.
Albert M. S. Tanzi R. Apoe-e4 and age at onset of alzheimer’s
disease. American Academy of Neurology Journal, 1997.
[R.12] Kopan R. Notch signaling. Cold Spring Harbor perspectives
in biology, 2012.
[R.18] Kučikienė D. Praninskienė R. The impact of music
on the bioelectrical oscillations of the brain. Acta medica
Lituanica, 2018.

[R.20] Breijyeh Z. Karaman R. Comprehensive review on
alzheimer’s disease: Causes and treatment. Molecules, 2020.

[R.22] Mangalmurti A. Lukens J. R. How neurons die in alzheimer’s
disease: Implications for neuroinflammation. Current opin-
ion in neurobiology, 2022.

[rSJ17] Kelleher R. J. 3rd Shen J. Presenilin-1 mutations and
alzheimer’s disease. Proceedings of the National Academy
of Sciences of the United States of America, 2017.

[S.01] Jenkins J. S. The mozart effect. Journal of the Royal Society
of Medicine, 2001.

[S.11] Lyketsos C. G. Carrillo M. C. Ryan J. M. Khachaturian
A. S. Trzepacz P. Amatniek J. Cedarbaum J. Brashear R.
Miller D. S. Neuropsychiatric symptoms in alzheimer’s dis-
ease. Alzheimer’s dementia : the journal of the Alzheimer’s
Association, 2011.
[S.12] De Strooper B. Iwatsubo T. Wolfe M. S. Presenilins and
γ-secretase: structure, function, and role in alzheimer disease.
Cold Spring Harbor perspectives in medicine, 2012.

[S.16] Love S. Miners J. S. Cerebrovascular disease in aging and
alzheimer’s disease. Acta neuropathologica, 2016.

[S.18] Hampel H. Mesulam M. M. Cuello A. C. Farlow M. R.
Giacobini E. Grossberg G. T. Khachaturian A. S. Vergallo A.
Cavedo E. Snyder P. J. Khachaturian Z. S. The cholinergic
system in the pathophysiology and treatment of alzheimer’s
disease. Brain : a journal of neurology, 2018.
[S.19a] Dulawa S. C. Janowsky D. S. Cholinergic regulation of
mood: from basic and clinical studies to emerging therapeu-
tics. Molecular psychiatry, 2019.

[S.19b] Yu R. Topiwala A. Jacoby R. Fazel S. Aggressive behaviors
in alzheimer disease and mild cognitive impairment: Sys-
tematic review and meta-analysis. The American journal of
geriatric psychiatry : official journal of the American Asso-
ciation for Geriatric Psychiatry, 2019.

[T.17] Sarkamo T. Cognitive, emotional, and neural benefits of
musical leisure activities in aging and neurological rehabili-
tation: A critical review. Science Direct, 2017.

[T.21] Serrano-Pozo A. Das S. Hyman B. T. Apoe and alzheimer’s
disease: advances in genetics pathophysiology and therapeu-
tic approaches. The Lancet. Neurology, 2021.

[V.18] Nawaz R. Nisar H. Voon Y. V. The effect of music on
human brain; frequency domain and time series analysis using
electroencephalogram. IEEE Xplore, 2018.

[X.19] Ding Y. Zhang Y. Zhou W. Lin Z. Huang J. Hong B. Wang
X. Neural correlates of music listening and recall in the hu-
man brain. The Journal of neuroscience : the official journal
of the Society for Neuroscience, 2019.
[Z.13] Gu L. Guo Z. Alzheimer’s Aβ42 and Aβ40 peptides form
interlaced amyloid fibrils. Journal of neurochemistry, 2013.

Comparing the Effectiveness of Support Vector
Classifier and Stochastic Gradient Descent in
Hate-Speech Detection

Dania Noman Ali
October 17, 2023

Abstract
The increased use of social media, with easy access for most people in
the world, has given rise to a multitude of problems, with cyberbullying
and online hate-speech standing out as significant issues. The ability of
users to maintain their anonymity and post things that would
be considered uncivil in a one-to-one real-life conversation has led to the
widespread dissemination of online hate-speech, posing significant societal
challenges and detrimental effects on individuals’ mental health. In
this paper, we explore two simple classifiers, Support Vector Classifier
(SVC) and Stochastic Gradient Descent (SGD), which are compared and
analysed through their accuracy scores to determine their effectiveness in
detecting hate-speech within the context of Twitter data. To train the
models, we use a publicly available dataset by Analytics Vidhya, which can be
found on Kaggle.com and contains 32k tweets, each labelled with a ‘1’
if it is sexist/racist or a ‘0’ if it is not. The goal of this paper is to identify the
differences in performance in hate-speech detection between the two classifiers.

1 Introduction
Cyberbullying, predominantly in the form of hate-speech, is a widespread
phenomenon, especially in the context of Twitter. The ability to share an
opinion with billions of people all around the globe, with the option to stay
completely anonymous, has made social media a safe haven for the propagation
of hate-speech, including remarks that are sexist or racist, derogatory towards
certain ethnicities, targeted at religious minorities, and/or defamatory towards
individuals on the basis of certain characteristics [1].
∗ Advised by: Maria Konte of Georgia Institute of Technology

Now rebranded as ’X’, the social media platform has accrued
a substantial user base of approximately 450 million active users.
Official reports from Twitter indicate that these users collectively post
an average of approximately 500 million tweets per day. Each user,
on average, spends approximately 30.9 minutes per day on the
platform.
Notably, the scale of content generation on this platform is significant, with
each user able to publish up to 2400 posts per day. It is worth emphasizing
that this large volume of user-generated content is disseminated
with minimal to no pre-posting filtration or content moderation,
rendering the platform susceptible to the proliferation of hate speech
and other forms of harmful content. Social media companies invest millions of
dollars in dealing with such issues, mostly through manual moderation
and the deletion of posts/tweets that are deemed ‘offensive’, including hateful
references, incitement, slurs and tropes, dehumanization, and hateful imagery [2]. A study
published in the European Sociological Review investigates the impact of perceived
social acceptability on online hate speech and suggests that interventions based
on descriptive norms, such as moderate censorship, can significantly reduce
hate speech and guide future interventions in online communities to prevent the
spread of hate [3]. Therefore, the goal of this paper is to investigate and
compare the effectiveness of hate-speech detection by two classifiers, namely
Support Vector Classifier and Stochastic Gradient Descent.

2 Text Classification
Text Classification is an important task in Natural Language Processing
(NLP) whose primary objective is the automated allocation of text into
predetermined categories. Examples of tasks that can be achieved by text
classification are:

• Sentiment Analysis
• Classifying Emails as Spam or Non-spam

• Automation of answering queries of customers

• Categorizing news articles according to their topics

Text classification falls under the category of supervised learning, which means it
relies on a dataset where each document is labelled with its respective category.
This labelled data is used to train a classifier, which can then categorize new
text documents accordingly. This process forms the basis for many text-related
tasks and applications. In this paper, we examine two
classifiers, Support Vector Classifier (SVC) and Stochastic Gradient
Descent (SGD). The aim of this paper is to find out which classifier performs
better in hate-speech detection.

Figure 1: SGD performs frequent updates with high variance, resulting in
substantial fluctuations in the objective function.

2.1 Stochastic Gradient Descent


Stochastic Gradient Descent (SGD) is a variation of the Gradient Descent
algorithm employed in machine learning optimization. It specifically targets the
inefficiencies that arise when dealing with extensive datasets in machine
learning projects. Rather than using the entire dataset in each
iteration, SGD selects a single random training example or a small batch to
calculate the gradient and update the model parameters. This random selection
introduces an element of randomness into the optimization process, hence the
term “stochastic” in its name. The primary advantage of using SGD lies in its
computational efficiency, especially when working with large datasets. It
substantially reduces the computational cost per iteration compared to traditional
Gradient Descent methods that require processing the entire dataset. Moreover,
SGD excels in online learning scenarios, where data streams continuously,
enabling real-time model adaptation with incremental updates. Here are the key
steps involved in the SGD process:

1. Initialization: The model’s parameters are randomly initialized.
2. Setting Parameters: The number of iterations and the learning rate (alpha)
for parameter updates are chosen.
3. Stochastic Gradient Descent Loop: The following steps are repeated until
the model converges or reaches the maximum number of iterations:
   a. Shuffle the training dataset to introduce randomness.
   b. Iterate over each training example (or a small batch) in the shuffled order.
   c. Compute the gradient of the cost function with respect to the model
   parameters using the current training example (or batch).
   d. Update the model parameters by taking a step in the direction opposite to
   the gradient, scaled by the learning rate.
   e. Assess convergence criteria, such as the difference in the cost function
   between iterations.
4. Return Optimized Parameters: Once the convergence criteria are met or the
maximum number of iterations is reached, the optimized model parameters are returned.

SGD enhances efficiency by updating the parameters using one training example
(or one small batch) at a time, making each update considerably cheaper and
making the method suitable for online learning applications [4,5,6,7].
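
The steps listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration only, assuming a logistic-regression cost function and a tiny made-up two-feature dataset; the names `sgd_logistic`, `X_toy` and `y_toy` are hypothetical and are not taken from the paper's code.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_proba(w, b, x):
    """Probability of class 1 for a single feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def mean_log_loss(X, y, w, b, eps=1e-12):
    """Average logistic loss; used here only for the convergence check."""
    total = 0.0
    for x, target in zip(X, y):
        p = min(max(predict_proba(w, b, x), eps), 1.0 - eps)
        total -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return total / len(X)


def sgd_logistic(X, y, learning_rate=0.5, max_epochs=300, tol=1e-6, seed=0):
    """Per-example SGD; step 2 (iterations, learning rate) is the arguments."""
    rng = random.Random(seed)
    # Step 1: random initialisation of the parameters.
    w = [rng.uniform(-0.01, 0.01) for _ in X[0]]
    b = 0.0
    order = list(range(len(X)))
    previous_loss = float("inf")
    # Step 3: repeat until convergence or the epoch limit.
    for _ in range(max_epochs):
        rng.shuffle(order)                              # 3a: shuffle
        for i in order:                                 # 3b: one example at a time
            error = predict_proba(w, b, X[i]) - y[i]    # 3c: gradient w.r.t. the score
            w = [wj - learning_rate * error * xj        # 3d: step against the gradient
                 for wj, xj in zip(w, X[i])]
            b -= learning_rate * error
        loss = mean_log_loss(X, y, w, b)
        if abs(previous_loss - loss) < tol:             # 3e: convergence check
            break
        previous_loss = loss
    return w, b                                         # Step 4


# Made-up, linearly separable toy data (not from the tweet dataset).
X_toy = [[0.0, 0.2], [0.3, 0.1], [0.2, 0.4], [1.0, 0.9], [0.8, 1.1], [1.2, 0.7]]
y_toy = [0, 0, 0, 1, 1, 1]
w_fit, b_fit = sgd_logistic(X_toy, y_toy)
predictions = [1 if predict_proba(w_fit, b_fit, x) >= 0.5 else 0 for x in X_toy]
```

Each inner-loop update touches a single example, which is what keeps the cost per update independent of the dataset size.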

Figure 2: How an SVC Works - Source:
©SHUTTERSTOCK.COM/SIDHARTHA CARVALHO

label   tweet
1       would you please ask these shameless @user @user give
0       goodnight my friends... stay blessed and highly favored!!! thursday fitfam
1       black women demonic porn
1       vandals turned a jewish family’s menorah into a swastika” antisemitism hate
0       i’m pretty sure that warm weather and sun is my meditation sunshine meditate quietà

Table 1: Examples of tweets from the dataset and their corresponding labels.
A label of ‘1’ means the tweet contains hate-speech (defined as
sexist/racist remarks for the simplicity of this dataset), and a label of ‘0’ means it
does not.

2.2 Support Vector Classifier


The Support Vector Classifier (SVC) is a special case of the much broader Support
Vector Machine that is primarily focused on classification tasks. SVC is a
type of supervised learning whose main goal is to find a hyperplane that best
separates two or more classes (binary or multiclass classification). SVCs
work by maximizing the margin between the decision boundary (hyperplane)
and the nearest data points from each class. These nearest data points are
called support vectors. SVCs can handle both linear and nonlinear classification
problems, depending on the choice of kernel function.
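
To make the notions of margin and support vectors concrete, the following sketch measures the distance of a few made-up 2-D points to a fixed, hand-picked hyperplane. It does not train an SVC; the points, labels, and the hyperplane x1 + x2 = 5 are illustrative assumptions only.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def signed_distances(w, b, points, labels):
    """Distance of each point to the hyperplane, positive when on the correct side."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [label * decision_value(w, b, x) / norm
            for x, label in zip(points, labels)]


# Made-up points; labels use the -1/+1 convention common for SVMs.
points = [(1.0, 1.0), (0.0, 0.0), (4.0, 4.0), (6.0, 5.0)]
labels = [-1, -1, 1, 1]
w, b = (1.0, 1.0), -5.0          # hand-picked hyperplane x1 + x2 = 5

distances = signed_distances(w, b, points, labels)
margin = min(distances)          # distance from the boundary to the nearest point
# The points that attain the minimum distance play the role of support vectors.
support_vectors = [p for p, d in zip(points, distances) if abs(d - margin) < 1e-9]
```

A trained SVC searches over w and b for the hyperplane that makes this minimum distance as large as possible.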

3 Preparation of Dataset
The dataset used for this paper is from Kaggle.com, provided by Analytics
Vidhya, and is named Twitter Sentiment Analysis. The dataset contains around 32,000
tweets, where label ’1’ denotes that the tweet is racist/sexist and label ’0’ denotes
that it is not. The usernames in this dataset are replaced by
@user to avoid copyright issues. This dataset was chosen for its large
number of tweets, which allows a better test of the two classifiers [8].

Figure 3: Shows the cleaned tweets after preprocessing. This is essential for
preparing Twitter text data for subsequent analysis or modeling by eliminating
noise and unwanted characters from the tweets.

3.1 Support Vector Classifier


3.1.1 Preprocessing the Data
For the SVC model, the data was cleaned with the ‘tweet-preprocessor’ library.
The code begins by installing the tweet-preprocessor library and importing the
required libraries, including re for regular expressions and preprocessor (aliased as p) for
tweet preprocessing. The regular expressions REPLACE_NO_SPACE and
REPLACE_WITH_SPACE are defined to facilitate text cleaning. The custom
function clean_tweets is the core of the preprocessing pipeline. Given a DataFrame
as input, it iterates through each tweet, utilizing tweet-preprocessor to remove
Twitter-specific elements such as URLs and mentions.

3.1.2 Splitting The Data


The next step is to split the data into a training set and an evaluation set using
the ‘train_test_split’ function. The training set, ‘x_train’, contains
the cleaned tweet values, and the corresponding target labels are held in
‘y_train’. The testing set, x_test, contains tweet values for evaluation, and
y_test has the respective target labels. The stratify=y parameter ensures a
similar class distribution between the original dataset and the splits, which is useful for
classification tasks. A random seed, random_state=1, ensures reproducibility,
and test_size=0.3 allocates 30% of the data to the testing set.
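
The effect of a stratified 70/30 split can be illustrated with a small hand-rolled function. This is only a standard-library stand-in for scikit-learn's train_test_split, not the library's implementation; the function name `stratified_split` and the synthetic tweets are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(texts, labels, test_size=0.3, seed=1):
    """Split so that each label keeps roughly the same share in both subsets."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    indices_by_label = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_label[label].append(index)

    train_idx, test_idx = [], []
    for indices in indices_by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)   # per-class test quota
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    x_train = [texts[i] for i in train_idx]
    x_test = [texts[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Synthetic, imbalanced data: 10 tweets labelled 1 and 90 labelled 0.
tweets = [f"tweet {i}" for i in range(100)]
tweet_labels = [1 if i < 10 else 0 for i in range(100)]
x_train, x_test, y_train, y_test = stratified_split(tweets, tweet_labels)
```

With stratification, the rare label appears in the test set in the same 1-in-10 proportion as in the full data, which a purely random split does not guarantee.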

3.1.3 Text Vectorization


Text vectorization is a process through which numerical values are assigned to
text. Although there are multiple techniques for this, we
use CountVectorizer() from the scikit-learn library, which operates on a
list of text documents stored in the documents variable. CountVectorizer()
records the frequency with which each word occurs in each document and
builds a matrix from these counts. For example, if the data was [“Twitter is fun”, “I like posting
tweets on twitter”, “Some people on twitter post tweets that are mean!”], the
matrix would be constructed as follows:

Figure 4: Example of how CountVectorizer() from sklearn works
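
The count matrix for the three example sentences can also be reproduced by hand. The sketch below is a standard-library approximation of what CountVectorizer computes with its default settings (lowercasing, tokens of at least two word characters, alphabetically sorted vocabulary); the helper name `count_matrix` is hypothetical.

```python
import re


def count_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    # Lowercase and keep tokens of two or more word characters.
    tokenised = [re.findall(r"\b\w\w+\b", doc.lower()) for doc in documents]
    vocabulary = sorted({token for tokens in tokenised for token in tokens})
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = []
    for tokens in tokenised:
        row = [0] * len(vocabulary)
        for token in tokens:
            row[column[token]] += 1
        matrix.append(row)
    return vocabulary, matrix


documents = [
    "Twitter is fun",
    "I like posting tweets on twitter",
    "Some people on twitter post tweets that are mean!",
]
vocabulary, matrix = count_matrix(documents)
for row in matrix:
    print(row)
```

Each row corresponds to one sentence and each column to one vocabulary term; the single-letter word "I" is dropped by the two-character minimum.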

3.2 Stochastic Gradient Descent


3.2.1 Preprocessing the Data
A Python function is designed to perform the preprocessing for the SGD
Classifier. The function, named ’clean_text’, accepts two parameters: a DataFrame
(df) and the name of the text field within the DataFrame (text_field).

1. Text Lowercasing: Initially, the code employs the str.lower() method to
convert all text within the specified ’text_field’ to lowercase. This step
ensures uniformity in letter casing, mitigating potential discrepancies in
the text data.
2. Text Cleaning and Tokenization: The subsequent operation is performed
using a lambda function and the re.sub function from the re library. This
operation serves the following purposes:

• Removal of Twitter usernames or mentions (e.g., “@username”).
• Elimination of non-alphanumeric characters and special symbols from
the text.
• Exclusion of hyperlinks (e.g., “https://www.example.com”) from the
text.
• Removal of the “RT” (retweet) tag at the beginning of a tweet.
• Removal of any remaining URLs.

3. Returning the Processed DataFrame: Finally, the function returns the
DataFrame df with the specified ’text_field’ transformed by the cleaning
and tokenization operations.
This preprocessing function is critical for the application of the SGD model,
as it ensures that the text data is appropriately formatted and devoid of noise,
enabling the subsequent model to detect hate speech in a consistent
and reliable manner.
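
A cleaning step of this kind can be sketched with the re module alone. The version below is an assumption-laden illustration: it works on a plain list of strings rather than a DataFrame, the exact regular expression is not the one from the paper, and the final whitespace normalisation is an addition made here for readability.

```python
import re

# One alternation, tried left to right at each position, so URLs and mentions
# are removed whole before the generic "non-alphanumeric" rule can split them.
_NOISE = re.compile(
    "|".join([
        r"@[a-z0-9_]+",      # mentions such as @user
        r"\w+://\S+",        # hyperlinks with a scheme, e.g. https://...
        r"^rt\b",            # retweet tag at the start of the tweet
        r"[^0-9a-z \t]",     # any remaining non-alphanumeric character
    ])
)


def clean_text(tweets):
    """Lowercase each tweet, strip Twitter-specific noise, collapse whitespace."""
    cleaned = []
    for tweet in tweets:
        text = _NOISE.sub("", tweet.lower())
        cleaned.append(" ".join(text.split()))   # whitespace clean-up (added here)
    return cleaned


examples = [
    "RT @user Check this out!! https://www.example.com #Fun",
    "@user you are AMAZING :)",
]
cleaned_examples = clean_text(examples)
```

Lowercasing first lets the pattern use lowercase character classes only.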

3.2.2 Mitigating Class Imbalance through Upsampling


The dataset had a majority of label ‘0’ tweets, so a class imbalance was
encountered. To address this class imbalance, upsampling was employed. The
training dataset is first segregated into a majority class (label ‘0’,
non-hate-speech tweets) and a minority class (label ‘1’, hate-speech tweets).
The minority class is then resampled with replacement so that the minority and
majority classes have equal sizes. The upsampled minority class is then
combined with the original majority class, creating
a balanced training set. This rebalancing process prevents model bias towards
the majority class and enhances the model’s ability to detect hate-speech.

Figure 5: Code that employs upsampling to mitigate class imbalance

Figure 6: How TF-IDF works
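
The rebalancing idea, oversampling the minority class with replacement until it matches the majority class, can be sketched as follows. This is a standard-library illustration, not the code shown in Figure 5; the function name `upsample_minority` and the toy tweets are hypothetical.

```python
import random


def upsample_minority(texts, labels, minority_label=1, seed=0):
    """Draw minority examples with replacement until both classes are the same size."""
    rng = random.Random(seed)
    pairs = list(zip(texts, labels))
    majority = [pair for pair in pairs if pair[1] != minority_label]
    minority = [pair for pair in pairs if pair[1] == minority_label]
    # Sampling with replacement: the same minority tweet may be drawn many times.
    upsampled = rng.choices(minority, k=len(majority))
    balanced = majority + upsampled
    rng.shuffle(balanced)
    return [text for text, _ in balanced], [label for _, label in balanced]


# Toy training data: 8 neutral tweets and 2 hateful ones (made up).
train_texts = [f"neutral tweet {i}" for i in range(8)] + ["hateful tweet a", "hateful tweet b"]
train_labels = [0] * 8 + [1] * 2
balanced_texts, balanced_labels = upsample_minority(train_texts, train_labels)
```

Because the added rows are duplicates, this kind of resampling is usually applied to the training portion only, so that copies of the same tweet do not end up on both sides of a train/test split.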

3.2.3 Text Classification


The pipeline consists of three fundamental stages. First, it employs the
CountVectorizer to convert raw text data into numerical features, capturing the term
frequencies of words within the documents. Following this, the TfidfTransformer
is applied to transform these features into TF-IDF representations, factoring
in the significance of terms across documents. Finally, the pipeline
incorporates the SGDClassifier, a Stochastic Gradient Descent-based linear classifier,
to make predictions. This pipeline integrates
text vectorization and classification, combining the TF-IDF methodology
for informative feature extraction with the SGD classifier’s efficiency, making it
well-suited for large-scale text classification tasks, particularly in the context of
hate-speech detection.
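
The TF-IDF stage can be illustrated by hand on a tiny count matrix. The sketch below assumes the smoothed weighting idf(t) = ln((1 + n) / (1 + df(t))) + 1 followed by L2 normalisation of each row, which is the variant scikit-learn documents as the default for TfidfTransformer; the function name `tfidf_rows` and the counts are made up for the example.

```python
import math


def tfidf_rows(counts):
    """Turn a document-term count matrix into L2-normalised TF-IDF rows."""
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: in how many documents each term occurs.
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    # Smoothed inverse document frequency.
    idf = [math.log((1 + n_docs) / (1 + df)) + 1.0 for df in doc_freq]
    weighted = []
    for row in counts:
        scores = [tf * idf_j for tf, idf_j in zip(row, idf)]
        norm = math.sqrt(sum(s * s for s in scores)) or 1.0   # avoid dividing by zero
        weighted.append([s / norm for s in scores])
    return idf, weighted


# Two toy documents over three terms; term 0 occurs in both documents.
counts = [[1, 1, 0],
          [1, 0, 2]]
idf, weighted = tfidf_rows(counts)
```

A term that appears in every document receives the smallest IDF, so in the first row the shared term ends up with a lower weight than the term unique to that document, even though both occur once.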

3.2.4 Splitting the Data


Utilizing the train-test-split function from the sklearn.model-selection module,
the dataset is divided into two distinct subsets: the training set and the testing
set. X-train and y-train capture the feature matrix and corresponding labels of
the training data, essential for training the SGD classifier. Alternatively, X-test

Figure 7: Text Classification Pipeline for Hate-Speech Detection

270
Classifier                     Accuracy Score (%)   F1 Score (%)
Support Vector Classifier      94.8378              56.5408
Stochastic Gradient Descent    96.9179              96.9471

Table 2: Metric Evaluation of the Classifiers

and y_test hold the feature matrix and labels of the testing data, reserved
for assessing model performance. This separation ensures that the classifier
is evaluated on unseen data. Furthermore, by specifying random_state=0, we make
the split reproducible, so that results are consistent across multiple runs.
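
The split can be sketched as follows. The toy tweets are invented, and because the text does not state the test-set proportion, the sketch relies on the scikit-learn default (25% of the data held out), which is an assumption.

```python
from sklearn.model_selection import train_test_split

# Made-up example data: eight short texts with binary labels.
texts = [
    "good morning all", "what a nice day", "love this song", "great game today",
    "you people are awful", "i despise that group", "go away nobody wants you", "they are all trash",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# random_state=0 fixes the shuffle, so the same split is produced on every run.
X_train, X_test, y_train, y_test = train_test_split(texts, labels, random_state=0)

# Repeating the call with the same seed reproduces the split exactly.
X_train_again, X_test_again, _, _ = train_test_split(texts, labels, random_state=0)
```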

3.2.5 Model Training and Prediction


First, pipeline_sgd, which encapsulates the text classification pipeline
including the vectorization and classifier components, is fitted to the training
data (X_train and y_train). This operation trains the SGD classifier on the
prepared training dataset, enabling it to learn patterns and associations within
the text data. Following model training, predictions are generated for the
testing data (X_test) using the trained model. The predict method is applied,
producing y_predict, which comprises the model’s predicted hate-speech labels
for the testing dataset. This step represents the evaluation phase, where the
model’s effectiveness in hate-speech detection is quantified by comparing its
predictions (y_predict) to the actual labels (y_test) from the testing dataset.
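
The training and prediction steps can be sketched end to end as below. The variable names follow the text, but the toy tweets, the pipeline step names, and the default hyperparameters are assumptions, so the scores this sketch produces bear no relation to those in Table 2.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Made-up example data: six non-hate (0) and six hate (1) texts.
texts = [
    "good morning all", "what a nice day", "love this song",
    "great game today", "thanks for the help", "see you tomorrow",
    "you people are awful", "i despise that group", "go away nobody wants you",
    "they are all trash", "awful people like you", "nobody wants that group here",
]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

pipeline_sgd = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", SGDClassifier(random_state=0)),
])

X_train, X_test, y_train, y_test = train_test_split(texts, labels, random_state=0)

# Fit the vectorizer, the TF-IDF weights, and the classifier on the training data only.
pipeline_sgd.fit(X_train, y_train)

# Predict labels for the held-out texts and compare them with the true labels.
y_predict = pipeline_sgd.predict(X_test)
accuracy = accuracy_score(y_test, y_predict)
f1 = f1_score(y_test, y_predict, zero_division=0)
```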

4 Results
Scores of the Classifiers
The results indicate that the Stochastic Gradient Descent (SGD) Classifier
outperformed the Support Vector Classifier in terms of both Accuracy and F1
Scores. Specifically, the SGD achieved an accuracy score of 96.9179, which is
notably higher than the Support Vector Classifier’s accuracy score of 94.8378.
Similarly, in terms of the F1 score, the SGD Classifier achieved a significantly
higher score of 96.9471, while the Support Vector Classifier scored 56.5408. This
improvement in performance may be attributed to several factors.
One crucial difference lies in the pre-processing of the text data. The SGD
pipeline used a custom text cleaning function, ‘clean_text’, which performed
operations such as converting text to lowercase and removing mentions, URLs,
and other non-alphanumeric characters. In contrast, the Support Vector
Classifier relied on the ‘tweet-preprocessor’ library for pre-processing, which
may not have been as extensive. Removing mentions and URLs in the SGD pipeline
contributed to cleaner text data.
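
A cleaning function of the kind described above might look like the following sketch. The regular expressions are assumptions chosen to match the listed operations (lowercasing, then removing URLs, mentions, and non-alphanumeric characters); the paper's actual implementation is not shown in this section.

```python
import re


def clean_text(text):
    """Hypothetical re-creation of the cleaning steps named in the text."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)                   # remove @mentions
    text = re.sub(r"[^a-z0-9\s]", " ", text)            # remove other special characters
    return re.sub(r"\s+", " ", text).strip()            # collapse repeated whitespace


cleaned = clean_text("@user Check THIS out!! https://t.co/abc #wow")
# cleaned == "check this out wow"
```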
Additionally, the SGD Classifier addressed the imbalance in the dataset
by resampling, which involved oversampling the minority class. This step is
crucial for dealing with imbalanced datasets and can significantly impact model
performance. In contrast, the Support Vector Classifier did not incorporate such
measures, which might have contributed to the differences in model performance.
Another distinguishing factor is the choice of text vectorization. The Support
Vector Classifier employed CountVectorizer() with binary representation, while
the Stochastic Gradient Descent model used TF-IDF features (TfidfVectorizer()
is equivalent to CountVectorizer() followed by TfidfTransformer(), the two steps
described in Section 3.2.3). The choice of vectorizer influences the
representation of text features, with TF-IDF potentially capturing term
importance more effectively than binary counts.
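
The relationship between these vectorizers can be checked with a short sketch. The three documents are invented and all vectorizers use default settings apart from binary=True, which is an assumption about how the binary representation was configured.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

docs = [
    "have a wonderful day",
    "what a wonderful wonderful game",
    "this is a hateful remark",
]

# Two-step route: raw term counts, then TF-IDF weighting.
counts = CountVectorizer().fit_transform(docs)
two_step = TfidfTransformer().fit_transform(counts)

# One-step route: TfidfVectorizer performs both steps internally.
one_step = TfidfVectorizer().fit_transform(docs)
same_features = np.allclose(two_step.toarray(), one_step.toarray())

# Binary representation: each entry records only whether a term occurs,
# so the repeated word in the second document is no longer distinguishable.
binary_counts = CountVectorizer(binary=True).fit_transform(docs)
max_count = int(counts.max())
max_binary_value = int(binary_counts.max())
```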
Furthermore, the two models were evaluated with different primary metrics. The
Support Vector Classifier relied on the Accuracy score for model evaluation,
while the Stochastic Gradient Descent used the F1 score. The choice of
evaluation metric is essential: the F1 score, utilized by the SGD Classifier,
combines precision and recall and is therefore particularly suited for
imbalanced datasets, which could have contributed to its higher overall score.
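
A small worked example shows why the two metrics can diverge on imbalanced data. The confusion-matrix counts below are hypothetical and are not taken from the paper's experiments; they only illustrate how a classifier can reach high accuracy while its F1 score stays low.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical test set: 1000 tweets, of which only 70 are hate-speech.
# The classifier finds half of them (35) and raises 15 false alarms.
accuracy, f1 = accuracy_and_f1(tp=35, fp=15, fn=35, tn=915)
# accuracy = 0.95, while precision = 0.70, recall = 0.50, and F1 is about 0.583
```

Because the 915 correctly classified majority-class tweets dominate the accuracy figure, the missed hate-speech tweets are visible mainly in the F1 score.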
These differences in preprocessing, class imbalance handling, text vectoriza-
tion, and evaluation metrics collectively explain the superior performance of the
Stochastic Gradient Descent Classifier in this study.

5 Conclusion
In this research paper, we conducted an in-depth analysis of two classifiers, the
Support Vector Classifier (SVC) and the Stochastic Gradient Descent (SGD),
to assess their effectiveness and suitability in the context of hate-speech detec-
tion. In the present era, where the proliferation of hate-speech on social media
platforms poses a pressing concern, the need for robust classifiers is paramount.
Our study revealed that both classifiers demonstrated promising results in
identifying hate-speech; however, the Stochastic Gradient Descent (SGD)
classifier exhibited superior performance, with an accuracy of 96.92 and an F1
score of 96.95, compared with the Support Vector Classifier’s accuracy of 94.84
and F1 score of 56.54.
Several critical factors contributed to this discrepancy in performance. First,
the preprocessing techniques employed by each classifier played a pivotal role.
The SGD classifier utilized an extensive custom text cleaning function,
‘clean_text’, which not only converted text to lowercase but also removed
mentions, URLs, and various non-alphanumeric characters. Moreover, the SGD
classifier addressed class imbalance through resampling. In contrast, the
Support Vector Classifier relied on a pre-made library for preprocessing and
did not account for class imbalance.
The choice of text vectorization further distinguished the classifiers. The
Support Vector Classifier opted for CountVectorizer() with binary representa-
tion, while the SGD classifier made use of TfidfVectorizer(), a decision that
enhanced its capacity to capture term importance more effectively.
Overall, our study showcased the potential of both classifiers in the area of
hate-speech detection. Nonetheless, it is evident that the Stochastic Gradient
Descent (SGD) classifier, with its comprehensive preprocessing, class imbalance
handling, and advanced vectorization technique, emerged as the more powerful
tool for this critical task. Further research and experimentation are needed in
order to refine our understanding of the most effective approach and to continue
addressing the ever-evolving challenge of hate-speech detection in the digital age.

References
[523] Ml: Stochastic gradient descent (sgd). 2023.

[623] Difference between batch gradient descent and stochastic gradient de-
scent. 2023.

[Alv17] A. Alvarez and F. Winter. Normative change and culture of hate: An
experiment in online environments. European Sociological Review, 2017.

[Ban22] S. Bansal. A comprehensive guide to understand and implement text
classification in python. Analytics Vidhya, 2022.

[Bot18] L. Bottou, F. E. Curtis, and J. Nocedal. Optimization methods for
large-scale machine learning. arXiv.org, 2018.

[Dea12] J. Dean, G. S. Corrado, R. Monga, K. Chen, M. Devin, Q. V. Le,
M. Z. Mao, M. A. Ranzato, A. Senior, P. Tucker, K. Yang, and A. Y. Ng. Large
scale distributed deep networks. NeurIPS (Conference on Neural Information
Processing Systems), 2012.

[Hui19] P. Huilgol. Accuracy vs. f1-score. medium.com, 2019.

[Too] A. Toosi. Twitter sentiment analysis. n.d.

[Twi] Twitter. X’s policy on hateful conduct x help. twitter.
rules-and-policies/hateful-conduct-policy.

[Zha18] Z. Zhang. Hate speech detection: A solved problem? the challenging
case of long tail on twitter. arXiv, 2018.

The WHO Is Not The Global Health Government
States Think It Is

Dilay Kuyucak
October 13, 2023

Abstract
With the start of the COVID-19 pandemic, the World Health Organization
was placed at the center of the discussion surrounding the global
response. States criticized the organization for its slow and insufficient re-
sponse and leniency towards the Chinese government. This paper argues
that the World Health Organization was unjustly criticized and delegit-
imized for three reasons: (1) unwillingness of member states to cooperate
and the WHO’s lack of authority to ensure compliance; (2) misunder-
standing – by states as well as the WHO itself – of the WHO’s founding
mission and its current role as an international organization; (3) the
insufficient capacity of states and national healthcare systems to face a pandemic
due to the privatization of health-related industries. It suggests that more
authority be given to the organization to ensure accurate and independent
decision-making.

1 Introduction
The World Health Organization (WHO) was first notified of cases of ‘viral pneu-
monia’ originating in Wuhan, China on 31 December 2019. Following the ex-
change of information with the Chinese government and the investigation of the
disease, the WHO declared the novel coronavirus a Public Health Emergency
of International Concern (PHEIC) on 30 January 2020 [WHO20c], and a pandemic
on 11 March 2020 [WHO20a]. From then onward, all eyes turned to the WHO,
and not long afterwards some countries turned against the organization. The
Director-General of the WHO, Dr. Tedros Adhanom Ghebreyesus, was accused of
being lenient towards the Chinese government, as many believed his election had
been heavily supported by China. Additionally, as the WHO continued to deny
Taiwan member-state status despite the country’s success during the pandemic,
the Taiwanese government claimed that the WHO had ignored its concerns about
human-to-human
transmission in December 2019 [CC20], and had delayed the global pandemic
response to January 2020 in order to appease the Chinese government. These
events resulted in the Trump Administration demanding reform in the
organization’s conduct, and later severing ties with the organization at the
height of the pandemic [Pos20], which was detrimental to the global response
overall.

∗ Advised by: Dr. David Rezvani of Dartmouth College

This paper will argue that the World Health Organization is unjustly crit-
icized and delegitimized for three reasons: (1) unwillingness of member states
to cooperate and the WHO’s lack of authority to ensure compliance; (2) mis-
understanding – by states as well as the WHO itself – of the WHO’s founding
mission and its current role as an international organization; (3) the insufficient
capacity of states and national healthcare systems to face a pandemic due to
the privatization of health-related industries. It will also argue that the efforts
to empower the WHO and create future pandemic plans are futile if states do
not establish strong government organizations and control systems. This paper
will not argue that the WHO’s performance during the COVID-19 pandemic
was satisfactory. It will argue that the circumstances surrounding its failure are
related to its design and operation. The first part of this paper will cover the
need for global health governance, the attempts to discredit the WHO, and its
overall performance during the COVID-19 pandemic. The second will discuss
the limitations of the WHO as an international organization with its role of
meta-governance, the states’ acting in self-interest, and how the design of the
organization impedes its effectiveness during global crises.

2 The World Health Organization During COVID-19
The COVID-19 pandemic illustrated the importance of international coopera-
tion during a global crisis affecting billions of people around the world. In our
globalized environment where interstate travel is relatively easy and common, a
pandemic poses a threat that concerns everyone at the same time. COVID-19
did not stop at the borders; therefore, the eradication of the disease could only
be achieved through international cooperation and effort. In an ideal world, it
would be in every country’s self-interest to help its neighbor in order to
eradicate the disease. The World Health Organization, or a similar
international organization, would coordinate collaboration efforts in order
to effectively combat the virus, while also fulfilling its mission to be a source
of trusted scientific information [BO21]. The scientific community recognized
this need for international cooperation: scientific institutions shared their
findings, and the genome of the virus was made freely accessible [BT21]. However,
political actors did not react in the same way, and the situation soon turned
into an international competition over who could secure more medical supplies
and vaccines.
The distrust surrounding the World Health Organization’s practices and
legitimacy originated with the Taiwanese government and was reinforced by the
Trump administration, which claimed that the WHO was lenient towards China,
had dismissed Taiwan’s concerns of human-to-human transmission on 31 December
2019, and had downplayed the severity of the virus [Don20]. These
claims were further supported by the fact that the WHO had excluded Taiwan
from early emergency meetings in January 2020, and had continued to report
Taiwanese case numbers as part of China’s data. This resulted in the US
demanding reform and later withdrawing its membership and cutting its
funding. This was significant because the US was the organization’s top donor
and was expected to lead the global pandemic response. Many criticized this
decision, and members of the WHO, the media, and scientists came to the orga-
nization’s defense. The German Foreign Minister echoed this sentiment: “The
decision by US President Donald Trump to end cooperation with the World
Health Organization sends the wrong message at the wrong time. (. . . ) We
need a united response in a spirit of solidarity from all countries and the United
Nations, with a strong WHO at the center.” [Hei20] The irreplaceability of the
WHO was widely accepted; however, some agreed that the claims made by the
former US President were significant. In May 2020 the World Health Assembly
demanded a full independent review of the global response, as well as that of
the WHO [CC20].
Those who agreed with the former President’s claims that the WHO had
been lenient towards China pointed out that the organization praised
China’s measures early on, commending the government’s transparency and
mindfulness towards the outbreak, and that it relied exclusively on data
provided by the Chinese government, ignoring cases reported by Taiwan [WHO20c].
Some also argued that the organization did not want to lose the funds provided
by China. This claim is less significant than it seems, as most of the WHO’s
funds come from the US, international organizations, and philanthropic
foundations, while China’s donations play a minor part [DB19]. Similarly, it can
be argued that the WHO’s treatment of China was the result of the International
Health Regulations (IHR) set in 2005, which place the responsibility for
accurate reporting of data on member states. These regulations were put in
place to counter uncooperative behavior from states, as China had refused to
cooperate during the 2002-2003 SARS outbreak. However, these new regulations
gave little to no supranational power to the WHO: it had to rely on data
reported voluntarily by states, and its limited authority was insufficient to
compel its members to cooperate. This was a way to protect state sovereignty
while also providing the WHO with accurate health data. However, since the
data is provided voluntarily, it was perhaps in the WHO’s best interest to keep
relations with China amicable to ensure a continuous flow of information during
the start of the pandemic, when the virus was still a mystery [Mel22]. As a result
of these controversies, the WHO experienced a loss of credibility, with many
turning to private efforts for accurate information, like the Bill and Melinda
Gates Foundation and the Johns Hopkins University’s COVID-19 tracker.
Ill-intentioned or not, the WHO provided guidance and relatively accurate
information during the first stages of the pandemic, despite the
uncooperativeness of its member states. It took the initiative to support the
development and distribution of tests, treatments and vaccines through the
Access to COVID-19 Tools (ACT) Accelerator [WHO20b]. Although contributions to
the WHO’s budget were rising at the start of the pandemic, in February 2021 the
ACT Accelerator had gathered only 20 percent of its estimated need [BB22]. This
means that states did not make the necessary, albeit voluntary, donations, and
the WHO had no other means of raising these funds. It also created COVAX to
facilitate the allocation of vaccines, offering free doses to low- and
middle-income countries. However, 70 percent of vaccine doses were secured by
high- and upper-middle-income countries [Irw21]. Canada had reserved more than
four vaccines per person, while Brazil and India had less than one for every
two people [SW20]. As a result, the WHO’s efforts to ensure equal access to
COVID-19 tools were undermined by its member states’ selfish behavior.
The WHO failed to promote solidarity amongst international actors, and
policy decisions were made in order to save the day through temporary con-
tainment measures like lockdowns, rather than to eradicate the problem with
accurate tracing and isolation. The WHO’s poor performance during the
COVID-19 pandemic could be explained by its underfunding, its lack of
authority over states, or its mishandling of the outbreak. Whatever the case
may be, the recent pandemic has shown that there are fundamental flaws in
the operation of the World Health Organization and of international cooperation.
However, it cannot be denied that a global health organization is the only way
to combat global health emergencies. The problem is that the World Health
Organization is not the first responder rushing to control the disease, as states
would like it to be, but rather the coordinator that promotes best practices to
be followed by states and organizations. States need to realize that in such a
system, domestic responsibility falls upon them.

3 The World Health Organization: Castle Built On Sand
In order to fully understand why the expectation of states is vastly different from
the WHO’s current operation, the WHO’s historical conduct should be taken
into consideration. When the WHO was first established, it played an essential
part in decolonizing countries’ ‘modernization’. What started out as disease
eradication campaigns turned into building of national health systems under the
pressure of the Soviet Union and Third World countries. While the WHO at the
time tried to fight off this pressure to appease developed countries, the ‘health
for all’ agenda gained prominence and health was framed as a human right in
the 1970s. However, with the neoliberal counter-revolution that followed, the WHO’s
budget was frozen in 1982, and the US withheld 80 percent of its financial
commitment due to the opposition from American pharmaceutical companies to
the WHO’s Essential Drugs Campaign in 1985 [MCF19]. A wave of privatization
of health-related industries followed. With its funding reduced significantly, the
WHO could no longer be an active participant in the issues of global health,
and it adopted the new role of meta-governance, which meant it would provide
templates for member states on how to devise their own national policies when
faced with health emergencies. Additionally, the WHO was limited to voluntary
donations, which came from wealthy countries that wanted to ensure their own
health security by having the WHO get involved only in specific cases that
might affect them [Rus11]. The WHO was expected to perform research on the
ground and take initiative during the COVID-19 pandemic, like it had in its early
days; however, states failed to recognize that the organization’s purpose had
evolved into a global coordinator. Most importantly, states failed to recognize
that this change was the result of their own decisions and actions.
Arguably the biggest strike on the WHO’s autonomy was the International
Health Regulations. The 2002-2003 SARS outbreak brought major changes to
the organization due to the Chinese government’s uncooperative actions. The
International Health Regulations’ implementation brought restrictions to the
WHO’s autonomy in fields like research and data collection, while also limiting
its authority over states. The WHO could collect data voluntarily given to it
and warn powerful states of impending dangers, but it could not force
states to follow any guidelines it provided. It could not shame states,
as it had done to China during the SARS outbreak, and demand that they
cooperate. Since the WHO had no way of imposing sanctions, it had to secure
the voluntary collaboration of states in times of emergency like the COVID-19
pandemic. These new restrictions also meant that the WHO could not respond to
crises on the ground, but rather had to sit at a desk and try to nudge
governments in the right direction. Because its coordination function relies on
states to provide information, and because its ability to confront states and
call out uncooperative behavior has been undermined, the WHO performs poorly
when faced with global crises [Ben20].
In addition to its lack of authority, the WHO does not accept a role as a
governing body for international health either. The WHO’s mission is to be the
technical body that provides health guidance and assistance to countries. In
other words, it is not a substitute for national health systems. The WHO
emphasizes scientific decision making, as it is constituted of a ‘transnational
Hippocratic society’, which leaves out decisions regarding law and
international politics [Fid99]. This implies that the WHO sees itself as
transcending world politics, or at least aims to depoliticize its decisions
affecting the member states. The current Director-General of the WHO, Dr. Tedros
Adhanom Ghebreyesus, echoed this by saying “my focus is saving lives, we do
not do politics in the WHO” [WHO20d].
The WHO not only lacks political authority, but also does not want any.
However, this is a crucial mistake when confronting a pandemic, because any
suggestion it makes, such as travel restrictions, face mask use or national
lockdown measures, is inherently political. Additionally, the WHO’s
dependence on member states means that it has to be political in its conduct
if it wants to continue to exist. It can be argued that during crises expert
opinion should transcend politics; however, this does not reflect the
current system these organizations operate in. The WHO cannot isolate itself
from the political decisions of its member states, as evidenced by the Trump
administration’s retreat and its consequences. In the end, the WHO itself
misunderstands its position as an inherently political institution, and its
response to political criticisms seems hollow. Instead of trying to brush off
these criticisms with emphasis on the importance of empirical data, the WHO
should recognize its political dimension and emphasize the limitations caused
by its design.
Powerful states do not want to follow orders. The recent pandemic has shown
a clear hypocrisy in the conduct of powerful Western governments, mainly in
their responses to the COVID-19 pandemic. For example, rather than lead-
ing the world through this crisis the US opted for retreating from the WHO,
blaming its domestic failures on the organization. When the pandemic spread,
governments adopted an ‘every man for himself’ approach, engaging in
competitive politics, limiting exports, and hoarding medical supplies. Likewise,
even in an integrated system like the EU, no European country was willing to
donate medical supplies and resources to a struggling Italy [BO21]. Rational thinking
would suggest international cooperation would be the only solution to a global
problem, but as witnessed during the COVID-19 pandemic, governments do not
want to comply when it is not in their short-term interest. Such an approach
guarantees that long-term solutions are out of reach.
It is also important to remember that the WHO can only do its job of
surveillance and information gathering if governments supply it with the neces-
sary data. This would require countries themselves to have functioning health
monitoring systems, and the capacity to deal with newly emerging problems.
Countries affected severely by the 2002-2003 SARS outbreak were the ones
who were most prepared for the COVID-19 pandemic. Taiwan, for example,
had implemented a nationwide public health network, comprehensive univer-
sal healthcare for all citizens, and improved infection control practices [ea20].
These measures ensured its early response was adequate, and Taiwan was one of
the most successful countries in dealing with the pandemic. By contrast, the
states that had seemed most prepared performed poorly during the pandemic.
They did not have the capabilities necessary to follow the suggestions given
by the WHO, yet the organization was still blamed. Taking the UK as
an example, the privatization of the National Health Service’s logistics led to
massive shortages of key equipment [HLH+ 20]. By mid-February 2020, the UK’s
testing system could reportedly cope with only five COVID-19 cases per week
[DM20]. The extensive use of lockdown measures by the UK was summarized by the
UK Scientific Advisory Group for Emergencies as follows: “From a government
perspective, lockdown
had big advantages: it did not require any forward planning, there was no need
to build capacity in advance, and no direct financial cost. All lockdown took
was a government decree and a modicum of enforcement. It was a lazy solution
. . . as well as a hugely damaging one. Avoiding lockdown would have required
a lot more effort.” [Woo22]
Taking into consideration all of the above, it is understandable that the WHO
is not able to act as a global governing body for health. It has no
power over its member states; on the contrary, it is dependent on them for
information and funding. In addition to lacking authority, it also does not
want to admit its political responsibility as an international organization. This
is a recipe for catastrophe, especially when we consider that member states will
only cooperate when it is in their self-interest, and surprisingly their
self-interest is not always the health of their peoples. It is also worth
noting that in an increasingly globalized world, privatization of industries
such as health will lead to shortages and introduce multinational companies to
the debate. During the production
and distribution of COVID-19 vaccines, one of the main struggles was that
these companies controlled when, where, and how much they would produce
the vaccine, as is their right as the owners of the Intellectual Property [PSH20].
When so many actors are in play, blaming the WHO for its pandemic response
seems hypocritical, especially when states have done nothing but underfund the
organization and undermine its authority.

4 Conclusion
The COVID-19 pandemic showed the world that our governing and health sys-
tems are powerless when faced with global threats, both at the national and
global levels. While many countries seemed prepared for such a disaster the
reality proved otherwise. Amidst all of this, the World Health Organization, an
international organization primarily focused on scientific processes and narrow
health system improvement projects, was placed at the center as the liable
authority. States blamed the organization for their own slow and insufficient responses
while also accusing it of being lenient towards the Chinese government. These
were significant allegations directed at an impartial scientific institution.
However, these allegations were unsubstantiated for the three main reasons
discussed in this paper: the WHO and its member states failed to recognize the
need for political consideration in the WHO’s conduct, and as a result states
claimed that the organization was not as impartial as it claimed to be; states
not only withheld data and financial resources from the organization, but
actively undermined its efforts to provide the necessary response to the
pandemic with their greedy race to secure more vaccines; and the states’ own
healthcare planning and capacities failed when faced with the pandemic. Such
allegations also disregard the WHO’s limitations as an organization bound by
the IHR. It had to rely on data provided voluntarily and needed to keep
relations amicable with countries in order not to lose its funding and
information flow. Amidst all of this, it had to provide a pandemic blueprint
for countries, which many could not follow due to their insufficient healthcare
capacity. Additionally, not every suggestion it made was the correct one, as
the situation was very unclear during
the first year of the pandemic. Thus, the WHO became a scapegoat for all the
misguided policies countries decided to follow. Comments made by the US pres-
ident and others damaged the WHO’s credibility and impeded its work further
as countries reduced funding and did not contribute to pandemic efforts such
as the ACT Accelerator and COVAX. Privatization of health-related industries
also hindered states’ ability to provide adequate testing and healthcare.
For any improvement to come of the COVID-19 pandemic, the WHO should in future
be given more authority and funding so that it can make accurate and
independent decisions. Its dependence on member states is the reason
it has to limit itself to the information provided by them and suggest policies
that are in their interests. If states are willing to sacrifice some sovereignty
in order to have a powerful World Health Organization, future global health
emergencies could be dealt with through strong international cooperation. The
current system is hollow: suggestions by the WHO cannot be implemented
on the ground. For this reason, criticizing the WHO for not being the savior
during the pandemic is hypocritical, because the critics are the reason it is not
able to perform adequately. The WHO is not faultless; it should acknowledge
that it has an international responsibility that is inherently political. The WHO
should be recognized as a tool that states put effort into, in order to reap the
benefits when faced with crises.

References
[BB22] Josephine Borghi and Garrett W. Brown. Taking systems thinking
to the global level: Using the who building blocks to describe and
appraise the global health system in relation to covid-19. Global
Policy, 2022.

[Ben20] Eyal Benvenisti. The who—destined to fail?: political cooperation
and the covid-19 pandemic. American Journal of International Law,
2020.

[BO21] Andres Barkil-Oteo. Addressing covid-19 during times of competitive
politics and failed institutions. Journal of Global Health, 2021.

[BT21] Bernardo T, Sobkowich KE, Forrest RO, Stewart LS, D’Agostino M,
Gutierrez EP, et al. Collaborating in the time of covid-19: The
scope and scale of innovative responses to a global pandemic. JMIR
Public Health Surveill, 2021.

[CC20] Yu-Jie Chen and Jerome A. Cohen. Why does the who exclude
taiwan? Council on Foreign Relations, 2020.

[DB19] Kristina Daugirdas and Gian Luca Burci. Financing the world
health organization: what lessons for multilateralism? Interna-
tional Organizations Law Review 299, 2019.

[DM20] Laura Donnelly and Tom Morgan. Uk abandoned testing because
system “could only cope with five coronavirus cases a week”.
https://www.telegraph.co.uk/news/2020/05/30/revealed-test-trace-abandoned-system-could-cope-five-coronavirus,
30 May 2020.

[Don20] President donald j. trump’s letter to dr. tedros adhanom ghebreyesus.
https://perma.cc/RYW8-XMGC, 2020.

[ea20] Cheryl Lin et al. Policy decisions and use of information technology
to fight coronavirus disease, taiwan. Emerging infectious diseases,
2020.

[Fid99] David P. Fidler. International law and global public health.
University of Kansas Law Review, 1999.

[Hei20] Heiko Maas. We still need functioning multilateralism in the 21st
century. https://www.auswaertiges-amt.de/en/newsroom/news/maas-who/2346304,
2020.

[HLH+ 20] David Hall, John Lister, Cat Hobbs, Pascale Robinson, and Chris
Jarvis. Privatised and unprepared: the nhs supply chain. Univer-
sity of Greenwich/We Own It, https://weownit.org.uk/privatised-
and-unprepared-nhs-supply-chain, 2020.

[Irw21] A Irwin. What it will take to vaccinate the world against covid-19.
Nature, 2021.

[MCF19] Marcos Cueto, Theodore M. Brown, and Elizabeth Fee. The world
health organization: A history. Cambridge University Press, 2019.
[Mel22] Margherita Melillo. The uneasy coexistence of expertise and politics
in the world health organization. INTERNATIONAL ORGANIZA-
TIONS LAW REVIEW, 2022.

[Pos20] The Washington Post. Trump administration sends letter
withdrawing us from world health organization over coronavirus
response. https://www.washingtonpost.com/world/trump-
united-states-withdrawal-world-health-organization-
coronavirus/2020/07/07/ae0a25e4-b550-11ea-9a1d-
d3db1cbe07ce_story.html, 7 July 2020.

[PSH20] Victoria Pilkington, Mirjam Keestra Sarai, and Andrew Hill. Global
covid-19 vaccine inequity: failures in the first year of distribution
and potential solutions for the future. Frontiers in Public Health,
2020.

[Rus11] Simon Rushton. Global health security: security for whom? security
from what? Political Studies, 2011.

[SW20] A.D. So and J Woo. Reserving coronavirus disease 2019 vaccines
for global access: cross sectional analysis. BMJ, 2020.

[WHO20a] WHO. Who director-general’s opening remarks at the
media briefing on covid-19. https://www.who.int/director-
general/speeches/detail/who-director-general-s-opening-remarks-
at-the-media-briefing-on-covid-19---11-march-2020, 11 March
2020.

[WHO20b] WHO. The access to covid-19 tools (act) accelerator.
https://www.who.int/initiatives/act-accelerator, 2020.

[WHO20c] WHO. Who director-general’s statement on ihr emergency committee
on novel coronavirus (2019-ncov). https://www.who.int/director-
general/speeches/detail/who-director-general-s-statement-on-ihr-
emergency-committee-on-novel-coronavirus-(2019-ncov), 30 January
2020.
[WHO20d] WHO. Covid-19 virtual press conference. www.bit.ly/3aNWqoI, 8
April 2020.
[Woo22] Woolhouse. The year the world went mad, 2022.

Targeting the EGFR Pathway in Glioblastoma
Multiforme: A Review of Current Pre-clinical and
Clinical Trials with Tyrosine Kinase Inhibitors

Ananya Bharathapudi
October 16, 2023

Abstract
Glioblastoma multiforme (GBM) is the most common type of primary brain
tumor [SA18], with a median overall survival of 15 months or less [FC17].
Despite extensive research on the pathophysiology and clinical course of
GBM, the malignancy remains one of the most lethal cancers to date, with a
10-year survival rate of 0.71 percent [TT18]. While established treatments
such as resection, radiotherapy, and chemotherapy prolong survival, they
do not prevent recurrence [HL06], which occurs in almost every patient
[OM14]. Given the global increase in incidence and in tumor burden [GN20],
novel approaches are needed to improve the dismal outcomes of GBM. Gene
therapy may serve as one such approach, with initial clinical studies
indicating promising results [PK05]. This review outlines the most recent
treatment protocols for differing GBM subtypes, characterizes the receptor
tyrosine kinase epidermal growth factor receptor (EGFR) and its downstream
signaling pathway, and analyzes ongoing and recently completed clinical
trials involving tyrosine kinase inhibitors in GBM.

1 Introduction
Glioblastoma multiforme (GBM) is the most common primary brain tumor in
adults, accounting for over 45 percent of all malignant primary CNS tumors.
The disease occurs mainly in older adults, with a median age at diagnosis
of 64 years and peak incidence between 75 and 84 years. Incidence is higher
in males than in females and higher in non-Hispanic white individuals than
in other groups. GBM remains incurable, with a median survival time of
15-20 months and a 5-year survival rate of approximately 5 percent, owing
to the heterogeneous and complex nature of the disease. Approximately 80
percent of GBM tumors are primary, developing rapidly de novo without the
precursor lesions, such as lower-grade gliomas, that are common in
secondary tumors.

Of primary GBM tumors, 57 percent carry amplification of the EGFR gene,
which encodes the epidermal growth factor receptor (EGFR). EGFR is a
transmembrane receptor tyrosine kinase with an extracellular region
composed of four domains and an intracellular region composed of a tyrosine
kinase domain and a C-terminal tail. Upon binding of the epidermal growth
factor ligand, EGFR dimerizes and autophosphorylates several tyrosine
residues on its C-terminal tail, which then serves as a docking site for
secondary messengers that induce cellular proliferation and resist
apoptosis, including protein kinase B (Akt) and mammalian target of
rapamycin (mTOR). Prominent downstream pathways of EGFR include the
RAS-RAF-MEK-ERK (MAPK) and PI3K-AKT-mTOR pathways.

Approximately 26 percent of primary GBM tumors contain EGFR activating
mutations. The most common EGFR variant, EGFRvIII, occurs in approximately
50 percent of all EGFR-amplified GBM cases and involves the deletion of
exons 2-7, which removes amino acids 6-273. This deletion produces a
receptor with a modified extracellular domain that is constitutively
active. Studies have found that EGFR amplification is often accompanied by
increased abundance and phosphorylation of the pleckstrin homology-like
domain family A member proteins PHLDA1 and PHLDA3, the transcription factor
SOX9, the cell adhesion protein CTNND2 (δ-catenin), and the cell cycle
proteins CDK6 and CDKN2C. Clinically, patients with either increased EGFR
expression or EGFR mutation are likely to have increased tumor invasion and
lower overall survival, at 6 months, compared with the median overall
survival of 15 months for GBM patients [BZ18].

Other common alterations in GBM affect genes that promote malignancy and
help guide prognosis. Mutations in isocitrate dehydrogenase 1 and 2 (IDH1
and IDH2) are oncogenic, promoting methylation in cancers as well as
production of oncometabolites such as 2-hydroxyglutarate (2-HG) [CA13,
TZ14]. While the mutations themselves promote undifferentiated cell
proliferation, they are also associated with better prognosis because they
can be addressed with targeted therapies [CA13]. O6-methylguanine-DNA
methyltransferase (MGMT) promoter methylation status is a further
prognostic marker: methylation of this enzyme’s promoter makes tumor cells
susceptible to DNA damage caused by alkylating agents such as temozolomide
(TMZ) [HM05].

In this review, we cover the most recent pre-clinical and clinical studies
concerning modulation of EGFR using tyrosine kinase inhibitors and discuss
potential synergistic strategies that may decrease the high tumor burden of
GBM.

∗ Advised by: Paras Minhas, Stanford University

2 Initial Diagnosis
Patients with suspected GBM typically present with progressive neurological
symptoms such as headaches, seizures, and memory loss [BF15]. In these
patients, contrast-enhanced MRI scans are conducted to examine areas of
microvascular proliferation and focal necrosis that may represent the
histological characteristics of the disease [TA20]. Screening for systemic
malignancies is often not necessary when radiographic suspicion for
high-grade glioma is high. Full diagnosis is achieved only on tissue, which
is collected at maximal tumor resection or, in patients whose tumors are
not amenable to resection, in a biopsy procedure [HM19]. In addition to
scans and tissue pathology, the detection of certain genetic alterations,
such as those affecting EGFR, through fluorescence in situ hybridization
(FISH) may also aid in the diagnosis of the disease [MC14].

3 Current Treatment Protocols


Treatment of GBM typically combines surgical resection with adjuvant
therapy and can diverge into several approaches based on factors that
include age. Clinicians typically begin with maximal resection, unless the
procedure is contraindicated by tumor location or patient status [KD11,
RC14]. After resection, adjuvant therapy is chosen according to patient
age, Karnofsky Performance Status (KPS) score, and methylation status of
O6-methylguanine-DNA methyltransferase (MGMT).

Patients ≤70 years of age with a KPS score ≥60 and methylated MGMT receive
radiotherapy (60 Gray in 30 fractions) along with daily temozolomide (TMZ)
(75 mg/m²/day for 6 weeks), followed by 6 maintenance cycles of TMZ
(150–200 mg/m²/day for the first 5 days of a 28-day cycle) [FC17, TA20].
Patients ≤70 years of age with a KPS score <60 and methylated MGMT receive
hypofractionated radiotherapy (HFRT) as the preferred line of treatment to
reduce toxicity. HFRT can also be combined with adjuvant TMZ to increase
the efficacy of overall treatment, but clinicians may instead use TMZ
alone or simply provide best supportive care [TA20].

Recent clinical trials indicate that maintenance TMZ may be accompanied by
tumor-treating fields (TTFields), a non-invasive treatment that delivers
low-intensity (1–3 V/cm), intermediate-frequency (100–300 kHz) alternating
electric fields [DA13] that target polymerization and depolymerization of
microtubules in the mitotic spindle [FD19]. This combination has been shown
to increase overall survival and progression-free survival [?] across
multiple clinical trials; in one, patients who received TTFields plus TMZ
after completing initial radiotherapy had a median progression-free
survival of 6.7 months, compared with 4.0 months in the TMZ-alone group
[GG19].

For patients >70 years of age with a KPS score ≥60 and methylated MGMT,
HFRT is given along with TMZ (the radiation dosage depends on the number
of fractions, and TMZ is given over the course of radiation), followed by
maintenance TMZ. A second option is standard radiotherapy combined with
TMZ, followed by maintenance TMZ and TTFields [TA20]. If the patient has
poor functional status and a KPS score <60, then HFRT alone or TMZ alone is
given. Patients whose tumors have unmethylated MGMT are generally resistant
to adjuvant TMZ [AI20]. In such cases, standard radiotherapy is
administered, provided the patient has a KPS score ≥60 [TA20].

At tumor recurrence, the preferred line of therapy is surgery, as research
has shown that reoperation improves overall survival, though there is no
standard line of adjuvant treatment for recurrent tumors [TA20].
Re-irradiation, with a median total dose of 30–36 Gy, may be an alternative
treatment option [WM13]; however, it is not as highly recommended as
surgery or systemic therapy because of the potential for increased toxicity
[TA20, O.15]. Systemic therapy involves chemotherapeutic and
immunotherapeutic agents such as TMZ and bevacizumab, as well as alkylating
agents such as carmustine or other blood-brain barrier (BBB)-penetrant
nitrosoureas. Unfortunately, results of studies testing the effectiveness
of such drugs at recurrence have been discouraging [O.15]. The attending
physician typically chooses the treatment method based on several factors,
including the patient’s KPS score, tumor burden, MGMT methylation status,
epidermal growth factor receptor (EGFR) status, and IDH status. TTFields
may also be used, though studies have shown that the majority of patients
still do not survive for over two years, which is why supportive care may
be the best option, as it emphasizes improving quality of life and managing
distressing symptoms [FC17, TA20].
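
The adjuvant decision logic described in this section can be restated
compactly in code. The Python sketch below is only an illustration of how
the branches in the text fit together: the `Patient` class and
`adjuvant_options` function are hypothetical names, the cutoffs (age 70,
KPS 60) are taken from the text, and the directions of the inequalities
(age ≤70 vs. >70, KPS ≥60 vs. <60) are assumptions where the source does
not state them explicitly. Combinations the section does not cover return
an empty list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Patient:
    age: int               # years
    kps: int               # Karnofsky Performance Status score (0-100)
    mgmt_methylated: bool  # MGMT promoter methylation status


def adjuvant_options(p: Patient) -> List[str]:
    """List the post-resection options this section names for a patient.

    Illustrative only: the cutoffs come from the text, but the inequality
    directions are assumptions where the source is ambiguous.
    """
    good_kps = p.kps >= 60  # assumed direction of the KPS cutoff
    if not p.mgmt_methylated:
        # The text only covers unmethylated MGMT with adequate KPS.
        return ["standard radiotherapy"] if good_kps else []
    if p.age <= 70:  # assumed direction of the age cutoff
        if good_kps:
            return ["radiotherapy (60 Gy / 30 fractions) + TMZ, "
                    "then maintenance TMZ"]
        return ["HFRT", "HFRT + TMZ", "TMZ alone", "best supportive care"]
    if good_kps:
        return [
            "HFRT + TMZ, then maintenance TMZ",
            "standard radiotherapy + TMZ, then maintenance TMZ + TTFields",
        ]
    return ["HFRT alone", "TMZ alone"]


if __name__ == "__main__":
    for patient in (Patient(65, 80, True), Patient(75, 50, True)):
        print(patient, "->", adjuvant_options(patient))
```

Returning a list rather than a single string reflects that the text often
names several acceptable options for the same patient group.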

4 Recent Tyrosine Kinase Inhibitors


4.1 CM93
CM93, a novel covalent-binding TKI, has shown efficacy comparable to that
of osimertinib (IC50 3.66 nM vs. 12.03 nM, respectively) in several cancer
cell lines harboring EGFR mutations [ea20]. Furthermore, CM93 displays a
20-fold greater brain-to-plasma ratio at estimated steady state. In
addition, CM93 reduced 293-EGFRvIII cell viability with an IC50 of 1.48 µM,
which was lower than that of erlotinib (IC50 4.83 µM), gefitinib (IC50
15.67 µM), and osimertinib (IC50 2.19 µM). In vitro, CM93 reduced EGFRvIII
phosphorylation at two tyrosine phosphorylation sites in HEK293-EGFRvIII
cells. Further titration revealed that CM93 had an IC50 value of 0.19 µM
for EGFRvIII phosphorylation [Ni21].

In mice, CM93 had efficacy comparable to that of osimertinib; both
significantly inhibited tumor growth at a 25 mg/kg dose. At 10 mg/kg and
30 mg/kg, both CM93 and osimertinib significantly reduced tumor cell count,
with no statistically significant difference between the two drugs. In an
NSCLC brain metastasis model, mice were given CM93 at 25 mg/kg or 50 mg/kg
or osimertinib at 25 mg/kg. The median survival time of mice taking
25 mg/kg of CM93 was 80 days, and mice taking 50 mg/kg had a median
survival time of 100 days. Mice taking osimertinib at 25 mg/kg reached an
endpoint after four weeks because of body weight loss and skin lesions.
Their brain-to-plasma drug concentration ratio was 6:1 in males and 7:1 in
females, whereas mice taking CM93 had a brain-to-plasma drug concentration
ratio of 14:1 in males and 15:1 in females, suggesting that CM93 can
penetrate the blood-brain barrier and, therefore, show efficacy in the
brain [ea20]. Another preclinical study reports similar results comparing
CM93 with gefitinib, another EGFR TKI. In a pilot comparative assessment
performed seven hours after a single dose of 30 mg/kg of CM93 or 50 mg/kg
of gefitinib, CM93 had a Kp value of 28.3, whereas gefitinib had a Kp value
of 0.55 [Ni21].

Unlike osimertinib, CM93 had little adverse effect on mouse skin. With
osimertinib, mice lost more than 20 percent of their body weight, reaching
their endpoint, and had severe hair loss after three weeks. Mice treated
with CM93, however, showed no hair loss; this demonstrates CM93’s potential
to improve patient quality of life [ea20]. Another preclinical study
further examined CM93’s efficacy in vivo using genetically engineered mice
with GBM. Mice taking CM93 had a median survival of 33 days, while the
control group had a median survival of 25.5 days. In this model too, no
significant hair loss or loss of body weight was observed in the CM93 group
[Ni21].
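
To make the size of the differences quoted above easier to judge, the
following Python sketch expresses the 293-EGFRvIII viability IC50 values
as fold differences relative to CM93 and the median survival in the GBM
mouse model as a relative gain over control. The helper names
`fold_over_reference` and `relative_gain` are hypothetical, and the only
assumption about the data is that all four IC50 values share the same
unit, so the ratios are unitless.

```python
# IC50 values for 293-EGFRvIII cell viability as quoted in the text [Ni21].
# Assumption: all four values are in the same unit, so ratios are unitless.
IC50_VIABILITY = {
    "CM93": 1.48,
    "erlotinib": 4.83,
    "gefitinib": 15.67,
    "osimertinib": 2.19,
}


def fold_over_reference(values, reference):
    """Return each drug's value divided by the reference drug's value."""
    ref = values[reference]
    return {name: v / ref for name, v in values.items() if name != reference}


def relative_gain(treated, control):
    """Fractional increase of treated over control (0.25 means +25 percent)."""
    return (treated - control) / control


if __name__ == "__main__":
    for drug, fold in fold_over_reference(IC50_VIABILITY, "CM93").items():
        print(f"{drug}: IC50 is {fold:.1f}x that of CM93")
    # Median survival in the GBM mouse model: 33 days (CM93) vs 25.5 (control).
    print(f"median survival gain: {relative_gain(33, 25.5):.1%}")
```

A fold value above 1 means the comparator needs a higher concentration
than CM93 to reach the same effect in this assay.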

4.1.1 ERAS-801
The next trial, currently in phase I with a nonrandomized, sequential,
open-label design, includes patients with a diagnosis of IDH-wildtype GBM.
Patients with prior EGFR inhibitor treatment for GBM are excluded. The
trial’s intervention is ERAS-801, a new EGFR tyrosine kinase inhibitor that
targets the RAS/MAPK pathway and inhibits EGFR [ERAb]. Targeting wildtype
EGFR (wtEGFR) and mutant variants of EGFR with small molecules and
antibodies has been shown to improve patient outcomes in non-small cell
lung cancer (NSCLC), colorectal cancer (CRC), and head and neck squamous
cell carcinoma (HNSCC); however, in CNS tumors the ability to target wtEGFR
and mutant EGFR remains an unmet need. The two main reasons why current
EGFR inhibitors lack efficacy are their inability to penetrate the
blood-brain barrier and their weak inhibition of the EGFRvIII mutant
protein. ERAS-801 differs in that it is designed to be selective,
reversible, and orally available, and it has a 3.7:1 brain-to-plasma ratio
in mice, demonstrating CNS penetrability. ERAS-801 is also able to target
EGFR alterations such as EGFRvIII.

When a single oral dose of 10 mg/kg of ERAS-801 was administered to mice,
ERAS-801’s Kp value was 3.7, which was higher than that of the other EGFR
TKIs osimertinib (0.99), afatinib (0.25), erlotinib (0.06), gefitinib
(0.36), and dacomitinib (0.61). Taken together, the evidence suggests that
ERAS-801 outperforms other inhibitors in terms of CNS penetration. In
preclinical studies, ERAS-801 showed potency against EGFR, with an IC50 of
0.3 nM, and high selectivity for EGFR in a biochemical screen of 484
kinases, in which ERAS-801 at 10 µM inhibited only two non-EGFR-family
kinases by more than 90 percent. In cell-based assays in vitro, ERAS-801
had an IC50 of 1.1 nM against wildtype EGFR, an IC50 of 0.7 nM against
EGFRvIII, and an IC50 value of less than 3 µM in a panel of 31
patient-derived glioma cell lines, in which 65 percent of glioma cell
growth was inhibited by ERAS-801. The patient-derived glioma cell panel
included the most common types of EGFR alterations: amplification,
EGFRvIII, extracellular domain mutations, and chromosome 7 polysomy.
ERAS-801 also showed no activity against astrocytes, the most common cell
in the human brain. This suggests that ERAS-801 selectively inhibits EGFR
without disturbing normal brain cells that are not dependent on EGFR
signaling.

In vivo, ERAS-801’s high CNS penetration resulted in a survival benefit. In
an EGFRvIII-mutant patient-derived GBM model, the median survival time was
40 days for the control group, between 60 and 70 days for the 10 mg/kg
ERAS-801 group, around 80 days for the 25 mg/kg group, and around 80 days
for the 75 mg/kg group. In four additional patient-derived glioma models
that harbor EGFRvIII, EGFR amplification, or chromosome 7 polysomy,
ERAS-801 showed tumor growth inhibition (TGI) in 93 percent of 14
patient-derived models. Taken together, the evidence suggests ERAS-801’s
efficacy in combatting GBM [ERAa].
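
The Kp comparison above can be read as a ranking of brain-to-plasma
partition coefficients, where a value above 1 means the drug reaches a
higher concentration in brain than in plasma. The Python sketch below
ranks the quoted values and lists which drugs clear that threshold; the
function names `rank_by_kp` and `brain_favoring` are hypothetical, and the
cutoff of 1 is simply the definition of a ratio favoring brain, not a
threshold stated in the source.

```python
# Brain-to-plasma partition coefficients (Kp) as quoted in the text.
# The 10 mg/kg oral dose is stated for ERAS-801; dosing for the comparator
# drugs is not given in the text.
KP = {
    "ERAS-801": 3.7,
    "osimertinib": 0.99,
    "afatinib": 0.25,
    "erlotinib": 0.06,
    "gefitinib": 0.36,
    "dacomitinib": 0.61,
}


def rank_by_kp(kp):
    """Sort (drug, Kp) pairs from highest to lowest Kp."""
    return sorted(kp.items(), key=lambda item: item[1], reverse=True)


def brain_favoring(kp, cutoff=1.0):
    """Drugs whose brain concentration exceeds plasma concentration."""
    return [name for name, value in kp.items() if value > cutoff]


if __name__ == "__main__":
    for name, value in rank_by_kp(KP):
        print(f"{name:12s} Kp = {value}")
    print("Kp > 1:", brain_favoring(KP))
```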

4.1.2 AZD9291 (Osimertinib)


The next clinical trial follows a single-group assignment format and is in
phase II. The trial tests osimertinib, also known as AZD9291, in patients
who have supratentorial, contrast-enhancing, progressive or recurrent
tumors with an EGFR mutation or amplification; it excludes those with p53
mutations or prior exposure to EGFR-targeted treatments. Osimertinib is a
small-molecule TKI and antineoplastic agent used in the therapy of selected
forms of NSCLC. Its common side effects include diarrhea, rash, dry skin,
and nail toxicity. Its severe but uncommon side effects include
interstitial lung disease, prolongation of the QTc interval, and
cardiomyopathy [Osia].

Evidence from three preclinical studies shows the drug’s promise and its
limitations. In athymic mice, osimertinib showed CNS penetration, with a
concentration of 3,695 ± 425 nM in the brain compared with 314 nM in
plasma, giving osimertinib a brain-to-plasma ratio greater than 10. In
vitro, when tested against D317 cells, which express high levels of
EGFRvIII, osimertinib inhibited EGFR phosphorylation with an IC50 of 50 nM.
Although there was no effect on the total level of EGFR, osimertinib
blocked EGFRvIII’s intracellular signaling. Quantification of osimertinib’s
inhibition of D317 cell growth using a WST-1 cell proliferation assay gave
an IC50 of 476 ± 163 nM, indicating that osimertinib can inhibit the growth
of EGFRvIII+ cells at concentrations attainable in the brain [Cha20].
Another preclinical study compared osimertinib with six other EGFR
inhibitors in 22 patient-derived GBM cell samples. Osimertinib showed
efficacy in 10 of the 22 samples tested, with 50 percent growth inhibition
at a concentration of 3 µM [Pet16]. A different preclinical study
demonstrated that osimertinib inhibits wild-type EGFR more weakly than
T790M-mutant EGFR, with IC50 values of 184 nM and 1 nM, respectively
[LX19]. In GBM cell lines, osimertinib inhibited the growth of six cell
lines in a dose-dependent manner, with IC50 values ranging from 1.25 to
3 µM; first-generation EGFR inhibitors had IC50 values of 10 µM in the same
setting, suggesting that osimertinib has greater efficacy than
first-generation EGFR TKIs [Pet16].

To examine whether osimertinib inhibits GBM cell growth through an
off-target effect, two U87 cell lines stably expressing wild-type or
Cys797-mutant EGFR were constructed; these revealed that the Cys797 residue
in the catalytic domain of EGFR is key to the inhibitory effect of
osimertinib. While treatment significantly inhibited the growth of cells
expressing wild-type EGFR, the effect on growth was nearly abolished in
cells expressing Cys797-mutant EGFR. An EdU assay evaluating osimertinib’s
inhibitory effect on GBM proliferation showed that proliferation in the U87
and U251 lines was reduced to 25.59 percent and 37.37 percent,
respectively, suggesting that osimertinib strongly inhibits GBM cell
proliferation in a dose-dependent manner. Furthermore, a colony formation
assay revealed that the number of colonies was reduced 767.82 percent by
osimertinib, and a methylcellulose colony assay confirmed these results,
suggesting that osimertinib significantly inhibits GBM cell colony
formation. Flow cytometry also revealed the mechanism by which osimertinib
inhibits GBM cell proliferation: cell cycle progression was arrested in G1
phase in both cell types tested in the assay (U87 and U251).

Western blot analysis was used to test inhibition of EGFR/ERK pathway
activation by measuring osimertinib’s effect on EGFR, AKT, STAT3, and ERK
phosphorylation in GBM cells. In U87 and U251 GBM cells treated with
different concentrations of osimertinib, total EGFR expression did not
change significantly; however, phosphorylated EGFR gradually decreased with
increasing osimertinib concentration, which also lowered the level of ERK
and had no effect on AKT and STAT3 levels. Erlotinib, a well-known TKI,
inhibited ERK phosphorylation for 24-48 hours, after which ERK reactivation
was observed. Osimertinib, on the other hand, can continuously suppress
EGFR and ERK phosphorylation and may therefore inhibit the growth of GBM
cells continuously by blocking the EGFR/ERK pathway. In addition, when
osimertinib is combined with the ERK inhibitor PD098059, its
anti-proliferative and anti-invasive activities are enhanced. Results from
EdU assays show that both osimertinib and PD098059 inhibited the
proliferation of GBM cells; however, the combination was more effective
than either monotherapy. PD098059 also enhanced the inhibitory effect of
osimertinib on GBM cell invasion. Combined with another ERK inhibitor,
SCH772984, however, osimertinib showed effects on proliferation of GBM
cells but not on cell invasion. These data suggest that ERK inhibition
could increase the sensitivity of GBM cells to osimertinib [LX19].

In vivo, in orthotopic and heterotopic mouse models, tumor growth in the
osimertinib-treated group was slower, with a T/C of 0.0241; any value below
0.4 is considered significant inhibition. Osimertinib was effective in
slowing the growth of intracranial tumors, and median survival increased
from 26 days in untreated mice to 42 days in treated mice [Cha20]. Another
preclinical study used in situ GBM nude mouse models treated with an
intraperitoneal injection and an oral administration of osimertinib and
observed that immunofluorescence staining of GBM sections was significantly
higher in the osimertinib treatment group than in the control group,
suggesting that osimertinib inhibited proliferation and promoted GBM cell
apoptosis in vivo [LX19].

A completed clinical trial in patients with IDH1 or IDH2 wildtype GBM
involved patients taking 80 mg of osimertinib orally once a day until
unacceptable side effects, death, or medical complications occurred. Four
of the six patients were assessed for response. Of these four, one showed a
partial response, two had stable disease, and the last was refractory to
treatment. The transient improvement in imaging was not without side
effects: two patients had thrombocytopenia, one developed grade 1 diarrhea
and pneumonia, and another developed grade 1 mucositis [Abo10].

Because osimertinib penetrates the blood-brain barrier effectively, has in
vitro and in vivo data to support its efficacy, and inhibits multiple
intracellular pathways, it may be a better treatment option for GBM
patients than previously tested EGFR TKIs. Osimertinib is also irreversible
and can lead to prolonged survival and continuous ERK inhibition. Results
show that the combination of an EGFR inhibitor and an AKT/STAT3 pathway
inhibitor may be more effective than a monotherapy [LX19]. The clinical
trial also shows that osimertinib may benefit select patients with
recurrent malignant glioma (MG) and EGFR alterations, underscoring the
importance of characterizing EGFR alterations before considering
osimertinib treatment for a given patient [Abo10].
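
Several of the osimertinib figures in this section are simple ratios that
can be checked directly from the numbers quoted: the brain-to-plasma ratio
in athymic mice, the T/C value against the 0.4 threshold, and the change
in median survival. The Python sketch below works through them; the
function names are hypothetical, and because the text does not report a
spread for the plasma concentration, only the brain value is varied when
probing the lower end of the ratio.

```python
def brain_to_plasma_ratio(brain_nm, plasma_nm):
    """Ratio of brain to plasma drug concentration (same units for both)."""
    return brain_nm / plasma_nm


def is_significant_tc(tc, threshold=0.4):
    """Per the text, a T/C value below 0.4 counts as significant inhibition."""
    return tc < threshold


def relative_gain(treated, control):
    """Fractional increase of treated over control."""
    return (treated - control) / control


if __name__ == "__main__":
    # Brain 3,695 +/- 425 nM vs plasma 314 nM [Cha20].
    print(f"ratio at mean brain level: {brain_to_plasma_ratio(3695, 314):.1f}")
    print(f"ratio at mean minus spread: "
          f"{brain_to_plasma_ratio(3695 - 425, 314):.1f}")
    print("T/C 0.0241 significant:", is_significant_tc(0.0241))
    # Median survival 26 days (untreated) vs 42 days (treated) [Cha20].
    print(f"median survival gain: {relative_gain(42, 26):.1%}")
```

Under these inputs the ratio stays above 10 even at the lower brain
concentration, which is consistent with the "greater than 10" statement in
the text.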

4.2 BDTX-1535
The next ongoing clinical trial investigates the potential of BDTX-1535
monotherapy. Currently in phase I, the trial includes patients diagnosed
with IDH-wildtype GBM or astrocytoma with molecular features of GBM; in
both cases the cancer must be recurrent. Its exclusion criteria include
known resistance mutations in tumor tissue or ctDNA, prior treatment with
EGFR inhibitors, and brain metastases or spinal cord compression requiring
intervention. BDTX-1535, the intervention, is a selective, highly potent,
irreversible inhibitor of EGFR alterations seen in GBM, including
amplification, mutations, and splice variants. A report summarizes the drug
and the key preclinical studies that characterize it. If BDTX-1535 could
overcome osimertinib resistance, it could address a pressing and growing
need in EGFR-mutant non-small cell lung cancer. BDTX-1535 is optimized for
activity against a broad spectrum of EGFR mutations and for a Goldilocks
wild-type selectivity profile. In mice harboring NSCLC with the C797S
mutation, BDTX-1535 induced dose-dependent tumor shrinkage without loss of
body weight. The mice treated with osimertinib, however, were
indistinguishable from the untreated control group. BDTX-1535 could also
penetrate the blood-brain barrier, which would allow it to address brain
metastases and CNS tumors [BDT21].

4.3 HMPL-813 (Epitinib Succinate)


Epitinib has the potential to cross the blood-brain barrier and act against
brain metastases. A phase I clinical trial of epitinib enrolled 72
patients, all of whom had EGFR-mutant advanced non-small-cell lung cancer
with brain metastases. Patients were given 120 mg or 160 mg orally, with
safety and tolerability as the primary outcomes. Treatment-related
toxicities occurred in 13 (43.3 percent) of the patients in the 120 mg
group and 21 (50 percent) of the patients in the 160 mg group. The drug had
an objective response rate of 53.6 percent in the 120 mg group and 40.5
percent in the 160 mg group. The median duration of response was 7.4 and
9.1 months in the 120 mg and 160 mg groups, respectively, while the median
progression-free survival was 7.4 months for both groups. Taken together,
the data suggest that epitinib at 160 mg showed promising efficacy and was
well tolerated; this dose was also taken as the recommended phase II dose
[ea22]. Another clinical trial testing the safety of epitinib in patients
with EGFRm+ NSCLC recruited 36 patients in a dose-escalation phase with 7
dose levels, starting at 20 mg and rising to 240 mg. Dose escalation
followed a 3+3 design. The most common adverse effects were rash (60
percent), diarrhea (34.2 percent), elevated AST (34.3 percent), and
hyperbilirubinemia (28.6 percent). Drug exposure increased proportionally
with dose until it plateaued at 160 mg and above. Of 12 patients treated
with 160 mg of epitinib, 5 reached partial response (PR) and showed tumor
shrinkage. Two progression events, in the liver and brain, were observed.
Taken together, this evidence supported further development of the drug
[ZQ16].
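
The toxicity counts and percentages reported for the 72-patient phase I
trial [ea22] are enough to recover the size of each dose group, which the
text does not state directly. The short Python sketch below does this
arithmetic; `implied_group_size` is a hypothetical helper, and rounding to
the nearest whole patient is an assumption needed because the percentages
are quoted to one decimal place.

```python
def implied_group_size(events, percent):
    """Group size implied by an event count and its reported percentage.

    Rounds to the nearest whole patient because the reported percentages
    are themselves rounded.
    """
    return round(events * 100 / percent)


if __name__ == "__main__":
    n_120 = implied_group_size(13, 43.3)  # 13 patients = 43.3 percent
    n_160 = implied_group_size(21, 50.0)  # 21 patients = 50 percent
    print("120 mg group:", n_120)
    print("160 mg group:", n_160)
    print("total:", n_120 + n_160)  # compare with the 72 enrolled patients
```

The two implied group sizes add up to the stated enrollment, so the
reported toxicity figures are internally consistent.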

4.4 Anlotinib
Currently in phase II, anlotinib is a multitarget TKI that blocks the migration
and proliferation of endothelial cells, reduces the tumor microvascular density
by targeting VEGFRs, FGFRs, and PDGFRs [Anl]. A preclinical trial at-
tempting to test if Osimertinib overcomes acquired resistance to EGFR TKI’s
in patients with EGFR mutant non-small cell lung cancer was conducted. The
researchers evaluated the antitumor effects of gefitinib + anlotinib in gefitinib
resistant lung adenocarcinoma models in vitro and in vivo and investigated the
treatment of an EGFR TKI + Anlotinib in 24 patients with advanced EGFR
mutant NSCLC after EGFR TKI acquired resistance. The results show that
Anlotinib reversed gefitinib resistance adenocarcinoma models by enhancing
antiproliferative and proapoptotic effects of gefitinib. Similarly, EGFR-TKI+
Anlotinib therapy showed an objective response rate of 20.8 percent and a dis-
ease control rate of 95.8 percent. While median progression free survival was
11.53 plus of minus 2.41 months, overall median survival could not be reach. In
the clinical trial, one adverse event in grade 3 was noted, but there were not
grade 4 or 5 adverse events. The researchers conclude by stating that EGFR
TKI + Anlotinib demonstrates powerful antitumor activity in vitro and in vivo.
Using anlotinib can overcome resistance to EGFR-TKI in advanced EGFR mu-
tant NSCLC patients [Zha21]. Another preclinical trial examined the effects
of anlotinib with temozolomide and the molecular mechanisms of anlotinib in
Glioblastoma. Through a Cell Counting Kit-8 and colony forming assays, the
researchers examined cell viability. Cells treated with anlotinib in 0, 1.25, 2.5,
5, 10, and 20 micro moles were tested to reveal that anlotinib could induce cell

292
death when concentrated and in a dose dependent manner in all GBM cell lines
tested. To see long term effects, the researchers used colony formation assay
and found that the size of independent colonies in anlotinib treated group were
much smaller and were significantly reduced, indicating that anlotinib inhibited
the proliferation of GBM cells in a dose dependent manner. Then the migratory
ability of GBM cells was tested through wound healing. The migratory ability
of GBM cells compared to untreated control cells was decreased by anlotinib.
Following that, Transwell migration and Matrigel invasion assays revealed that
GBM cell migration and invasion capacities were reduced when treated with
anlotinib, so anlotinib suppressed the migration and invasion of glioblastoma
cells in a concentration-dependent manner. Then flow cytometry was used to
analyze anlotinib treatment’s effect on the cell cycle profile. After pretreatment
with 0, 2, and 4 micromoles of anlotinib for 24 hours the percentage of cells in
the G2/M phase increased in a dose dependent manner suggesting that anlotinib
could induce a G2/M phase arrest [XP22]. Since previous studies have indicated
that arresting the cell cycle initiates an apoptotic program, the researchers
examined anlotinib’s effect on apoptosis and found that the percentage of
apoptotic cells was elevated in three human GBM cell lines; compared to the
control group, anlotinib was able to induce apoptosis. Western blotting also
showed that anlotinib induced autophagy-related proteins, suggesting that
anlotinib started autophagic programs in GBM. The JAK2/STAT3 signaling pathway
plays a key role in angiogenesis; VEGFA, which anlotinib is also known to
target, is a downstream target gene of JAK2/STAT3 that promotes angiogenesis.
A tube formation assay was performed to evaluate anlotinib’s effect on the
sprouting of new capillaries. Tube formation by human umbilical endothelial
cells was inhibited by U87/anlotinib supernatant, an effect that was enhanced
by S3I-201. Because VEGFA plays a crucial role in tumor angiogenesis and
anlotinib was able to decrease the VEGFA levels secreted by U87 cells, the
researchers decided to further explore the molecular mechanisms underlying
anlotinib treatment of GBM cells. Western blot analysis of several key
signaling pathway proteins showed that the expression of cell motility-related
and proliferation-related proteins decreased after treatment with 2 µM of
anlotinib, an effect that was enhanced by 100 µM of S3I-201. These findings
showed that anlotinib’s anti-angiogenic and anti-glioblastoma effects in GBM
could be mediated by its influence on the JAK2/STAT3/VEGFA signaling pathway.
When anlotinib was combined with temozolomide, a wound-healing assay showed
that the combination inhibited cell migration more than either drug used
alone. Flow cytometry was used to test whether the enhanced cytotoxicity was
due to cellular apoptosis, but the drugs alone increased apoptosis with
greater efficacy than the combination of drugs [XP22]. Changes to components
of the JAK2/STAT3/VEGFA signaling pathway were then assessed, revealing that
the combination was more effective than either drug alone at suppressing
JAK2/STAT3/VEGFA signaling. The researchers proceeded to in vivo experiments
in nude mice; bioluminescence imaging performed every seven days suggested
that anlotinib delayed tumor growth compared to the control group. Staining
also revealed that anlotinib reduced the positivity of the proliferation
index. Western blotting further revealed that anlotinib reduced p-JAK2,
p-STAT3, and VEGFA in vivo, indicating that anlotinib was able to inhibit
proliferation in vivo. The researchers conclude that because anlotinib can
suppress the proliferation, migration, invasion, and angiogenesis of GBM cells
in a dose-dependent manner, anlotinib offers promise. Furthermore, its
cooperative effect with temozolomide in enhancing cytotoxicity and
anti-angiogenesis offers stronger evidence of its promise.
While the previous study characterized anlotinib as a VEGF inhibitor, the next
study examines anlotinib combined with cranial radiotherapy (CRT) in NSCLC
patients with brain metastasis. The researchers compared 28 patients who
received CRT + anlotinib with 45 patients who received CRT alone and found no
significant differences in clinical features between the two groups. After
these clinical characteristics were evaluated to establish a baseline,
prognostic factors for intracranial progression-free survival and overall
survival underwent univariate and multivariate analysis. The combined group
had a greater median intracranial progression-free survival than the CRT
group (11 months versus 3 months); however, there were no significant
differences in overall survival, extracranial progression-free survival, or
systemic progression-free survival. Univariate and multivariate analysis
further revealed that the addition of anlotinib to treatment was an
independent favorable predictor, while an age greater than 57 years and a KPS
score less than or equal to 80 were independent unfavorable predictors of
overall survival [He21]. Although the difference was not statistically
significant, patients who received anlotinib with local CRT had the longest
intracranial progression-free survival (27 months) and overall survival (36
months), while the local CRT group had shorter median intracranial
progression-free survival and median overall survival of 11 months and 18
months respectively. The study concludes that anlotinib can improve
intracranial lesion control and the survival prognosis of NSCLC patients
receiving CRT [He21].

5 Conclusion
With efficacy comparable to Osimertinib against T790M mutations (IC50 4.39 nM),
CM93 offers the most promise of the drugs discussed above. Although it only
weakly inhibits wt-EGFR (IC50 3300 nM), it is a selective inhibitor of EGFR
and effectively inhibits EGFRvIII (IC50 0.19 µM), the most common EGFR
mutation. The higher median survival of CM93-treated mice and the drug’s high
brain-to-plasma concentration suggest potentially improved prognosis and
efficacy in patients. The mice’s lack of skin lesions and body weight loss
suggests improved quality of life for patients, and the drug’s tolerability at
higher doses makes it a promising drug for the future.
Epitinib offers the least promise of the drugs listed. Despite its efficacy
and ability to penetrate the BBB, its toxicity and adverse side effects in
patients (rashes, diarrhea, elevated AST, hyperbilirubinemia) suggest limited
effectiveness. The two cases of progression in the liver and brain observed
in the clinical trial evaluating Epitinib lower the drug’s promise by adding
a risk factor. The scarcity of preclinical information available about this
drug also limits its promise, as it comes with many unknowns.
After CM93, BDTX-1535 and WSD0922-FU offer promise in terms of improving
patient quality of life. BDTX-1535 showed no body weight loss in vivo, and
WSD0922-FU showed no dose-related toxicities in in vivo studies. Both show
potential to overcome resistance to widely used tyrosine kinase inhibitors
(Osimertinib for BDTX-1535 and Cetuximab for WSD0922-FU).
WSD0922-FU’s low IC50 values for EGFRm and EGFRvIII inhibition show its
promise for inhibiting different types of EGFR mutations, while BDTX-1535’s
irreversible inhibition of various EGFR mutations offers similar promise. Both
can penetrate the blood-brain barrier and increase the median survival time
in vivo. Although the two drugs have similar efficacy and safety profiles, the
lack of information about them introduces many unknowns, giving them less
promise than CM93, for which more specific data are available on both reduced
toxicity in mice and inhibition values for various EGFR mutations/variants.
With similar efficacy to CM93, ERAS-801 shows great potential to penetrate the
BBB and inhibit EGFR with low IC50 values (1.1 nM against wild-type EGFR,
0.3 nM against EGFRvIII), suggesting strong efficacy. Its selectivity and lack
of interference with astrocytes suggest fewer negative effects on other parts
of the brain. However, the available data on its efficacy and selectivity,
while promising, do not address effects or potential toxicities in patients,
placing it below CM93 in terms of promise.
Similar to ERAS-801, Anlotinib shows strong efficacy through its potential to
induce G2/M phase arrest in cells, inhibit in vivo proliferation, and achieve
a progression-free survival time of 11.53 months, but shows no evidence of
potential to improve patient quality of life. Its high median survival time
suggests improvements in prognosis; however, if Anlotinib, like Epitinib,
comes with strong dose-related toxicities, those toxicities may hinder
improvements in a patient’s condition, limiting its promise.
Osimertinib, while offering strong efficacy through its high kb value (greater
than 10) and its low IC50 values (184 nM for wt-EGFR, 1 nM for T790M
mutations, and 1.25-3 µM in GBM cell lines), shows limited promise despite its
ability to increase the median survival time of mice by 16 days.
Osimertinib’s toxic and severe side effects, taken together with the results
from the clinical trial evaluating the drug’s effects on four patients,
suggest that the drug’s toxicities could hinder treatment and recovery. Its
negative effects lower patient quality of life, while drugs such as CM93 show
the potential to increase it.
Taken together, the preclinical and clinical profiles of these EGFR tyrosine
kinase inhibitors suggest that CM93 shows the most promise, followed by
BDTX-1535 and WSD0922-FU, ERAS-801, and Anlotinib. Epitinib and Osimertinib,
while efficacious, lower patient quality of life, giving them less promise.

Figure 1: Enter Caption

References
[AA14] O’Neill E. Abraham AG. Pi3k/akt-mediated regulation of p53 in
cancer. Biochem Soc Trans, 2014.

[Abo10] M. et al. Abousad. Clinical experience using osimertinib in patients
with recurrent malignant gliomas containing egfr alterations. J Cancer
Sci Clin Ther, 2010.

[AI20] Rayi A et al. Alnahhas I, Alsawas M. Characterizing benefit
from temozolomide in mgmt promoter unmethylated and methylated
glioblastoma: a systematic review and meta-analysis. Neuro-Oncology
Adv, 2020.

[Anl] Anlotinib combined with dose-dense temozolomide for the first
recurrent or progressive glioblastoma after stupp regimen.
ClinicalTrials.gov.

[AS19] Alzaharani AS. Pi3k/akt/mtor inhibitors in cancer: At the bench
and bedside. Semin Cancer Biol, 2019.

[BDT21] Bdtx-1535 goes after osimertinib resistance. Cancer Discov, 2021.

[BF15] Grant R Klein M. Boele FW, Rooney AG. Psychiatric symptoms in
glioma patients: from diagnosis to management. Neuropsychiatr Dis
Treat, 2015.

[BJ17] D’Antonio M et al. Benitez JA, Ma J. Pten regulates glioblastoma
oncogenesis through chromatin-associated complexes of daxx and
histone h3.3. Nat Commun, 2017.

[BZ18] Bakas S et al. Binder ZA, Thorne AH. Epidermal growth factor
receptor extracellular domain mutations in glioblastoma present
opportunities for clinical imaging and therapeutic development. Cancer
Cell, 2018.

[CA13] Colman H. Cohen AL, Holmen SL. Idh1 and idh2 mutations in
gliomas. Curr Neurol Neurosci Rep, 2013.

[CB19] Alexander-Bryant AA. Caffrey B, Lee JS. Vectors for glioblastoma
gene therapy: Viral non-viral delivery strategies. Nanometer, 2019.

[CD09] Atkins MB. Cho D, Mier JW. Pi3k/akt/mtor pathway: A growth and
proliferation pathway. in: Bukowski rm, figlin ra, motzer rj, eds. renal
cell carcinoma: Molecular targets and clinical applications. Humana
Press, 2009.
[CE12] Dogrusoz U et al. Cerami E, Gao J. The cbio cancer genomics por-
tal: an open platform for exploring multidimensional cancer genomics
data. Cancer Discov, 2012.
[Cha20] G. et al. Chagoya. Efficacy of osimertinib against egfrviii+
glioblastoma. Oncotarget, 2020.
[CI98] Vaillancourt MT et al. Cheney IW, Johnson DE. Suppression
of tumorigenicity of glioblastoma cells by adenovirus-mediated
mmac1/pten gene transfer. Cancer Res, 1998.
[CS16] Arcaro A. Crepo S, Kind M. The role of the pi3k/akt/mtor pathway
in brain tumor metastasis. J Cancer Metastasis Treat, 2016.
[CSY18] Huang C-C Huang E-Y. Chou S-Y, Yen S-L. Galectin-1 is a poor
prognostic factor in patients with glioblastoma multiforme after
radiotherapy. BMC Cancer, 2018.
[DA13] Palti Y. Davies AM, Weinberg U. Tumor treating fields: a new fron-
tier in cancer therapy. Ann N Y Acad Sci, 2013.
[DF15] Lemaire L Benoit J-P Lagrace F. Danhier F, Messaoudi K. Combined
anti-galectin-1 and anti-egfr sirna-loaded chitosan-lipid nanocapsules
decrease temozolomide resistance in glioblastoma: in vivo evaluation.
Int J Pharm, 2015.
[ea20] Wang Q. et al. Cm93, a novel covalent small molecule inhibitor tar-
geting lung cancer with mutant egfr. bioRxiv, 2020.
[ea22] Zhou Q. et al. Safety and efficacy of epitinib for egfr-mutant non-
small cell lung cancer with brain metastases: Open-label multicentre
dose-expansion phase ib study. Clin Lung Cancer, 2022.
[ER04] Buzzai M et al. Elstrom RL, Baur DE. Akt stimulates aerobic gly-
colysis in cancer cells. Cancer, 2004.
[ERAa] 10-k. sec.gov.
[ERAb] A study to evaluate eras-801 in patients with recurrent glioblastoma.
ClinicalTrials.gov.
[FC17] Osorio L et al. Fernandes C, Costa A. Current standards of care in
glioblastoma therapy. Codon Publications, 2017.
[FD17] Hopkins BD Bagrodia S-Cantley LC Abraham RT. Fruman DA,
Chiu H. The pi3k pathway in human disease. Cell, 2017.

[FD19] Alanhhas I et al. Fabian D, Guillermo Prieto Eibl MD. Treatment of
glioblastoma (gbm) with the addition of tumor-treating fields (ttf):
A review. Cancers, 2019.

[FQW13] Gustafson WC et al. Fan Q-W, Cheng CK. Egfr phosphorylates
tumor-derived egfrviii driving stat3/5 and progression in
glioblastoma. Cancer Cell, 2013.

[GG19] Stieber VW Wang BCM Garrison LPJ. Guzauskas GF, Pollom EL.
Tumor treating fields and maintenance temozolomide for newly
diagnosed glioblastoma: a cost-effectiveness study. J Med Econ, 2019.
[GN20] Mizzi S Meilak L Calleja N Zrinzo A Grech N, Dalli T. Rising
incidence of glioblastoma multiforme in a well-defined population.
Cureus, 2020.

[He21] Z. et al. He. Anlotinib combined with cranial radiotherapy for
non-small cell lung cancer patients with brain metastasis: A
retrospectively control study. Cancer Manag Res, 2021.
[HL06] Hsu AR Tse VCK Huo LC, Veeravagu A. Recurrent glioblastoma
multiforme: a review of natural management options. Neurosurg
Focus, 2006.
[HM00] Bötefür IC Holland JF Ohnuma T. Halasatch ME, Schmidt U.
Marked inhibition of glioblastoma target cell tumorigenicity in
vitro by retrovirus-mediated transfer of a hairpin ribozyme against
deletion-mutant epidermal growth factor receptor messenger rna. J
Neurosurg, 2000.
[HM05] Gorlia T et al. Hegi ME, Diserens A-C. Mgmt gene silencing and
benefit from temozolomide in glioblastoma. N Engl J Med, 2005.

[HM16] Ballon D et al. Hicks MJ, Chiuchiolo MJ. Anti-epidermal growth
factor receptor gene therapy for glioblastoma. PLoS One, 2016.
[HM19] Solyom EF Grant R. Hart MG, Grant GR. Biopsy versus resection
for high-grade glioma. Cochrane Database Syst Rev, 2019.
[K11] Shah K. In vivo imaging of the dynamics of different variants of egfr
in glioblastomas. Mol Biol, 2011.

[KC10] Chekenya M Krakstad C. Survival signalling and apoptosis resistance
in glioblastomas: opportunities for targeted therapeutics. Mol
Cancer, 2010.

[KD11] Ganslandt O Bauer M Buchfelder M Nimsky C Kuhnt D, Becker A.
Correlation of extent of tumor volume resection and patient survival
in surgery of glioblastoma multiforme with high-field intraoperative
mri guidance. Neuro Oncol, 2011.

[KE12] Twigger K et al. Karapanagitou EM, Roulstone V. Phase i/ii trial
of carboplatin and paclitaxel chemotherapy in combination with
intravenous oncolytic reovirus in patients with advanced malignancies.
Clin Cancer Res, 2012.

[KJ11] Maity A. Karar J. Pi3k/akt/mtor pathway in angiogenesis. Front
Mol Neurosci, 2011.

[LE19] Huang LE. Friend or foe-idh1 mutations in glioma 10 years on.
Carcinogenesis, 2019.

[LX19] et al. Liu X. The third-generation egfr inhibitor azd9291 overcomes
primary resistance by continuously blocking erk signaling in
glioblastoma. J Exp Clin Cancer Res, 2019.

[MB07] Cantley LC. Manning BD. Akt/pkb signaling: navigating
downstream. Cell, 2007.

[MC14] Ligon KL. Maire Cl. Molecular pathologic diagnosis of epidermal
growth factor receptor. Neuro Oncol, 2014.

[MR18] Smithberger E et al. McNeill RS, Stoorbant EE. Pik3ca missense
mutations promote glioblastoma pathogenesis, but do not enhance
targeted pi3k inhibition. PLoS One, 2018.

[N.04] Hay N Sonenberg N. Upstream and downstream of mtor. Genes Dev,
2004.

[Ni21] J. et al. Ni. Targeting egfr in glioblastoma with a novel
brain-penetrant small molecule egfr-tki. bioRxiv, 2021.

[NR94] Harmon RC et al. Nishikawa R, Ji XD. A mutant epidermal growth
factor receptor common in human glioma confers enhanced
tumorigenicity. Proc Natl Acad Sci USA, 1994.

[O.15] Gallego O. Nonsurgical treatment of recurrent glioblastoma. Curr
Oncol, 2015.

[OM03] Blessing T Kircheis R Wolscheck M Wagner E. Ogris M, Walker G.
Tumor-targeted gene therapy: strategies for the preparation of
ligand-polyethylene glycol-polyethyleneimine/dna complexes. J Control
Release, 2003.

[OM14] Synder LA et al. Oppenlander ME, Wolf AB. An extent of
resection threshold for recurrent glioblastoma and its risk for
neurological morbidity. J Neurosurg, 2014.

[Osia] 18f-fdg pet and osimertinib in evaluating glucose utilization in
patients with egfr activated recurrent glioblastoma. ClinicalTrials.gov.

[Osib] 18f-fdg pet and osimertinib in evaluating glucose utilization in
patients with egfr activated recurrent glioblastoma. ClinicalTrials.gov.
[Pet16] Ballard Peter. Preclinical comparison of osimertinib with other egfr-
tkis in egfr-mutant nsclc brain metastases models, and early evidence
of clinical brain metastases activity. American Association for Cancer
Research, 2016.
[PK05] Yla-Herttuala S. Pulkkanen KJ. Gene therapy for malignant glioma:
current clinical status. Mol Ther, 2005.
[QQ13] Liu X et al. Qi Q, He K. Disrupting the pike-a/akt interaction inhibits
glioblastoma cell survival, migration, invasion and colony formation.
Oncogene, 2013.
[RC14] Ebner FH et al. Roder C, Bisdas S. Maximizing the extent of resection
and survival benefit of patients in glioblastoma surgery: high-field
imri versus conventional and 5-ala assisted surgery. Eur J Surg Oncol
J Eur Soc Surg Oncol Br Assoc Surg Oncol, 2014.
[RR16] Kolarovszki B Richterová R. Genetic alterations of glioblastoma. in:
Agrawal a, ed. neurooncology. InTechOpen, 2016.
[SA05] Wagner E Levitzki A Shir A, Ogris M. Egf receptor-targeted synthetic
double-stranded rna eliminates glioblastoma, breast cancer, and
adenocarcinoma tumors in mice. PLOS Med, 2005.
[SA13] Karsy M. Sami A. Targeting the pi3k/akt/mtor signaling pathway in
glioblastoma: novel therapeutic agents and advances in understand-
ing. Tumor Biol, 2013.
[SA18] Luesakul U Muangsin N Neamati N. Shergalis A, Bankhead A 3rd.
Current challenges and opportunities in treating glioblastoma. Phar-
macol Rev, 2018.
[SF18] Assi HI. Saadeh FS, Mahfouz R. Egfr as a clinical marker in glioblas-
tomas and other gliomas. Int J Biol Markers, 2018.
[SO05] Heid I et al. Saydam O, Glauser DL. Herpes simplex virus 1 amplicon
vector-mediated sirna targeting epidermal growth factor receptor
inhibits growth of human glioma cells in vivo. Mol Ther, 2005.
[SR17] Kanner A et al. Stupp R, Taillibert S. Effect of tumor-treating fields
plus maintenance temozolomide vs. maintenance temozolomide alone
on survival in patients with glioblastoma: A randomized clinical trial.
JAMA, 2017.
[SY01] Hirose Y et al. Sonada Y, Ozawa T. Formation of intracranial tu-
mors by genetically modified human astrocytes defines four pathways
critical in the development of human anaplastic astrocytoma. Cancer
Res, 2001.

[TA20] Lopez GY Malinzak M Friedman HS Khrasaw M Tan AC, Ashley DM.
Management of glioblastoma: State of the art and future directions.
CA Cancer J Clin, 2020.

[TT18] Eltayeb M Tykocki T. Ten-year survival in glioblastoma. a systematic
review. J Clin Neurosci Off J Neurosurg Soc Australas, 2018.

[TZ14] Das S. Turkalp Z, Karamchandani J. Idh mutation in glioma: New
insights and promises for the future. JAMA Neurol, 2014.

[WC11] Crooks D Wilkins S Jenkinson MD. Walker C, Barobie A. Biology,
genetics and imaging of glial cell tumours. Br J Radiol, 2011.

[WG11] Binder ZA Gallia GL Riggins GJ. Weber GL, Parat M-O. Abrogation
of pi3kca or pik3r1 reduces proliferation, migration, and invasion in
glioblastoma multiforme cells. Oncotarget, 2011.

[WLB21] Gritsenko MA et al. Wang L-B, Karpova A. Proteogenomic and
metabolomic characterization of human glioblastoma. Cancer Cell,
2021.

[WM13] Perry JR Wick W. Weller M, Cloughesy T. Standards for care for
treatment of recurrent glioblastoma-are we there yet? Neuro Oncol,
2013.

[WP12] Reardon DA Ligon KL Alfred Yung WK Wen PY, Lee EQ. Current
clinical development of pi3k pathway inhibitors in glioblastoma.
Neuro Oncol, 2012.
[WS97] Li J et al. Wang SI, Puc J. Somatic mutations of pten in glioblastoma
multiforme. Cancer Res, 1997.
[XP22] Pan H Chen J Deng C. Xu P, Wang J. Anlotinib combined with
temozolomide suppresses glioblastoma growth via mediation of jak2/stat3
signaling pathway. Cancer Chemother Pharmacol, 2022.
[Zha21] C. et al. Zhang. Concurrent use of anlotinib overcomes acquired
resistance to egfr-tki in patients with advanced egfr-mutant non-small
cell lung cancer. Thorac Cancer, 2021.

[ZQ16] Yuan L Hua Y Wu Y-L. Zhou Q, Gan B. The safety profile of a
selective egfr tki epitinib (hmpl-813) in patients with advanced solid
tumors and preliminary clinical efficacy in egfrm+ nsclc patients with
brain metastasis. 2016.

[ZZ18] Uang J Wang Z Du G. Zhang Z, Yao L. Pi3k/akt and hif-1 signaling
pathway in hypoxia-ischemia (review). Mol Med Rep, 2018.

Detecting Distributed Denial Of Service Attacks
(DDoS) Using Machine Learning Models
∗†
Isha Singhal
October 12, 2023

Abstract
The digital landscape of today’s world is vulnerable to the widespread
threat of Distributed Denial of Service (DDoS) attacks. These attacks
have the potential to seriously damage businesses’ finances and reputa-
tions by interfering with the availability of internet services. Traditional
methods of DDoS mitigation, such as rule-based approaches, struggle to
keep up with the evolving nature of attacks. In this paper, I have trained
and tested several supervised machine learning algorithms for the
identification of DDoS attacks to determine the most effective one. I explore
DDoS attacks in depth, then obtain and prepare a dataset, using principal
component analysis (PCA) to reduce the number of features in the model
from 80 to 20 while preserving 90% of the variance in the dataset. By removing
unnecessary features, PCA allowed for higher model accuracy and
training speed. Overall, the Random Forest model trained with PCA had
the best results, obtaining 99.9% accuracy, precision, and recall. The pro-
posed approach exhibits encouraging results, demonstrating its potential
to improve DDoS attack detection and thus reinforce network security.

1 Introduction
Distributed Denial of Service (DDoS) attacks involve overloading a target sys-
tem with an excessive amount of traffic, preventing it from responding to valid
user requests. These attacks are executed from multiple computers or
machines, making them harder both to detect and to stop. They exploit
the fundamental need for the availability of online services and can lead
to severe operational, financial, and reputational consequences. Modern DDoS
attacks are sophisticated and large-scale, and conventional protection methods
like firewalls and intrusion detection systems often fall short in handling them.
The attackers are continuously changing their methods to bypass the defense
mechanisms put in place to prevent DDoS attacks, and researchers, in turn,
change their approach to prevent new attacks.

∗ Advised by: Dr. Maria Konte
† Student at Northville High School in Northville, Michigan

There are several reasons why DDoS attacks are difficult to defend against.
One reason is the scale and volume of the attack. These attacks involve a
volume of traffic that exceeds the target’s capacity, making it difficult to
filter out attack traffic from real traffic. Attacks can reach hundreds of
gigabits per second.
Another reason is that DDoS attacks are launched from a multitude of
sources, usually through the use of botnets. Botnets are networks of
malware-infected computers under the control of an attacker; because the
traffic comes from so many machines, it is difficult to identify and block
the attacking sources effectively.
DDoS attacks often imitate normal user behavior, making it difficult to figure
out whether the traffic is legitimate or not. Thus, it becomes hard to filter out
malicious traffic without blocking real users. Lastly, DDoS attackers find and
exploit weaknesses in network protocols, infrastructures, or applications, further
amplifying the detrimental effects of their attacks.
Machine learning techniques have emerged as a potential solution to detect
and prevent DDoS attacks. This is due to their ability to adapt to evolving
attack patterns. In this paper, I use various supervised machine learning
algorithms to detect DDoS attacks. Using a dataset containing both benign and
DDoS traffic, I trained and tested these models and then analyzed the results
to determine which model is most effective at identifying DDoS attacks. The
ultimate goal is to improve the detection of DDoS attacks in real-world,
real-time systems, where intrusion detection systems can proactively use the
appropriate model to detect and mitigate these attacks.

Figure 1: DDoS Attack [Clo23b]

2 Background
2.1 Types of DDoS Attacks
There are several kinds of DDoS Attacks [Imp23]. The most common are listed
below:

2.1.1 Volumetric Attacks


This kind of DDoS attack floods the target’s network with a large amount of
traffic, consuming the available bandwidth and overwhelming the network.

2.1.2 Protocol Attacks


Another kind is a protocol attack, which exploits vulnerabilities in network
protocols, such as TCP/IP, DNS, or ICMP, to disrupt the target’s services or
infrastructure.

2.1.3 Application Layer Attacks


Application layer DDoS attacks focus on certain applications or services and
exploit vulnerabilities in the application layer to use up server resources or
disrupt the functionality of the application.

2.1.4 Reflective Attacks


Lastly, in a reflective attack the attacker spoofs the source IP address of
its requests so that they appear to come from the target. The servers that
receive these requests send their replies, which are often larger than the
requests, to the target’s IP address, thus amplifying the attack traffic and
making it harder to fight against.

2.2 What Harm Do DDoS Attacks Cause


The primary harm DDoS attacks cause is disrupting the availability of services
or websites. When the target’s resources are overwhelmed, legitimate users are
unable to access the service or experience slow response times.
Businesses can experience financial loss from downtime due to DDoS attacks,
especially if they rely heavily on online operations for their revenue. In the
long term, frequent DDoS attacks can tarnish businesses’ reputations, causing
customers to lose trust in their ability to provide reliable services. For example,
in June 2023, Microsoft’s office suite—including Outlook and OneDrive—and
cloud computing platform were hit by DDoS attacks, leading to serious service
disruptions.

3 Dataset
3.1 What Does the Dataset Include
This dataset [Tal23], which shares its feature set with IDS2017, IDS2018, DoS2017,
and DDoS 2019 CIC NIDS datasets, includes data from various kinds of DDoS
attacks, such as DrDoS, UDP, LDAP, NetBIOS, MSSQL, and many others. It
lists 80 features from over 400,000 DDoS attacks in order to provide a large
dataset.
Using pandas, the data analysis library of Python, I read the dataset into a
pandas data frame with its read_csv() function. In this function, index_col is
an optional parameter that specifies the column(s) to be used as the index of
the resulting data frame. By default, index_col is set to None, meaning that a
new index will be created for the data frame.

Figure 2: Reading Dataset Into Pandas Dataframe

Figure 3: Listing of Top Five Rows
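
A minimal sketch of the loading step described above. The real dataset file is not reproduced here, so a three-row in-memory CSV stands in for it; the column names and values are hypothetical placeholders, not the dataset's actual contents.

```python
import io

import pandas as pd

# Tiny stand-in for the real CSV file (hypothetical columns and values).
sample_csv = io.StringIO(
    "Flow Duration,Fwd Packet Length Mean,Label,Class\n"
    "1200,60.0,DrDoS_UDP,Attack\n"
    "850000,412.5,Benign,Benign\n"
    "900,58.0,DrDoS_LDAP,Attack\n"
)

# index_col=None is the default: pandas creates a fresh 0..n-1 index
# instead of using one of the CSV columns as the index.
df = pd.read_csv(sample_csv, index_col=None)

print(df.head())  # first five rows (only three exist in this sample)
print(df.shape)   # (number of rows, number of columns)
```

With the real data, the `io.StringIO` object would be replaced by the path to the downloaded CSV file.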

3.2 Analyzing the Dataset


Originally, the dataset contained 80 columns, including the flow duration,
total forward and backward packets, and the average packet length. In this
dataset, the label column is called “Class”, whose value is either “Attack” or
“Benign”, and the remaining columns are the features. Some of the important
features and their significance are explained [Naj23] below:

3.2.1 Forward packet length mean (fw_pkt_l_avg):


DDoS attacks generate a significantly higher volume of network traffic than
normal traffic. By calculating the mean forward packet length, you can identify
abnormal patterns where the average packet size deviates from the expected
behavior seen in normal traffic.

3.2.2 Forward segment size average (fwd_seg_avg):
Some DDoS attacks involve packet fragmentation, where attackers split their
payloads into smaller segments to evade detection. Analyzing the forward seg-
ment size average can help identify unusually small or fragmented packets that
might be part of such attacks.

3.2.3 Initial forward window bytes (fw_win_byt):


The Initial Forward Window Bytes metric refers to the total number of bytes
sent from the client to the server during the initial connection establishment.
Unusually large values can be indicative of an attack, as DDoS attacks often
involve rapid and massive connection attempts to overwhelm the target server
or network. Some DDoS attacks, such as SYN flood attacks, involve flooding
the target server with a large number of connection requests. Monitoring the
number of bytes sent in the initial window in the forward direction can help
identify such flooding patterns and differentiate them from legitimate connection
attempts.

3.2.4 Initial backward window bytes (bw_win_byt):
Some DDoS attacks involve asymmetric communication, where the attacker
sends large amounts of data to the server during the initial connection phase to
consume server resources and establish connections without intending to
complete them. Monitoring the Initial Backward Window Bytes in tandem with
the Initial Forward Window Bytes can help detect asymmetric communication
attempts.

3.2.5 Forward segment size min (fw_seg_min):


DDoS attacks often generate traffic with distinct characteristics compared to
regular network traffic. A significant number of small segments can be abnormal
behavior in the context of legitimate connections, making it a potential indicator
of malicious activity.
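
One simple way to check whether features such as those above separate the two classes is to compare their averages per class. The sketch below is illustrative only: the five rows are invented toy values, and the column names are assumed spellings of the feature abbreviations.

```python
import pandas as pd

# Toy flow records; the values are invented purely for illustration.
df = pd.DataFrame({
    "fw_pkt_l_avg": [60.0, 58.0, 62.0, 410.0, 380.0],
    "fw_seg_min":   [8, 8, 8, 20, 32],
    "Class":        ["Attack", "Attack", "Attack", "Benign", "Benign"],
})

# Mean of every feature within each class. A large gap between the
# "Attack" and "Benign" rows hints that a feature may be informative.
per_class_mean = df.groupby("Class").mean()
print(per_class_mean)
```

On the real dataset the same `groupby` call would be applied to the full data frame, ideally alongside other summaries such as the median or standard deviation.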

3.3 Various Types of Attacks in the Dataset
Using the matplotlib Python library, I plotted the count of each type of
attack present in this dataset (stored in the column called “Label”) against
its name to see the spread of the various DDoS attacks.

Figure 4: Plotting Types of DDoS Attacks Against Their Count
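
A sketch of how such a count plot could be produced, assuming the attack type is stored in a column named "Label". The six label values are made up for illustration and do not reflect the real distribution.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical attack-type labels standing in for the "Label" column.
labels = pd.Series(
    ["DrDoS_UDP", "DrDoS_LDAP", "DrDoS_UDP", "Benign", "DrDoS_MSSQL", "DrDoS_UDP"],
    name="Label",
)

# Number of rows per attack type, most frequent first.
counts = labels.value_counts()

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(counts.index, counts.values)
ax.set_xlabel("Attack type")
ax.set_ylabel("Number of records")
ax.set_title("Records per attack type")
plt.setp(ax.get_xticklabels(), rotation=45, ha="right")
fig.tight_layout()
# In an interactive session, plt.show() would display the figure.
```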

3.4 Checking Imbalance in the Dataset


The next step was to check the dataset for class imbalance. The dataset is
imbalanced towards malicious traffic, with a 3:1 ratio of Attack to Benign
records, as shown in the plot below.

Figure 5: Plotting Attack Versus Benign
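
The imbalance can also be checked numerically. The sketch below assumes the label column is named "Class" and uses eight toy rows constructed to reproduce the 3:1 ratio mentioned above.

```python
import pandas as pd

# Toy labels built to mirror a 3:1 Attack-to-Benign ratio.
classes = pd.Series(["Attack"] * 6 + ["Benign"] * 2, name="Class")

counts = classes.value_counts()               # absolute counts per class
share = classes.value_counts(normalize=True)  # fraction of rows per class
ratio = counts["Attack"] / counts["Benign"]

print(counts)
print(share)
print(f"Attack:Benign ratio = {ratio:.1f}:1")
```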

4 Preparing the Dataset for Learning


Before training the models, I dropped the unwanted column called “Label”,
which simply identified the type of attack each row belonged to.

Figure 6: Deleting Unwanted Feature
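
A minimal sketch of dropping a column with pandas, assuming the column names "Label" and "Class" described in the text; the feature column and its values are hypothetical.

```python
import pandas as pd

# Toy frame with one feature, the attack-type column, and the class label.
df = pd.DataFrame({
    "fw_pkt_l_avg": [60.0, 412.5, 58.0],
    "Label": ["DrDoS_UDP", "Benign", "DrDoS_LDAP"],
    "Class": ["Attack", "Benign", "Attack"],
})

# drop() returns a new frame without "Label"; the result is reassigned to df.
df = df.drop(columns=["Label"])
print(df.columns.tolist())
```

Keeping "Label" would leak the answer to the model, since the attack type directly determines the value of "Class".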

4.1 Label Encoding


Before the machine learning models can be trained on the dataset, the labels
need to be encoded. Label encoding converts categorical columns (string
variables that take only a few distinct values) into numerical ones so that
they can be used by machine learning models that only accept numerical data.
It is an important data pre-processing step in a machine learning project.
Here, the values of the label column “Class” are converted from “Attack” to 0
and from “Benign” to 1 using the LabelEncoder class from the scikit-learn
library. I then separated the data into features and labels and stored them
in separate arrays, called X and y respectively.

Figure 7: Label Encoding [Chu23]

Figure 8: Columns in the Data Frame
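
The encoding and the feature/label split can be sketched as follows. The two feature columns and their values are hypothetical; only the "Class" column name and its two values come from the text.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data frame with two hypothetical features and the "Class" label.
df = pd.DataFrame({
    "flow_duration": [1200, 850000, 900, 400000],
    "fw_pkt_l_avg": [60.0, 412.5, 58.0, 390.0],
    "Class": ["Attack", "Benign", "Attack", "Benign"],
})

# LabelEncoder assigns integers to the sorted class names,
# so "Attack" becomes 0 and "Benign" becomes 1.
encoder = LabelEncoder()
df["Class"] = encoder.fit_transform(df["Class"])

# Features go into X, labels into y.
X = df.drop(columns=["Class"]).to_numpy()
y = df["Class"].to_numpy()

print(list(encoder.classes_))
print(X.shape, y.shape)
```

The fitted encoder also provides `inverse_transform`, which maps predictions back to the original strings.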

4.2 Random Undersampling of Data


Because this dataset is imbalanced, the resulting bias in the training data
can influence the machine learning algorithms, in the worst case leading them
to ignore the minority class entirely. In this case, the benign traffic would
be ignored, producing an incorrect model. To mitigate this problem, I randomly
resampled the training dataset. The method I used, random undersampling,
deletes randomly chosen rows from the majority class, leaving 97,831 rows for
each of the benign and DDoS classes.

Figure 9: Random Undersampling of Data [Bro21]
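
Random undersampling can be sketched with pandas alone, as below. This is a stand-in rather than the code used in the paper; the imbalanced-learn package offers a RandomUnderSampler class for the same purpose. The toy data and the random seed are assumptions.

```python
import pandas as pd

# Toy, already-encoded data: six Attack rows (0) and two Benign rows (1).
df = pd.DataFrame({
    "fw_pkt_l_avg": [60.0, 58.0, 62.0, 61.0, 59.0, 63.0, 410.0, 380.0],
    "Class": [0, 0, 0, 0, 0, 0, 1, 1],
})

# Size of the smallest class; every class is cut down to this many rows.
minority_size = df["Class"].value_counts().min()

# Draw a random subset of each class without replacement.
balanced = (
    df.groupby("Class")
      .sample(n=minority_size, random_state=42)
      .reset_index(drop=True)
)

print(balanced["Class"].value_counts())
```

Fixing `random_state` makes the sample reproducible; any integer seed works.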

4.3 Scaling the Data


The dataset contains features measured on very different scales. Such
differences can adversely affect modeling and bias the predictions towards
features with large numeric ranges, so the data should be scaled before
modeling. To accomplish this, I used standardization, a scaling method that
rescales each feature so that its observed values have a mean of 0 and a
standard deviation of 1. I applied StandardScaler from scikit-learn to the
randomly undersampled features array.
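
As a small illustration of standardization, the sketch below applies StandardScaler to a toy array whose two columns differ in scale by three orders of magnitude. The array values are invented for the example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two toy features on very different scales (values are illustrative only).
X_demo = np.array([
    [1.0, 1000.0],
    [2.0, 3000.0],
    [3.0, 5000.0],
    [4.0, 7000.0],
])

scaler = StandardScaler()
# Each column becomes (x - column mean) / column standard deviation.
X_scaled = scaler.fit_transform(X_demo)

print(X_scaled.mean(axis=0))  # approximately 0 for every column
print(X_scaled.std(axis=0))   # approximately 1 for every column
```

After scaling, both columns contribute on a comparable footing, which matters most for distance-based models such as KNN.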

Figure 10: Scaling of Data [Kha21b]

4.4 Performing PCA on Data


Many datasets, including the one used in this paper, have a large number of
variables or features; such data is known as high-dimensional data. High
dimensionality can lead to computational inefficiency and makes visualization
difficult. To mitigate this issue, I performed Principal Component Analysis
(PCA). PCA reduces the number of dimensions while retaining most of the
variance in the data, making the data more manageable and interpretable. This
makes it a useful tool for data analysis, dimensionality reduction, and
improving the performance of machine learning algorithms. I performed PCA on
the scaled feature array using the PCA class from the scikit-learn library.

Figure 11: Performing Principal Component Analysis on the Data [SK23a]

I chose to retain 90% of the variance in the data and used PCA to determine the
minimum number of components needed to do so.
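
One way to find the smallest number of components that keeps 90% of the variance is sketched below, on synthetic data rather than the paper's dataset. It shows two routes that should agree: a cumulative sum over explained_variance_ratio_, and passing a float between 0 and 1 as n_components so that scikit-learn picks the count itself. The synthetic data, the seed, and the variable names are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: 10 observed features driven by 3 latent factors plus small noise.
latent = rng.normal(size=(500, 3))
X_scaled = latent @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(500, 10))

# Route 1: fit all components and accumulate the explained variance ratios.
pca_full = PCA().fit(X_scaled)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
# Smallest number of components whose cumulative explained variance reaches 90%.
n_components = int(np.argmax(cumulative >= 0.90)) + 1

# Route 2: a float in (0, 1) asks scikit-learn to choose the number of components.
pca_90 = PCA(n_components=0.90, svd_solver="full")
X_reduced = pca_90.fit_transform(X_scaled)

print(n_components, pca_90.n_components_, X_reduced.shape)
```

Plotting the values in cumulative against the component index gives the curve from which the same cut-off can be read graphically.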

Figure 12: Retaining 90% Variance on the Data [SK23a]

A common method for determining the number of components to retain is a
graphical representation known as a scree plot. A scree plot is a simple line
plot that shows the eigenvalue (explained variance) of each principal
component. For this dataset, the scree plot shows that 20 components are needed
to retain 90% of the variance in the data, as shown below:

Figure 13

Figure 14: Graphing a Scree Plot [SK23b]

Figure 15

4.5 Splitting the Data


Next, the data must be split into training and testing sets. The training data
is used to fit the machine learning model so that it can differentiate a DDoS
attack from benign traffic, while the testing data is used to evaluate the
accuracy of the fitted model. I called the train_test_split function from
scikit-learn to perform the split. I chose a 70/30 train/test split because it
leaves enough data both to train the model and to evaluate it reliably.
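
A 70/30 split with train_test_split might look like the sketch below. The toy arrays are invented, and the random_state and stratify arguments are assumptions added for reproducibility and balanced classes; the paper does not state whether they were used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy balanced data: 100 rows, 2 features (values are illustrative only).
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([0, 1] * 50)

X_train, X_test, y_train, y_test = train_test_split(
    X_demo,
    y_demo,
    test_size=0.30,    # 30% of the rows are held out for testing
    random_state=42,   # assumed seed, so the split is repeatable
    stratify=y_demo,   # assumed: keep the class ratio equal in both parts
)
print(X_train.shape, X_test.shape)
```

With stratification, the held-out 30 rows contain the two classes in the same proportion as the full array.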

Figure 16: Splitting the Data [Bro20]

5 Processing the Dataset


I trained five machine learning models on the dataset: Logistic Regression,
Random Forest, K-Nearest Neighbors, AdaBoost and Decision Tree.
I created a common function called fit_and_score, which takes each of the five
initialized models and calls the scikit-learn method “fit” to train it on the
training dataset. I also used the cross_val_score function from scikit-learn,
which trains and tests a model over multiple folds of the dataset (here, 10
folds). Cross-validation gives a better picture of model performance over the
whole dataset than a single train/test split does [All22].
Then I used the roc_auc_score function from the sklearn.metrics module, which
calculates the area under the receiver operating characteristic (ROC) curve for
a model on the testing data. The ROC curve is a graphical representation of
the performance of a model as its discrimination threshold is varied. The area
under the ROC curve (AUC) measures how well the model can distinguish between
the positive and negative classes. Here, y_test holds the true values of y for
the testing data; comparing them with the predicted values helps determine how
accurate the model is [Clo23a].
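
The workflow described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code in Figure 17: the function signature, the result dictionary, the hyperparameters, and the synthetic data from make_classification are all assumptions, and class probabilities from predict_proba are used as the ROC-AUC scores, which the paper does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def fit_and_score(models, X_train, X_test, y_train, y_test, X_all, y_all, folds=10):
    """Fit each model, then report test accuracy, test ROC-AUC and mean CV accuracy."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # The predicted probability of class 1 serves as the ranking score for ROC-AUC.
        positive_scores = model.predict_proba(X_test)[:, 1]
        results[name] = {
            "accuracy": model.score(X_test, y_test),
            "roc_auc": roc_auc_score(y_test, positive_scores),
            # cross_val_score clones the model and refits it on every fold.
            "cv_accuracy": cross_val_score(model, X_all, y_all, cv=folds).mean(),
        }
    return results


# Synthetic stand-in for the preprocessed features (not the paper's data).
X_all, y_all = make_classification(
    n_samples=400, n_features=10, class_sep=2.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X_all, y_all, test_size=0.30, random_state=0, stratify=y_all
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": KNeighborsClassifier(),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

results = fit_and_score(models, X_train, X_test, y_train, y_test, X_all, y_all)
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Keeping the models in a dictionary means one loop produces comparable numbers for all five, which is what makes the later bar charts straightforward to draw.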

Figure 17: Training the Dataset Using Various Models

The salient features of the five machine learning models, and the reasons each
was selected for DDoS identification, are detailed below.

5.1 Logistic Regression


Logistic Regression provides interpretable results, as it estimates the
relationship between the input features and the probability of a DDoS attack
occurring. This transparency can be valuable for understanding the impact of
different features on the classification decision.
Logistic Regression is computationally efficient and relatively fast to train
and predict. It can handle large datasets without requiring significant
computational resources.
Due to its simplicity and efficiency, Logistic Regression can scale well to
large-scale DDoS detection scenarios.

5.2 Random Forest


Random Forest is an ensemble learning technique that combines multiple
decision trees, reducing the risk of overfitting and improving generalization
on unseen data. This property can be beneficial in handling the diverse and
complex patterns often found in DDoS attacks.
Random Forest can rank the importance of features in the dataset, providing
insights into which features contribute most to the distinction between
normal traffic and DDoS attacks. This information can be valuable for feature
engineering and selecting relevant features.

5.3 KNN
DDoS attacks can exhibit non-linear patterns in network traffic data. KNN is
a non-parametric algorithm, which means it can capture complex, non-linear
relationships between features and target classes, making it potentially
effective in detecting such patterns.
KNN can be used for anomaly detection since it classifies data points based
on the majority class of their k nearest neighbors. In the context of DDoS
detection, attacks are often considered anomalies compared to normal network
traffic, and KNN can help identify these anomalies.
KNN is relatively easy to implement and understand. It does not require a
complex training process, making it suitable for quick prototyping and
implementation.

5.4 AdaBoost
DDoS attacks can exhibit complex and non-linear patterns in network traffic
data. AdaBoost combines multiple weak classifiers, typically shallow decision
trees, into a more powerful model capable of capturing complex relationships in
the data.
AdaBoost assigns higher weights to training samples that earlier weak
classifiers misclassified, so later classifiers concentrate on the hardest
cases. With decision trees as weak classifiers, this can also focus the model
on the features most relevant for DDoS detection.
Like other boosting methods, AdaBoost is adaptive: each new weak classifier is
fitted to correct the errors of the previous ones. This may make it suitable
for handling dynamic DDoS attack patterns.

5.5 Decision Tree


Decision trees implicitly rank the importance of features in the dataset.
For DDoS attack detection, certain features or traffic characteristics may be
more relevant than others in identifying attacks. Decision trees tend to split
on these features near the top of the tree, which keeps the classification
efficient.
DDoS attacks often exhibit complex and non-linear relationships among various
features, such as traffic volume and packet rates. Decision trees can capture
these non-linear relationships by recursively partitioning the feature space
into regions that better differentiate between normal and attack traffic.
Using the common fit_and_score function, I calculated the accuracy and the
ROC-AUC score for each of the five supervised machine learning algorithms
described above.

5.6 Calculating Accuracy


Accuracy is a metric that shows how a model performs across all classes. It
is useful when all classes are of equal importance. It is calculated as the
ratio of the number of correct predictions to the total number of predictions.
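
As a worked example of this ratio, the sketch below counts correct predictions by hand and checks the result against accuracy_score from scikit-learn. The label and prediction lists are invented for illustration.

```python
from sklearn.metrics import accuracy_score

# Invented true labels and predictions for eight samples.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]

# Accuracy = correct predictions / total predictions.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

print(correct, len(y_true), accuracy)
print(accuracy_score(y_true, y_pred))
```

Six of the eight predictions match, so both computations give 0.75.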

Figure 18: Formula for Calculating Accuracy [Doc23]

Figure 19: Accuracy Values for Various Models

5.7 Calculating ROC-AUC Score
The ROC-AUC is the area under the ROC curve. It measures how well the model can
distinguish between two classes. The score ranges from 0 to 1, where 0.5
corresponds to random guessing. A score of 1 is the ideal case in which the
TPR (true positive rate) is 1 and the FPR (false positive rate) is 0, which
means the model correctly classifies all positives and negatives.
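
The range of the score can be illustrated with three small, invented score lists passed to roc_auc_score: one that ranks the classes perfectly, one that ranks three of the four positive/negative pairs correctly, and one that ranks every pair the wrong way round.

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]

# Every positive scores higher than every negative: perfect separation.
auc_perfect = roc_auc_score(y_true, [0.1, 0.2, 0.8, 0.9])

# One positive (0.35) scores below one negative (0.4): 3 of 4 pairs are ordered correctly.
auc_partial = roc_auc_score(y_true, [0.1, 0.4, 0.35, 0.8])

# Every positive scores lower than every negative: the ranking is fully inverted.
auc_inverted = roc_auc_score(y_true, [0.9, 0.8, 0.2, 0.1])

print(auc_perfect, auc_partial, auc_inverted)
```

The three results are 1.0, 0.75 and 0.0. A score below 0.5 means the model ranks the classes worse than chance, which in practice usually signals swapped labels.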

Figure 20: ROC-AUC Scores for Various Models

Comparing the ROC-AUC scores for the five models shows that Random Forest is
the best option among these classifiers, although all of them perform well.
Bar graphs comparing their accuracies and ROC-AUC scores are shown below:

Figure 21: Plot of Accuracy Values for Various Models

Figure 22: Plot of ROC-AUC Scores for Various Models

5.8 Plotting ROC Curve


To investigate further, I used a ROC curve to evaluate the performance of all
five models.
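
The points of a ROC curve can be computed with roc_curve from scikit-learn, as in the sketch below. The labels and scores are invented, and plotting is left out so that the example stays self-contained; passing fpr and tpr to any plotting library draws the curve.

```python
from sklearn.metrics import auc, roc_curve

# Invented labels and classifier scores for four samples.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# Each threshold yields one (false positive rate, true positive rate) point.
fpr, tpr, thresholds = roc_curve(y_true, scores)

# The area under the piecewise-linear curve through those points is the ROC-AUC.
area = auc(fpr, tpr)

for f, t in zip(fpr, tpr):
    print(f"FPR={f:.2f}  TPR={t:.2f}")
print("AUC:", area)
```

The curve always starts at (0, 0) and ends at (1, 1); the closer it bends towards the top-left corner, the larger the area.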

Figure 23: Components of ROC Curve [Sch23]

ROC curves for the five machine learning models are displayed below. The
models are nearly indistinguishable by this measure: the area under the ROC
curve is almost 1 for all of them.

Figure 24: ROC Curve for Logistic Regression

Figure 25: ROC Curve for Random Forest

Figure 26: ROC Curve for KNN

Figure 27: ROC Curve for AdaBoost

Figure 28: ROC Curve for Decision Tree

5.9 Creating Confusion Matrix
Furthermore, I checked the effectiveness of the models using a confusion
matrix, a table showing the four possible combinations of predicted and actual
values.

Figure 29: Confusion Matrix [Nar21]

The lower the values in the false positive and false negative blocks, the better
the effectiveness of the model. The higher the values in the true positive and
true negative blocks, the better the effectiveness of the model.
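
The four blocks can be extracted with confusion_matrix from scikit-learn, as sketched below on invented labels. One point is worth checking when reading the figures: scikit-learn treats label 1 as the positive class by default, so if the encoding from Section 4.1 (Attack = 0, Benign = 1) is used unchanged, “positive” here means Benign.

```python
from sklearn.metrics import confusion_matrix

# Invented true labels and predictions for eight samples.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]

# Rows are actual classes, columns are predicted classes, both in the order [0, 1].
matrix = confusion_matrix(y_true, y_pred)

# For binary labels, ravel() flattens the table to TN, FP, FN, TP.
tn, fp, fn, tp = matrix.ravel()
print(matrix)
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```

In this toy case three samples of each class are classified correctly and one of each is misclassified.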

Figure 30: Confusion Matrix for the Logistic Regression Model

Figure 31: Confusion Matrix for the Random Forest Model

Figure 32: Confusion Matrix for the KNN Model

Figure 33: Confusion Matrix for the AdaBoost Model

Figure 34: Confusion Matrix for the Decision Tree Model

5.10 Creating Classification Report
I also evaluated the performance of these models using the classification
report. A classification report is a performance evaluation tool in machine
learning that shows the precision, recall, F1 score, and support of a trained
classification model.
The metrics in a classification report are described below:

Figure 35: Components of Classification Report [Kha21a]

Precision is the number of true positives divided by the sum of the true
positives and false positives.
Precision = TruePositives / (TruePositives + FalsePositives)
Recall is the number of true positives divided by the sum of the true positives
and false negatives.
Recall = TruePositives / (TruePositives + FalseNegatives)
The F1 score is twice the product of precision and recall divided by their sum.
F1 score = 2 * (precision * recall) / (precision + recall)
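
The three formulas above can be checked on a small worked example. The counts below (90 true positives, 10 false positives, 30 false negatives) are invented; the label arrays are constructed to match those counts so that the hand calculation can be compared with the scikit-learn functions.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Invented counts for illustration.
tp, fp, fn = 90, 10, 30

precision = tp / (tp + fp)                          # 90 / 100
recall = tp / (tp + fn)                             # 90 / 120
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

# Label arrays with exactly those counts: 120 actual positives, 10 actual negatives.
y_true = [1] * 120 + [0] * 10
y_pred = [1] * 90 + [0] * 30 + [1] * 10

print(precision, recall, round(f1, 4))
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred),
      round(f1_score(y_true, y_pred), 4))
```

Precision is 0.9 and recall is 0.75 here; the F1 score lies between the two and closer to the lower one, which is why it penalizes a model that does well on only one of them.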
The classification reports for the five supervised machine learning models
are:

Figure 36: Classification Report for Logistic Regression Model

Figure 37: Classification Report for Random Forest Model

Figure 38: Classification Report for KNN Model

Figure 39: Classification Report for AdaBoost Model

Figure 40: Classification Report for Decision Tree Model

The closer the values of precision, recall, and F1 score are to 1, the more
effective the model.
Finally, I evaluated the performance of all five supervised machine learning
models using the precision-recall curve.
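
The points of a precision-recall curve can be obtained with precision_recall_curve from scikit-learn, as in the sketch below. The labels and scores are invented, and plotting is omitted; recall goes on the horizontal axis and precision on the vertical axis when the curve is drawn.

```python
from sklearn.metrics import precision_recall_curve

# Invented labels and classifier scores for four samples.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# One (precision, recall) pair per threshold, plus a final point appended by scikit-learn.
precision, recall, thresholds = precision_recall_curve(y_true, scores)

for p, r in zip(precision, recall):
    print(f"precision={p:.2f}  recall={r:.2f}")
```

The last point is always precision 1 and recall 0 by convention, so the precision and recall arrays are one element longer than the thresholds array.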

Figure 41: Precision Recall Curve for the Logistic Regression Model

Figure 42: Precision Recall Curve for the Random Forest Model

Figure 43: Precision Recall Curve for the KNN Model

Figure 44: Precision Recall Curve for the AdaBoost Model

Figure 45: Precision Recall Curve for the Decision Tree Model

6 Conclusion and Further Analysis


I achieved 99% accuracy for all of the supervised machine learning models.
Overall, I recommend Random Forest over Decision Tree based on the confusion
matrices computed from the dataset. As shown in Figures 18-45 above, I used
various performance indicators to compare the models, including confusion
matrices, precision-recall curves, ROC curves, and ROC-AUC scores. The three
most commonly used performance indicators are accuracy, recall, and precision;
the F1 score provided additional information on model performance.
It must be noted that this study was conducted on the types of DDoS attacks
currently known. If new types of DDoS attacks emerge in the future, this
analysis will have to be repeated, as Random Forest may not detect the new
types as well. Additionally, my research shows that supervised learning models
are effective in identifying DDoS attacks, but they need pre-labeled datasets
for training, which are unavailable for not-yet-known attacks.
Because new types of DDoS attacks keep appearing, there are large differences
between known, lab-based training datasets and real-time DDoS attacks. This
can cause the false-negative rate to be higher than anticipated. Further
research is needed on accurately detecting DDoS attacks under real scenarios
and with different types of test datasets.
Aside from conducting research to find the limitations of existing methods
and datasets, I propose that the focus should also be on developing new types
of DDoS attacks to anticipate potential future attacks.

References
[All22] Stephen Allwright. Using cross_val_score in sklearn, simply explained.
https://stephenallwright.com/cross_val_score-sklearn/, 2022.
[Bro20] Jason Brownlee. Train-test split for evaluating machine learning
algorithms. https://machinelearningmastery.com/
train-test-split-for-evaluating-machine-learning-algorithms, 2020.
[Bro21] Jason Brownlee. Random oversampling and undersampling for
imbalanced classification. https://machinelearningmastery.com/
random-oversampling-and-undersampling-for-imbalanced-classification/, 2021.
[Cho23] Jean-Christopher Chouinard. How to use classification report in
scikit-learn (python). https://www.jcchouinard.com/
classification-report-in-scikit-learn/, 2023.
[Chu23] Aakarsha Chugh. Label encoding in python.
https://www.geeksforgeeks.org/ml-label-encoding-of-datasets-in-python/, 2023.
[Clo23a] Saturn Cloud. How to improve your model’s performance with sklearn
roc_auc_score. https://saturncloud.io/blog/
how-to-improve-your-models-performance-with-sklearn-rocaucscore, 2023.
[Clo23b] Cloudflare. What is a distributed denial-of-service (ddos) attack?
https://www.cloudflare.com/learning/ddos/what-is-a-ddos-attack/, 2023.
[Clo23c] Cloudflare. What is an application layer ddos attack?
https://www.netscout.com/what-is-ddos/application-layer-attacks, 2023.
[Doc23] Hasty.Ai Documentation. Accuracy. https://hasty.ai/docs/
mp-wiki/metrics/accuracy, 2023.
[Goo23a] Google. Classification: Roc curve and auc | machine learning |
google for developers. https://developers.google.com/machine-learning/
crash-course/classification/roc-and-auc#:~:text=An%20ROC%20curve%20, 2023.
[Goo23b] Google. Classification: Roc curve and auc | machine learning |
google for developers. https://developers.google.com/machine-learning/
crash-course/classification/roc-and-auc, 2023.

[Hui23] Purva Huilgol. Precision and recall: Essential metrics for machine
learning (2023 update). https://www.analyticsvidhya.com/blog/
2020/09/precision-recall-machine-learning, 2023.
[Imp23] Imperva. Ddos attack types & mitigation methods: Imperva. https:
//www.imperva.com/learn/ddos/ddos-attacks, 2023.
[Kha21a] Aman Kharwal. Classification report in machine learning:
Aman kharwal. https://thecleverprogrammer.com/2021/07/07/
classification-report-in-machine-learning, 2021.
[Kha21b] Aman Kharwal. Standardscaler in machine learning: Aman
kharwal. https://thecleverprogrammer.com/2020/09/22/
standardscaler-in-machine-learning, 2021.
[Man20] Sanchita Mangale. Scree plot. https://sanchitamangale12.
medium.com/scree-plot-733ed72c8608, 2020.
[Mik19] Bartosz Mikulski. Pca-how to choose the number of
components? https://www.mikulskibartosz.name/
pca-how-to-choose-the-number-of-components, 2019.
[Naj23] Mohammad Najafimehr et al. Ddos attacks and machine-learning-
based detection methods: A survey and taxonomy. https://
onlinelibrary.wiley.com/doi/full/10.1002/eng2.12697, 2023.
[Nar21] Sarang Narkhede. Understanding confusion
matrix. https://towardsdatascience.com/
understanding-confusion-matrix-a9ad42dcfd62, 2021.
[Net23] Palo Alto Networks. What is a denial of service at-
tack (dos)? https://www.paloaltonetworks.com/cyberpedia/
what-is-a-denial-of-service-attack-dos, 2023.
[One23] OneLogin. What is a ddos attack: Types, prevention & remediation.
https://www.onelogin.com/learn/ddos-attack, 2023.
[Pan22] Pankaj. Numpy.cumsum() in python. https://www.digitalocean.
com/community/tutorials/numpy-cumsum-in-python, 2022.
[Sch23] Frank Schoonjans. Roc curve analysis. https://www.medcalc.org/
manual/roc-curves.php, 2023.
[SK23a] Paula Villasante Soriano and Cansu Kebabci. Principal com-
ponent analysis (pca) in python: Sklearn example. https://
statisticsglobe.com/principal-component-analysis-python,
2023.
[SK23b] Paula Villasante Soriano and Cansu Kebabci. Scree plot for
pca explained: Tutorial, example & how to interpret. https:
//statisticsglobe.com/scree-plot-pca, 2023.

[Ste20] Doug Steen. Precision-recall curves. https://medium.com/
@douglaspsteen/precision-recall-curves-d32e5b290248, 2020.

[Tal23] Md Alamin Talukder. Cic-ddos2019 dataset. https://data.
mendeley.com/datasets/ssnc74xm6r/1, 2023.
[Zen23] Zenarmor. Dos and ddos attacks. what are their differences?
https://www.zenarmor.com/docs/network-security-tutorials/
dos-vs-ddos-attacks, 2023.

Can Behavioural Economics Help
Explain Gender Disparities in Labour
Markets?

Jumaina Fatima
October 15, 2023

Abstract
The presence of pervasive gender disparities in labour market outcomes
(promotion, pay, and hiring) poses a threat to the functioning of current
markets. This paper examines the potential of behavioural economics to
bridge these disparities. Drawing on insights from the subject, the paper
argues that gender disparities in labour market outcomes can result from a
variety of behavioural biases and heuristics that lead to sub-optimal
decision-making.
Using the behavioural lens, the paper identifies and focuses on three
cognitive biases: the endowment effect, the overconfidence bias and the
status quo bias. The paper intends to propose interventions and policy
modifications to overcome such ubiquitous biases and correct the present
labour market climate.

1 INTRODUCTION
Two hundred eighty-six years is the daunting estimate given by the World
Economic Forum’s Global Gender Gap Report 2021 for women to achieve economic
parity with men (World Economic Forum, 2021). This staggering and stagnant
figure for labour outcomes re-emphasizes the irrationality of the
ever-widening gender gap.
Evidence from across the field shows that women in senior leadership roles
correlate strongly with firm growth, market share, revenues, return on
investment, productivity and profitability. Failing to hire and promote women
is irrational and inefficient (Martha Fineman & Terence Dougherty, 2005). Among
companies surveyed by the ILO that track the impact of gender diversity in
management, over two-thirds report profit increases of 5-20% (ILO, 2019).
Catalyst points out that companies with women on their boards outperformed
companies with zero women board directors by 84% on return on sales, 60% on
return on invested capital, and 46% on return on equity (Catalyst, 2021).
Enterprises also report improved business outcomes from gender diversity
initiatives: over 60% report higher profitability and productivity, 56.8%
report increased ability to attract and retain talent, 54.4% report greater
creativity, innovation and openness, 54.1% say that their company’s reputation
has been enhanced, and 36.5% are better able to gauge consumer interest and
demand (ILO, 2019).

∗ Advised by: Edoardo Gallo

These statistics make clear that hiring, promoting and retaining more women in
the existing competitive global market is of the utmost importance. It is
imperative for organisations to view gender balance as a bottom-line issue,
not just a human resource issue (Deborah France-Massin, Director of the
ILO Bureau for Employers’ Activities, 2019).
Additionally, the International Labour Organization estimates that closing
the gender gap in participation by 25% before 2025 could increase global GDP
by US$5.3 trillion. Every 1% of female employment growth is associated with,
on average, annual GDP growth of 0.16% (Women in Business and Management:
The Business Case for Change. ILO, 2019).

Figure 1: Labour force participation in various countries

Among those aged 25 to 54, the gender gap in labour force participation
stood at 29.2 percentage points in 2022, with female participation at 61.4%
and male participation at 90.6% (ILO, 2022). Globally, only about 18% of firms
have a female manager (Esteban Ortiz-Ospina and Max Roser, 2018), and overall
women hold 19.7% of board seats worldwide, with only 6.7% chairing boards
(Deloitte, 2021).
These arguments and statistics establish two vital points. Firstly, the growth
of women’s labour force participation has had a significant impact on
organisations, contributing to increased profitability and global GDP.
Secondly, labour markets, organisations and economies that hinder the
advancement of women through unstructured and loosely targeted policies on
gender parity appear to be exhibiting irrational behaviour.

Figure 2: Source: Global Gender Gap Report, 2022.

The rethinking of neoclassical economic assumptions, such as rationality, is
the central theme of behavioural economics; using a behavioural lens to
understand and analyse this irrationality is essential to curbing the problem
at hand. This paper uses behavioural economics to provide an in-depth analysis
of how gender disparities in labour market outcomes for women have come to
exist. The first section reviews the existing literature on behavioural
economics, cognitive biases and heuristics. Next, the paper identifies the
behavioural biases and heuristics commonly seen in the labour market that
create gender disparities, namely the endowment effect, the overconfidence
bias, and the status quo bias. Finally, the paper uses insights from
behavioural economics to suggest behavioural nudges and interventions to close
this gender gap in labour markets.

2 LITERATURE REVIEW
2.1 Behavioural Economics: The Psychological Branch of Economics
The long-standing assumption of human rationality, made by neoclassical
economists to simplify their models, was challenged by Daniel Kahneman and Amos
Tversky, leading to the foundation of a much more complex and comprehensive
branch of study. An amalgamation of psychology, neuroscience, and economics,
behavioural economics is an attempt to put the study of economic
decision-making onto a firm scientific basis (David Orrell, 2021).
“Behavioural Economics is the study of how people make decisions, not how they
should make decisions” (Thaler, 2015). This branch added a new dimension to
the study of economics, expanding our understanding of how people take
decisions and make choices. Human behaviour, contrary to neoclassical economic
theory, is not always motivated by rationality. The human mind is subject to
certain cognitive heuristics and biases that limit our decision-making. The
gender disparities that we see today in the labour market are evidently
irrational, as the data presented earlier shows. Applying a behavioural
economic approach helps us discern the outcomes that cognitive biases produce
in women’s labour force participation.

3 BEHAVIOURAL ECONOMICS ANALYSIS


3.1 The Endowment Effect
People often demand much more to give up an object than they would be willing
to pay to acquire it (Thaler, 1980).
Kahneman, Knetsch, and Thaler (1990) ran a new series of experiments to
determine whether the endowment effect survives when subjects face market
discipline and have a chance to learn. In brief, the participants were divided
into sellers and buyers, who were asked to set a selling price and a buying
price respectively. The experiment revealed that the median selling price was
almost twice the median buying price and that the volume of trade was less
than half of that expected (Kahneman, Knetsch, and Thaler, 1991). Kahneman and
Thaler’s experiments with something as trivial as brief ownership of coffee
mugs show that humans resist giving up their entitlements even in rational
exchanges, at least in part because of loss aversion as well as status quo
bias (Heckbert, 2018). The main effect of endowment is not to enhance the
appeal of the good one owns, only the pain of giving it up (Kahneman, Knetsch,
and Thaler, 1991).
If something as insignificant as the ownership of a mug raises such fervent
feelings of endowment, one can only imagine what it would be like to give up
leadership roles that come with privilege, wealth and advancement
opportunities. Historically, under the status quo, white males have been more
likely to occupy or advance into leadership roles. The endowment effect may
therefore help explain why progress toward gender parity in leadership is
stalling or has stalled (Heckbert, 2018).
The endowment effect and bounded rationality seem to be interconnected in
this instance. Some people lack awareness of the extent and severity of the
gender gap in advancement. This bounded rationality impedes their capacity to
question the status quo or challenge the endowment effect (Heckbert, 2018).
Misinformed people might take the pursuit of gender parity as a fight that has
been started against them. This can lead men to show reluctance and insecurity
when it comes to hiring or promoting more women into roles that men have
traditionally held. McKinsey and Company’s 2017 report finds that women are 18
per cent less likely to be promoted than their male peers globally, as women
fall behind early and lose ground with every step of the pipeline (Krivkovich
et al., 2017). “Some men even feel that gender diversity efforts disadvantage
them: 15 per cent of men think their gender will make it harder for them to
advance, and white men are almost 50 per cent more likely than men of colour
to think this” (Women in the Workplace, 2017).
Men may feel threatened by efforts towards gender parity out of ignorance;
their fear of being replaced and of facing losses allows implicit biases to
creep in and affect outcomes. These implicit biases have a significant impact
on outcomes for women because men are more prominently represented in higher
managerial and leadership roles, accounting for 62% of C-suite roles and about
70% of senior management. An instance of implicit bias creeping into decisions
and strengthening the feeling of endowment is when a manager views a male
employee who is assertive and self-promoting as having “leadership potential”,
while viewing a female employee who exhibits the same behaviours as “pushy” or
“abrasive” (Rudman, L. A., & Glick, P., 2010). Similarly, research on gender
and leadership has found that female leaders who attempt to establish their
authority in a traditionally masculine (e.g., authoritative or directive)
manner are evaluated more harshly than their male peers (Eagly, Makhijani, &
Klonsky, 1992). Perhaps in response to this resistance, women have tended to
develop a more participative leadership style, which corresponds with
prescriptive gender roles for women (Eagly & Johnson, 1990) and is more
effective for them than traditionally male leadership styles (Eagly,
Johannesen-Schmidt, & van Engen, 2003; Eagly, Karau, & Makhijani, 1995).
Men often see the issue of the gender gap framed as a need to protect
themselves against losses. Particularly when added to the other two bounds,
namely bounded rationality and bounded willpower, the cumulative effect of
these cognitive and behavioural factors underscores how crucial it is to
intervene to correct this gender gap and the perceptions and behaviours
surrounding it (Heckbert, 2018).

3.2 The Overconfidence Bias


“Overconfidence bias refers to the tendency of individuals to overestimate their
abilities or the accuracy of their beliefs and judgments. This bias can lead
individuals to take on more risk than is warranted, to make overly optimistic
predictions, and to overestimate their performance on tasks” (Kahneman, D., &
Tversky, A., 1973).
In light of the social norms and stereotypes surrounding the behaviour expected
of women, women show a stark overconfidence bias in undervaluing their work and
skills. Studies have shown that, compared to men, women are less likely to
initiate salary negotiations and ask for higher pay (e.g., Small et al., 2007).
According to gender role congruence theory, the female gender role is
inconsistent with the negotiator role, so women may believe that asking for
higher pay would violate the social expectation that “good girls don’t ask”
(Babcock and Laschever, 2003; Eagly and Sczesny, 2019).
Analysing how this bias works becomes imperative when discussing the gender
pay gap. Globally, women made only $0.68 for every dollar men made in 2020
(World Economic Forum, 2020). Women’s reluctance to ask for higher pay, due to
the undervaluation of their skills, results in low starting salaries. The
gender gap in starting salaries is then amplified over the years through pay
raises that mostly use starting salaries as a base, and consequently leads to
larger gender pay gaps in the long run.
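
The amplification can be illustrated with a small calculation. The sketch below is purely hypothetical: the starting salaries of 50,000 and 45,000, the 3% annual raise, and the 20-year horizon are invented for illustration and do not come from any of the cited studies. It assumes both employees receive the same percentage raise each year, calculated on their current pay.

```python
def salary_path(start, annual_raise, years):
    """Salary at the end of each year when every raise is a percentage of current pay."""
    return [start * (1 + annual_raise) ** year for year in range(years + 1)]


# Hypothetical starting salaries and raise rate, chosen only for illustration.
men = salary_path(50_000, 0.03, 20)
women = salary_path(45_000, 0.03, 20)

# Difference in pay between the two paths in each year.
gaps = [m - w for m, w in zip(men, women)]

print("Gap at hiring:", round(gaps[0]))
print("Gap after 20 years:", round(gaps[-1]))
print("Pay ratio after 20 years:", round(women[-1] / men[-1], 2))
```

Under these assumptions the gap in pay grows from 5,000 at hiring to roughly 9,000 after twenty years, even though the ratio between the two salaries stays at 0.9 throughout. A lower starting salary therefore costs more in absolute terms every year, without any further difference in treatment.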
Previous studies on gender differences in initiating salary negotiation find
that, compared to men, women are more likely to feel anxious and less entitled
during negotiations (Bowles, Babcock and McGinn, 2005). If the expected
economic gains were large enough to outweigh the social costs, then the rational
course of action would be to initiate negotiations in spite of those costs
(Bowles, Babcock, and Lai, 2007). Two things can be inferred from that study,
“Social Incentives for Gender Differences in the Propensity to Initiate
Negotiations: Sometimes It Does Hurt to Ask”. Firstly, women are overconfident
in undervaluing their potential economic gains, which could be a result of
reluctance to defy social norms. Secondly, women’s greater reluctance than
men’s to initiate negotiations over resources, such as compensation, may be
traced to the higher social costs that they face when doing so. A possible
explanation for the lack of entitlement women feel during a salary negotiation
is the serious undervaluation of their work: they tend to be overconfident in
their low estimate of their work’s worth. This cognitive bias, like many other
biases, stems from social norms and stereotyping. Social norms play essential
roles in people’s economic behaviours (Kray, Galinsky, and Thompson, 2002; Li,
De Oliveira, and Eckel, 2017). Traditional social norms prescribe that women
are generally expected to demand and accept less and give away more (Bowles,
Babcock, and Lai, 2007).
Individuals who believe their performance is better than others’ are more
likely to ask for a pay raise. Equity theory suggests that people compare their
own input/outcome ratios with others’ input/outcome ratios, and try to restore
the balance if their ratios are higher than others’ (Huseman, Hatfield and
Miles, 1987). Rationally, women should be negotiating for their work’s worth,
but overconfidence bias, combined with social norms, becomes a cognitive
hindrance to doing so. Identifying this bias and using behavioural
interventions to correct it can result in a significant improvement in labour
market outcomes for women by reducing the gender pay gap.

3.3 The Status Quo Bias


Status quo bias refers to the tendency to favour or maintain the current state of
affairs, even in the presence of good reasons to change it. This bias can affect
human behaviour in various situations, sometimes with beneficial effects but
often with deleterious ones. It is often difficult to determine whether a particular
status quo is worth preserving or whether a change would be beneficial, and there
is no simple answer to this question. However, it is clear that people have a
strong preference for the current state of affairs, and that this preference can
have important consequences for decision-making and behaviour (Samuelson and
Zeckhauser, 1988).
This bias is most at play in women’s labour outcomes when it comes to their
promotion to higher-level leadership roles. According to a
Peterson Institute for International Economics study, only 5% of CEOs in the
S&P 500 are women. Similarly, a 2019 survey by McKinsey & Company found
that women hold only 21% of C-suite positions in the United States.
According to social role theory, women face stereotyping perceptions because
of their multiple social roles. The social role theory examines the causes of sex
differences and similarities in social behaviours. It also argues that gender di-
vision of labour leads to the gender stereotypes which characterise a society
(Eagly, 1987). The inherent status quo becomes the ideologies of patriarchy
and separate spheres, leading to the underrepresentation of women in leadership
roles that they are more than capable of holding. The failure of organisations
to move away from these outdated notions, and to take into account the
multifaceted roles that women play in societies, not only imposes significant
economic and social costs on women but also results in a huge loss of economic
opportunity for organisations.
If decision-makers prefer to promote individuals who resemble those who have
been successful leaders in the past (i.e., the status quo), this can lead to a
disproportionate underrepresentation of women in leadership positions and
leave little to no room for women to be considered for higher-level promotions.
Merit-based pay and promotion programs or meritocracy have long been
used by organisations as affirmative action for diversity policies. Meritocracy has
been culturally accepted as a fair and legitimate distributive principle in many
advanced capitalist countries and organisations (Scully, 1997, 2000; McNamee
and Miller, 2004). However, a study by Castilla and Benard (2010) found that
companies that emphasised meritocracy in their promotion decisions actually
exhibited more significant bias against women and minority groups than those
that did not. The study establishes that managers making decisions on behalf
of organisations that emphasise meritocracy ironically showed more significant
bias in favour of men over equally performing women. This happens in part
because the culture of meritocracy unintentionally triggers managers’
stereotypes and other schemata while they make employment decisions (Swidler,
1986; DiMaggio, 1997). This paradoxical finding suggests that the emphasis on
meritocracy may actually reinforce the status quo bias. When an organisation
is explicitly presented as meritocratic, individuals in managerial positions
favour a male employee over an equally qualified female employee by awarding
him a larger monetary reward (Castilla and Benard, 2010).
The patriarchal notion of men as the ultimate leaders of society, and
consequently of businesses, has set in place a status quo bias that women must
now fight in order to climb the corporate ladder to higher-level positions,
such as those of directors or CEOs. Female candidates do not resemble the
stereotypical notion of directors and leaders. Schein’s research has shown
that, in the UK, Germany, China, Japan and the US, men associate the attributes
needed for leadership with men but not with women (Schein et al., 2000). This
was dubbed the “think manager, think male” phenomenon. These ideologies are
also repeatedly seen in the form of glass ceilings. The glass ceiling refers to
barriers, often invisible, that obstruct women and other minority groups as
they try to rise to upper management positions (Morrison, 1980).

4 BEHAVIOURAL INTERVENTIONS
Keeping in mind the heuristics identified as playing a significant role in the
gender disparities of labour markets, this section suggests behavioural
interventions that could curb, or at least mitigate, the problem at hand.
Firstly, this paper suggests the use of explicit rules during the hiring process.
As discussed in section 3 of the paper, women hesitate to lead or even initiate
negotiations for their salaries. They fear that openly and boldly asking for
their work’s worth would make them seem less likeable, less hirable and rude. An
experimental study (Lin Xiu et al., 2022) examined how explicit pay raise rules
affect men’s and women’s initiations of salary negotiation differently. Their
results showed that when pay raise rules are explicitly stated, women are less
reluctant to ask for a pay raise. The explicit rule effect seems to work well,
particularly for women with above-average task performance. A clearly stated
rule frees women from concerns that their decision to ask might be perceived
as socially less acceptable and that starting salary negotiations conflicts
with their internalised social norms (Lin Xiu et al., 2022). This would let
women infer the value of their work without the constraints of social and gender
norms. By using the behavioural tactic of framing and explicitly stating that
wages are up for negotiation, organisations can not only empower women to
start salary negotiations but also increase women’s trust in the organisation’s
pay raise process, thereby retaining talent longer. Women would be more
confident that their work and talent are valued and that their future career
advancement is supported.
Secondly, acknowledging the endowment effect, it is crucial that organisa-
tions carefully depict their stances on diversity and gender parity. This pursuit
is for equality. Better labour market outcomes for women do not mean adverse
outcomes for men; gains for women and for men are not mutually exclusive. “If
men believe their organisations prioritise gender diversity because it leads to
better business results, they are significantly more likely to think it
matters. . . . [W]hen men think companies prioritise gender diversity because it
is ‘fair to all people,’ they are more likely to be personally committed.”
(Women and Workplace, 2017). It is important that organisations make active
efforts to curb any ignorance and misinformation on the part of their male
employees.
The ideal way to strive for better labour outcomes for women is through
gender parity and inclusion, but also by making sure that competent human
capital is fully used to accelerate economic development and amplify
economic prosperity. This could be a better way of framing the pursuit of
gender parity in labour markets. If businesses provided concrete examples of
economic results achievable with increased diversity and then described that
economic opportunity in a manner that activated employees’ loss aversion
biases, this could help increase male employees’ prioritisation of diversity
in leadership (Heckbert, 2018). Numerous studies and reports support the claim
that increased participation of women in the labour market leads to substantial
economic gains. Drawing on these studies, organisations can frame the
advancement of gender parity as an economic opportunity and a driver of
business output.
Thirdly, joint evaluation of employees could help nudge out implicit biases
that people tend to harbour due to social norms. These evaluations could
provide evidence-based rebuttals to any inherent stereotypical ideologies one
may possess. Bohnet et al.’s research applied the behavioural economics finding
that “people make more reasoned choices when examining options jointly
rather than separately” to the process of employee evaluations. They found
that when employees are jointly evaluated, individual performance drives
evaluation decisions; when they are separately evaluated, group stereotypes
drive such decisions. Businesses could opt for joint evaluation at each of the
hiring, review and promotion stages as a normative best practice and fairness
mechanism.
Organisations could frame it as helping managers maximise profits and team
results by ensuring consistent selection of higher-performing candidates.
Framing such a procedure as a fairness mechanism should appeal to individuals’
bounded self-interest. Governments could nudge businesses toward adopting
joint evaluation procedures. This could be incorporated in a “comply or
explain” regulation, using an information-based strategy to indirectly alter
business behaviour (Heckbert, 2018).

5 CONCLUSION
Viewed through the lens of behavioural economics, the three heuristics and
their respective arguments have underscored the irrationality of the gender
gaps in labour market outcomes.
The endowment effect clarifies that paving the path to positions of authority
for women, or at the least considering competent women for these roles, is felt
as a devastating loss by those already concentrated in high numbers at the top
of the ladder. As suggested in the intervention, this can be addressed by
framing the promotion and advancement of women as business and economic
opportunities.
This paper takes a unique approach to the application of the overconfidence
bias in the context of gender disparity in labour market outcomes. Historically,
women have been economically disadvantaged due to the patriarchal norms
embedded in the very essence of society. In such a situation, where social
norms work against them, it is difficult for women to value their work and
skills, especially if they have been conditioned to downplay their achievements
and never demand more. Women are also hesitant to negotiate their compensation
in situations where they believe that the value of their economic gains is less
than the social cost of defying social norms. Organisations can help women
become better negotiators by adopting explicit rules.
The resistance to hiring, promoting, and equitably compensating women
within organisations highlights the persistence of patriarchal notions that re-
inforce men’s leadership dominance. This section emphasises the critical need
for organisations to overhaul their policies and strategies, embracing gender
diversity as a fundamental aspect of their organisational culture.
Behavioural economics offers a way to understand irrationality through
structured and well-defined heuristics and biases. Combining these perspectives
in this study helps ensure the problem at hand is understood from different
angles, offering reasons for the inherent issue of gender disparities in labour
markets. Once the issues at hand are thoroughly understood, curbing and fixing
them becomes much simpler.

References
[BBL07] Hannah Riley Bowles, Linda Babcock, and Lei Lai. Social incentives
for gender differences in the propensity to initiate negotiations: Sometimes
it does hurt to ask. Organizational Behavior and Human Decision Processes,
103(1):84–103, 2007.

[CB10] Emilio J Castilla and Stephen Benard. The paradox of meritocracy
in organizations. Administrative Science Quarterly, 55(4):543–576, 2010.

[FD18] Martha Fineman and Terence Dougherty. Feminism confronts
homo economicus: gender, law, and society. Cornell University Press, 2018.

[FN20] Marianne A Ferber and Julie A Nelson. Feminist economics today:
Beyond economic man. University of Chicago Press, 2020.

[Hec18] Lori Anne Heckbert. Closing the gender gap in corporate advancement:
Insights and solutions from behavioral economics. Windsor Yearbook of
Access to Justice, 35:187–225, 2018.

[Hic14] Eleanore Hickman. Boardroom gender diversity: A behavioural
economics analysis. Journal of Corporate Law Studies, 14(2):385–418, 2014.

[KKT+ 91] Daniel Kahneman, Jack L Knetsch, Richard H Thaler, et al. The
endowment effect, loss aversion, and status quo bias. Journal of
Economic Perspectives, 5(1):193–206, 1991.

[Off19] International Labour Office. Women in business and management:
The business case for change. Geneva: ILO, 2019.

[Orr21] David Orrell. Behavioural economics: Psychology, neuroscience,
and the human side of economics. Palgrave Macmillan, 2021.

[RG21] Laurie A Rudman and Peter Glick. The Social Psychology of gender:
How Power and intimacy shape gender relations. Guilford Publications, 2021.

[RXH] Yufei Ren, Lin Xiu, and Amy B Hietapelto. Gender differences in
asking for pay raises: The role of explicit rules.

[SMLL96] Virginia E Schein, Ruediger Mueller, Terri Lituchy, and Jiang Liu.
Think manager—think male: A global phenomenon? Journal of
Organizational Behavior, 17(1):33–41, 1996.

[Tha15] Richard H Thaler. Misbehaving: The making of behavioral economics.
WW Norton & Company, 2015.

Using Data-Efficient Image Transformers for
Diabetic Retinopathy Severity Classification


Veda Fernandes
October 17, 2023

Abstract
Roughly 10% of the global adult population is diabetic. Diabetes is
a metabolic condition which results in chronically high blood sugar
levels. Patients with diabetes are at substantially higher risk for several
serious health conditions including diabetic retinopathy (DR). DR is a
vision-threatening disease which affects 35% of diabetic patients and is
projected to affect 160 million people by 2045. Diabetic patients should
be screened for retinopathy every one to two years; however, in many
countries patients are not regularly screened and therefore not treated.
Globally, the lack of rapid and cost-effective screening strategies for DR
leads to underdiagnosis and loss of vision. Machine learning tools of-
fer a solution in developing automated models to diagnose DR from eye
fundus images. In published literature, convolutional neural networks
(CNNs) are the state-of-the-art model for classification of DR. More re-
cently, transformer models have been applied and shown superior perfor-
mance. Text transformer models have resulted in the proliferation of tools
such as ChatGPT, which provide contextual understanding and ability to
identify dependencies. In this study, we perform a head-to-head comparison
between CNN and vision transformer models for classifying DR. We
demonstrate that transformer models diagnose DR with substantially
higher accuracy, with an improvement of up to 13% as measured by the F1
performance metric. Furthermore, we identify optimal training parameters
for diagnosis of DR, training a total of 19 machine learning models and
reaching a test set F1 score of 90% on a dataset of 35,130 fundus images
with 20% of images withheld for independent testing.

1 Introduction
Diabetic retinopathy (DR) is a vision-threatening microvascular disease caused
by significant damage to the blood vessels in the retina and is one of the most
frequent complications of diabetes mellitus [NPS22]. DR is a leading cause of
∗ Dubai International Academy Emirates Hills
† Advised by: Dr. Parsa Akbari, University of Cambridge

preventable blindness and vision impairment among the working-age population,
with a prevalence of about 35% among those with diabetes mellitus. By 2045,
it is estimated that 783 million people will be diabetic [Fed21] and 160 million
people could be affected by DR [TTY+ 21]. The increase in DR prevalence by
2030 is notably concentrated in low- and middle-income countries in regions
such as Asia, South America and the Middle East and North Africa (MENA) [TW22].
This disease burden will require effective DR screening strategies to align
with the changing demographic. About 56% of new cases could be prevented with
timely monitoring of severity and treatment [TMS20].
DR can be graded into 5 stages according to morphological changes that occur
in the retina as the disease progresses: no DR, mild non-proliferative diabetic
retinopathy (NPDR), moderate NPDR, severe NPDR and proliferative diabetic
retinopathy (PDR) [DSS17] (Fig. 1). Screening is imperative to identify the
stage of DR: with timely referral, the progress of DR can be slowed and severe
vision impairment can be prevented [WSK+ 18]. Despite this, there is a shortage
of ophthalmologists to screen the millions of retinal images of diabetic
patients, especially in developing countries [RLW+ 20] [WSK+ 18]. Here,
technology can provide an alternative to traditional screening by physicians
by reducing the cost and manpower required to screen patients for DR.
Ocular telemedicine is a concept which has been proposed to make DR
screening more cost efficient. This involves local clinics sending images of the
retina to a central ‘grading center’ where experts can grade the level of DR
severity [HSCA16]. Hand-held imaging devices are another solution and have
achieved high specificity and sensitivity compared to traditional retinal cam-
eras [PDKB22]. However, in both solutions, trained clinical professionals are
still required to analyze the retinal images. An automated system would con-
duct the initial screening of retinal images to detect signs of DR even when
ophthalmologists are unavailable.
An automated system must recognize the changes to retinal vasculature
caused by the various stages of DR as seen on fundus images. As seen in Fig. 1,
DR affected retinal images show characteristic color and patchy variations on the
fundus image, due to morphological changes in the retina. These changes include
lesions called MicroAneurysms (MA), Hard EXudates (HEX), Soft Exudates
(SE), HEMorrhages (HEM), and an increase in blood vessels. In mild NPDR, MAs
are seen in one quadrant; the disease progresses to vessel blockage and the
presence of lesions in moderate NPDR. Severe NPDR presents with venous beading
and a large number of HEMs. This is a precursor to the ‘neovascularization’ of
the PDR stage [NPM+ 22], the formation of new blood vessels on the retina,
which may eventually lead to blindness (Fig. 1).
In recent decades, artificial intelligence has been trained to classify DR stages
from fundus images. Early models used ML-based classifiers like Random Forest
(RF) [CSC+ 14], K-Nearest Neighbours (KNN) [NvGRA07], Support Vector
Machines (SVM) [SAFL10] and Artificial Neural Networks (ANN) [UDH+ 04].
These methods required efficient prior hand-engineered feature extraction, which
could introduce errors into complex fundus imaging. [NG18] evaluated 7
automated retinal image analysis (ARIA) systems to classify DR. These models
had sensitivities of 87%-95%, but had limited specificities of 50%-69%, thereby
leading to a high number of false positives that negatively impacted the
clinical application of these systems [NG18].

Figure 1: Images of the retina showing stages of diabetic retinopathy, graded
into 5 classes ranging from 0-4, as per the EyePACS dataset [GPC+ 16].

Deep learning (DL), also known as deep neural networks, is an emerging field
which relies on multiple layers of processing to extract high level features from
input images. The application of DL was previously limited by the availability of
computational processing power. However, recent innovations including cheaper
compute costs, cloud computing, and specialized ASICs, which are processing
units optimized for ML, have made DL more broadly available. DL has a wide
range of applications from computer vision and finance to autonomous vehicles
[DDD20]. It has also been applied in medicine and in recent years has produced
favorable results in the diagnosis of DR in fundus imaging, with sensitivities of
80.28% to 100.0% and specificities of 84.0% to 99.0% [NLA+ 19].
Convolutional Neural Networks (CNNs) are popular in deep learning for
image classification tasks, including medical imaging [BR18]. The design of
CNNs maximizes their efficiency by focusing on a smaller section of the data;
however, this compromises their performance when capturing broader patterns
and relationships across spaced out parts of an image [SKZ+ 23].
Several CNN architectures have been used in the literature to detect DR and to
classify fundus images into DR severity stages. A CNN of the Inception-v3
architecture [GPC+ 16] and CNN-based residual learning [GL17] were used to
detect DR. A CNN was used to identify exudates [YXK17]; transfer learning and
hyperparameter tuning were applied to the AlexNet, VGGNet, GoogleNet and
ResNet CNNs [WLZ18]; a DenseNet architecture was trained with hyperparameter
tuning [RPC+ 20]; and an attention mechanism was coupled with a modified
DenseNet-169 architecture [FFAH22].

Recently, transformer models have gained popularity in a variety of
applications. An example of this rising interest in transformers is ChatGPT,
an ML tool which uses a transformer architecture for Natural Language
Processing (NLP) [Ope23]. Transformers are efficient in identifying and
understanding the relationship between separate elements within data and are
capable of parallel processing, which means they can be trained more rapidly
[IEE+ 23]. The success of transformers in NLP prompted the creation of vision
transformers (ViT), which can be applied to image data to carry out computer
vision tasks [HGL+ 23]. These traits make ViTs adept at tasks that require an
understanding of context, making them valuable not only in NLP but also in
medical image classification and segmentation [SKZ+ 23]. Consequently, ViTs
are a promising model architecture for DR detection from fundus images.
[WHX+ 21] and [AAKJTT+ 21] demonstrated that attention-based ViTs
provide high accuracy for DR classification. [AKCS23] used an ensemble of
Vision Transformers (ViT), Data efficient image Transformers (DeiT), Bidirec-
tional Encoder representation for image Transformer (BEiT) and Class-Attention
in Image Transformers (CAIT) to stage DR severity. Transformers gained
prominence only recently and hence, there is limited literature on the effect
of hyperparameters of ViTs as compared to CNNs, which have been studied
much more rigorously. Therefore, this study aims to compare the performance
of a ViT and CNN architecture and find the optimal hyperparameters for a ViT
model to classify DR fundus images.
In this study, we benchmarked DeiT, a recent ViT model, against ResNet-
18, a classical CNN model, for binary classification of retinal fundus images
into no or mild nonproliferative DR and moderate nonproliferative or more
severe DR. This classification was chosen as it is recommended by the American
Diabetes Association and the International Council of Ophthalmology that cases
of moderate NPDR or more severe stages are referred to an ophthalmologist
[WSK+ 18] [SCD+ 17]. Therefore, the model would be used as a preliminary
screening tool to help refer patients while the specific diagnosis of the severity
of DR could be done by a medical professional upon consultation. We used the
EyePACS Dataset, consisting of 35,130 fundus images [GPC+ 16].
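The binary grouping described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's own code: the function name `to_referable` and the 0/1 encoding are assumptions, and the grade names follow the 0-4 EyePACS grading referred to in Fig. 1.

```python
# EyePACS severity grades (0-4), as referred to in the Fig. 1 caption.
GRADE_NAMES = {
    0: "no DR",
    1: "mild NPDR",
    2: "moderate NPDR",
    3: "severe NPDR",
    4: "proliferative DR",
}


def to_referable(grade: int) -> int:
    """Map a 0-4 severity grade to a binary screening label.

    Returns 0 for no or mild NPDR and 1 for moderate NPDR or worse.
    The name and the 0/1 encoding are illustrative assumptions.
    """
    if grade not in GRADE_NAMES:
        raise ValueError(f"unknown grade: {grade}")
    return 1 if grade >= 2 else 0


# Binary label for each of the five grades, in order 0..4.
binary_labels = [to_referable(g) for g in range(5)]
```

Grades 0 and 1 fall into the first class and grades 2 to 4 into the second, which matches the referral threshold of moderate NPDR cited in the text.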
To address the gap in the literature on the application of transformer
models to DR classification, we identified the optimal hyperparameters for the
DeiT architecture and compared its performance to ResNet-18. The models
were evaluated on their F1 scores; the F1 score is the harmonic mean of the
precision and recall of the model, which penalizes extreme values of either
[HST+ 22]. The results indicated that the transformer models showed superior
performance across a range of hyperparameters including learning rate and
batch size. The F1 scores showed that the DeiT model performed considerably
better than the ResNet-18 model and that a learning rate of 1E-04, a batch
size of 32 and 6 epochs gave the best model performance.
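To make the metric concrete, the following sketch computes a per-class F1 score from scratch. The label lists are toy values chosen for illustration and are not results from this study; the function name `f1_score` and its `positive` parameter are assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels (illustrative only): 1 = moderate NPDR or worse, 0 = no or mild NPDR.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

f1_referable = f1_score(y_true, y_pred, positive=1)      # precision 2/3, recall 2/3
f1_non_referable = f1_score(y_true, y_pred, positive=0)  # precision 4/5, recall 4/5
```

Computing the score once per class, as here, mirrors the way the results report separate test F1 scores for the two categories.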

2 Results
2.1 DeiT model performance showed 23% improvement in
F1 score with optimal learning rate
We investigated the impact of learning rates on the performance of two model
architectures - DeiT and ResNet-18. DeiT is a vision transformer while ResNet-
18 is a convolutional neural network. The learning rate governs how quickly a
model learns. A higher learning rate allows the model to take larger steps
to improve; however, this may result in the optimal solution being overshot. In
contrast, smaller learning rates may eventually reach a desirable result but are
less time-efficient. To find the optimal value for the learning rate, each model
was trained with 5 learning rates: 1E-03, 1E-04, 1E-05 and 1E-06. Both the
ResNet-18 model and the DeiT model showed a similar trend in performance
based on learning rates (Fig. 2).
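The trade-off between large and small learning rates can be seen on a toy problem. The sketch below runs plain gradient descent on f(w) = w^2, whose minimum is at w = 0. The function and the three learning rates are illustrative assumptions chosen to show the effect; they are unrelated to the rates used to train DeiT or ResNet-18.

```python
def gradient_descent(lr, steps=20, w0=1.0):
    """Minimise f(w) = w**2 from w0 using a fixed learning rate."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w   # derivative of w**2
        w -= lr * grad   # one gradient descent update
    return w


# Final value of w after 20 steps for a too-large, a suitable and a too-small rate.
final = {lr: gradient_descent(lr) for lr in (1.1, 0.4, 0.001)}
```

With the largest rate every step overshoots the minimum and the iterate moves further away from zero; with the smallest rate the iterate is still close to its starting point after 20 steps; the intermediate rate converges quickly.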
For the DeiT model, learning rates of 1E-04 and 1E-05 demonstrated superior
performance, showing a 40% improvement in test F1 score compared to higher
learning rates. However, learning rates below this range degraded the ability
of the model to detect moderate NPDR or more severe cases, with a 7% decrease
in model performance when tested with a learning rate of 1E-06. For the
ResNet-18 model, similar to the DeiT model, a learning rate of 1E-04 led
to better overall model performance. Below and above that value, the model
performance was much poorer.
A significant finding was that the DeiT model consistently outperformed
ResNet-18 across all learning rates (Fig. 2), with a 13% higher Test F1 score.
ViTs generally excel at capturing broader context in images compared with
CNN models, which could explain the results.

Figure 2: Graphs showing the effect of different learning rates on the
performance of the DeiT and ResNet-18 models. This figure shows the Test F1 scores
for the no or mild NPDR and moderate NPDR or more severe DR categories
and overall cost of the DeiT and ResNet-18 models. The DeiT model generally
performs better than the ResNet-18 model, with the F1 score being higher on
average and the precision being lower. It can also be seen that the optimal
learning rate for both models is 1E-04 as the F1 graphs peak while the cost
function is low at that point.

2.2 Batch size of 32 improved model test F1 score by 50%
Deep learning models are trained with the stochastic gradient descent algorithm
which performs each iteration of training using a single batch of data. Therefore,
at each training iteration, improvements in model performance are incremental
and dependent only on images present in the batch. Batch size is a critical
training parameter which determines the number of images the model is trained
or tested on in each iteration. Finding the correct batch size is important,
as this fundamentally affects the training of the model. Larger batches result
in a greater amount of information informing each training iteration; however,
larger batch sizes are computationally intensive and are slow to execute. Smaller
batch sizes are computationally efficient; however, less information informs each
training iteration. Furthermore, smaller batch sizes lead to fluctuations in model
training which may be beneficial in searching the model parameter space and
resulting in superior performance, or may lead to ineffective training iterations
and poor performance.
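One concrete consequence of the batch size is the number of parameter updates per epoch. The sketch below computes it for the batch sizes considered in this section, under the assumption that the 80% of images not withheld for testing are all used for training; the exact split used in the study may differ.

```python
import math

TOTAL_IMAGES = 35_130   # dataset size stated in the text
TEST_FRACTION = 0.20    # share withheld for testing, as stated in the abstract

# Assumption: every image that is not withheld for testing is used for training.
n_train = round(TOTAL_IMAGES * (1 - TEST_FRACTION))


def iterations_per_epoch(n_images, batch_size):
    """Number of batches needed to pass over all images once."""
    return math.ceil(n_images / batch_size)


# Updates per epoch for each candidate batch size.
steps = {bs: iterations_per_epoch(n_train, bs) for bs in (8, 16, 32, 64)}
```

Doubling the batch size roughly halves the number of updates per epoch, so each update is informed by more images but the model receives fewer updates in total.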
We tested batch sizes of 8, 16, 32 and 64, doubling the batch size at each
step, to assess their impact on the model’s performance. We found that larger
batch sizes tended to improve model performance, with a batch size of 32 giving
the best overall performance (Fig. 3). While increasing the batch size to 64
improved the model’s ability to identify no or mild DR, it reduced its ability
to diagnose moderate NPDR or more severe DR by 10%, which is counterproductive
to the aim of this model.

Figure 3: Graph showing the effect of different batch sizes on the performance
of the DeiT model. This figure shows the Test F1 scores for the no or mild
NPDR and moderate NPDR or more severe DR categories and overall cost of
the DeiT model. The optimal batch size was identified as 32, as it gave the
best overall performance.

2.3 Training for 6 epochs improved test F1 score by 7%
Epochs control the number of times the model iterates through all the training
fundus images, where a single iteration over all images in the training set is
one epoch. Training the model across multiple epochs often results in superior
performance as the model parameters are further improved from the training
examples. The intention of model training is for the model parameters to be
tuned to detect patterns in the training set which will be predictive of DR in
future fundus images. However, training across too many epochs will result in
overfitting, because the resulting model is over-adjusted for the training set and
identifies patterns which are specific to the peculiarities of the training set but
do not generalize to new data. Overfitting is detected by assessing the model
on a testing set which is not utilized for model training.
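The selection logic described above can be sketched as follows. The score histories are hypothetical values invented for illustration and are not measurements from this study; the function name `select_epoch` is likewise an assumption.

```python
def select_epoch(train_scores, heldout_scores):
    """Pick the epoch count with the best held-out score and flag overfitting.

    Both arguments map an epoch count to a score where higher is better.
    Returns the best epoch count and the later epoch counts at which the
    training score did not drop while the held-out score did.
    """
    epochs = sorted(heldout_scores)
    # max() returns the first maximum, so ties go to the smaller epoch count.
    best = max(epochs, key=lambda e: heldout_scores[e])
    overfit = [
        e for e in epochs
        if e > best
        and train_scores[e] >= train_scores[best]
        and heldout_scores[e] < heldout_scores[best]
    ]
    return best, overfit


# Hypothetical histories (illustrative only, not results from the paper).
train_scores = {2: 0.70, 4: 0.80, 6: 0.88, 8: 0.93, 10: 0.96}
heldout_scores = {2: 0.68, 4: 0.75, 6: 0.79, 8: 0.77, 10: 0.73}

best_epochs, overfit_epochs = select_epoch(train_scores, heldout_scores)
```

In this toy history the training score keeps rising while the held-out score peaks and then falls, which is the signature of overfitting that the held-out set is meant to expose.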
An ideal number of epochs avoids both underfitting, caused by too few epochs,
and overfitting, caused by too many. To determine the optimal number of
epochs, the DeiT model was trained for 2, 4, 6, 8 and 10 epochs. There was no
significant relationship between the number of epochs and model performance,
but the model achieved its highest test F1 scores, 90% and 70% for the two
test classes, with 6 epochs (Fig. 4). The model performance decreased by
an average of 7% above and below 6 epochs.

Figure 4: Graph showing the effect of the number of epochs on the performance
of the DeiT model. This figure shows the Test F1 scores for the no or mild NPDR
and moderate NPDR or more severe DR categories and the overall cost of the DeiT
model. The optimal number of epochs was identified as 6, as it showed the best
F1 and cost results.

3 Discussion
In this study we have demonstrated that vision transformer models show
superior predictive performance in diagnosis of DR compared to classical CNNs,
with a 13% higher test F1 score. We identified the optimal values for DeiT
training parameters such as learning rate, batch size and number of epochs.
We analyzed the test F1 score for 19 machine learning models and found that
a learning rate of 1E-04, 6 epochs and a batch size of 32 gave the best model
performance. The
DeiT model effectively classified images into both classes. A challenge that we
encountered was the imbalance in the number of images in each category in the
EyePACS dataset. There were over 22,000 images in the no or mild NPDR
category and approximately 5,400 images in the moderate NPDR or more severe
DR category. Although the classes were weighted to reduce the effect of the
imbalance, there was still a significant difference in the ability of the
models to classify the images: the model performed 20% better at classifying
images into no DR or mild NPDR than into the moderate or more severe category.
However, overall,
the DeiT model performed better than the ResNet-18 model, demonstrating the
promise of vision transformers in DR classification applications. DeiT models
can potentially reduce the need for manual screening.
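The text states that the classes were weighted but not which scheme was used. As a minimal sketch, assuming inverse-frequency weighting (an assumption, not the authors' confirmed method) and the rounded class counts quoted above, the weights could be computed as follows. The function name is hypothetical.

```python
def inverse_frequency_weights(class_counts):
    """Return one weight per class, proportional to 1 / class frequency.

    Weights are scaled so that a perfectly balanced dataset would get a
    weight of 1.0 for every class: w_c = total / (num_classes * count_c).
    """
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


# Rounded counts taken from the text (approximate, for illustration only).
counts = [22000, 5400]  # [no or mild NPDR, moderate or more severe DR]
weights = inverse_frequency_weights(counts)
print(weights)  # the minority class receives the larger weight
```

Weights of this kind are typically passed to a weighted cross-entropy loss so that errors on the rarer class cost more during training.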
Previous works largely focused on CNNs, with only a few papers on the
applications of ViTs to DR classification [WHX+ 21] [AAKJTT+ 21] [AKCS23].
Recent research has shown that transformers provide significant advantages
compared to CNNs, including providing contextual understanding and making
connections between disparate features in the input image [MDB23]. However,
there have not been direct comparisons between the performance of ViT and
CNN models for diagnosis of DR. In this study, we bridge this gap with a
comprehensive comparison between a classical CNN and a recent ViT model.
Expanding on the work in this paper, DeiT models can be applied to multi-class
DR classification tasks based on the severity of clinical symptoms. However, to
accomplish this with high accuracy and to improve the feature extraction
capabilities, a larger dataset with a more balanced number of images for each
class should be used. More preprocessing techniques can be explored to arrive
at better convergence of the loss function. The high class imbalance can be
further addressed by custom data augmentation techniques such as rotation and
adjusting the brightness and contrast of the images [AKCS23]. Additionally,
variable illumination and saturation of the images are a barrier to accurate
predictions. Luminosity normalization is a pre-processing technique that could
be applied to the images to mitigate the problem of variable illumination.
When dealing with zero pixels, generic cropping functions may cause a loss of
image data and disturb fundus geometry. Using a custom cropping window of
variable length, depending on the resolution of the images, will preserve
crucial information, which will help in convergence of the loss function
[RPC+ 20].

4 Methods
The PyTorch package in the Visual Studio Code environment was used to run
the models. The experiment was run on fundoscopy images from the EyePACS
database using two deep learning networks: DeiT, which is a vision transformer,
and ResNet-18, which is a convolutional neural network.
Fundoscopic examination is a routine clinical examination of the retina using
an ophthalmoscope, and is used to detect numerous eye diseases including
diabetic retinopathy. EyePACS is a telemedicine healthcare provider which offers
diabetic retinopathy screening solutions in the United States. The EyePACS
dataset includes 5 million fundoscopy images of a diverse population of healthy
patients and those with various stages of diabetic retinopathy [GPC+ 16]. We
used 35,130 EyePACS color fundus images for the analysis, with 28,102 images
set aside for training and 7,028 for testing. All images of both eyes from a
single patient were kept together in either the training set or the testing
set, so that the testing set is independent from the training set.
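The patient-level split described above can be sketched as follows. This is an illustrative sketch only: the record layout, the function name and the 20% test fraction (roughly 7,028 of 35,130 images) are assumptions, and the authors' actual splitting code is not given in the text.

```python
import random


def split_by_patient(image_records, test_fraction=0.2, seed=0):
    """Split (patient_id, image_name) records so no patient spans both sets."""
    patient_ids = sorted({patient_id for patient_id, _ in image_records})
    rng = random.Random(seed)          # fixed seed makes the split reproducible
    rng.shuffle(patient_ids)
    num_test = round(len(patient_ids) * test_fraction)
    test_patients = set(patient_ids[:num_test])
    train = [r for r in image_records if r[0] not in test_patients]
    test = [r for r in image_records if r[0] in test_patients]
    return train, test


# Hypothetical records: each patient contributes a left and a right eye image.
records = [(pid, f"{pid}_{eye}.jpeg") for pid in range(10) for eye in ("left", "right")]
train_set, test_set = split_by_patient(records)
```

Splitting on patient identifiers rather than on individual images is what keeps both eyes of a patient on the same side of the split.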
While the demographic variables for the EyePACS dataset are not published,
an EyePACS dataset of 9,963 images from 4,997 patients used in a paper had an
average patient age of 54.1 (±11.3) years, with 62.2% women [GPC+ 16]. The
EyePACS fundus images are labelled in five classes as normal (0), mild (1),
moderate (2), severe (3), and proliferative DR (4), as shown in Fig. 1. Each
image was rated by a clinician for the stage of DR present according to the
International Clinical Diabetic Retinopathy severity scale (0-4) [GPC+ 16]. For
our model, we divided these classes into 2 categories based on recommended
screening strategies [WSK+ 18] [SCD+ 17].
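As a small illustration of this regrouping, the sketch below maps the five grades onto the two categories named in the paper (no or mild NPDR versus moderate NPDR or more severe DR). The function name and the 0/1 encoding are assumptions made for the example.

```python
def to_binary_label(icdr_grade):
    """Map an ICDR grade (0-4) to the two screening categories.

    0 -> no or mild NPDR (grades 0 and 1)
    1 -> moderate NPDR or more severe DR (grades 2, 3 and 4)
    """
    if icdr_grade not in (0, 1, 2, 3, 4):
        raise ValueError(f"unexpected grade: {icdr_grade}")
    return 0 if icdr_grade <= 1 else 1


binary_labels = [to_binary_label(grade) for grade in range(5)]
print(binary_labels)
```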
The performance of a deep learning model depends on the quantity and
integrity of the dataset used for training the model [ZLLS21]. The EyePACS
dataset images are of various sizes and have pre-existing image noise and
inconsistencies, including the presence of artefacts and images that are out of
focus, underexposed or overexposed [TPM23]. The images were acquired by various
cameras supported by the EyePACS platform, with fields of view of 40° to 50°
[KOM+ 20], and have different resolutions [TPM23]. There was no standard
orientation of the images and they could be inverted as well, making it
difficult to tell left from right eye images.
To improve the accuracy and reduce the error rates of deep learning networks, a
set of pre-processing operations was applied to the images before training the
model [ZLLS21]. The images were first converted to tensors, then resized to a
224 × 224 resolution, and finally passed through the Random Horizontal Flip
transformation to add more variance to the dataset.
Training deep learning models for high performance requires efficient
optimization of the hyperparameters of the model. First, we trained both the
DeiT and ResNet-18 models across learning rates from 1E-06 to 1E-03. Increasing
the number of epochs or the batch size can help ensure the model is adequately
trained, thereby converging to the optimal solution and improving the accuracy.
The DeiT model performed consistently better than the ResNet-18 model over a
range of learning rates, so the DeiT model was chosen for further
experimentation. We varied the batch size of the DeiT model from 4 to 64 and
the number of epochs from 2 to 10. The run time varied, as increasing the
number of epochs and decreasing the learning rate made the model take longer to
train. The metrics, including F1, precision and recall, were logged for both
the training and testing data. We generated a summary of the overall
performance of the model under various hyperparameters to be visualized as
graphs. We ran 13 different trials, making changes to the stated
hyperparameters to identify which gave the best model performance. Every three
iterations, the model performance was evaluated using the testing dataset of
7,028 images.

4.1 Residual Neural Networks (ResNets)


Experimental work has shown that deeper networks are crucial for better
performance but are more difficult to train, and accuracy saturates beyond a
point. Residual blocks offer a solution to this problem. First, the skip
connection provides a shortcut for backpropagating the gradient, as shown in
Fig. 5. Second, due to the skip connection, instead of fitting the original
underlying mapping $H(x)$, the stacked layers need to fit or learn only the
residual mapping $F(x) = H(x) - x$.

Figure 5: ResNet-18 architecture with skip connections in the identity blocks
and convolutional blocks. Each layer consists of identical convolutional
networks. When the input and output dimensions differ, a convolutional skip
connection is used, indicated by dotted lines.

If the identity mapping $H(x) = x$ is the desired underlying mapping, the
residual mapping becomes $F(x) = 0$, which makes the learning process easier:
the weights and biases of the upper weight layer only have to be pushed to
zero. Residual blocks can also forward propagate faster due to the skip
connections [ZLLS21]. Thus $H(x) = F(x) + x$, which adds the output of earlier
layers to later layers. Each block is represented by
$y_i = F(x_i, W_i) + x_i$, where $x_i$ and $y_i$ are the input and output
vectors of the block and $F$ is the residual function; the input to the next
block is $x_{i+1} = f(y_i)$, where $f$ denotes a ReLU activation function
[HZRS16]. The ResNet-18 architecture used in this paper has 18 layers. The
first 2 layers are a 7 × 7 convolutional layer with 64 output channels and a
stride of 2, followed by a 3 × 3 max-pooling layer with a stride of 2.
ResNet-18 has 4 residual block modules. Each residual block has two layers,
making $F = W_2 \sigma(W_1 x)$, where $\sigma$ denotes a ReLU function
(Fig. 6). Each block consists of 2 convolution layers, batch normalization, and
corresponding ReLU activations. The convolutional layers have 3 × 3 filters.
When the feature map size is halved, the number of filters is doubled, and
down-sampling is performed with a stride of 2. The network ends with an average
pooling layer followed by a 1000-way fully connected layer with softmax. The
shortcut connections are introduced to each pair of 3 × 3 filters as shown in
Fig. 5 [HZRS15].
The standard ResNet residual block is called the Identity Block (Fig. 6).
It is modified into the Convolutional Block when the input activation does not
have the same dimension as the output; usually 1 × 1 convolutions are applied
with a stride of 2 to match dimensions (Fig. 7) [HZRS15].
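The residual computation can be illustrated with a deliberately simplified sketch. It uses small dense weight matrices in place of 3 × 3 convolutions and omits batch normalization, so it shows only the skip-connection arithmetic, not the ResNet-18 implementation used in the paper. All function names are hypothetical.

```python
def relu(vector):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in vector]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def residual_block(x, w1, w2):
    """Compute the block output ReLU(F(x) + x) with F(x) = W2 * ReLU(W1 * x)."""
    residual = matvec(w2, relu(matvec(w1, x)))   # F(x), the residual mapping
    y = [f + xi for f, xi in zip(residual, x)]   # y = F(x) + x (skip connection)
    return relu(y)                               # input to the next block


x = [1.0, 2.0, 3.0]
zero = [[0.0] * 3 for _ in range(3)]
eye = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]

identity_out = residual_block(x, zero, zero)   # F(x) = 0, so the block passes x through
doubled_out = residual_block(x, eye, eye)      # F(x) = x, so the block outputs 2x
```

With all weights at zero the block reduces to the identity mapping for non-negative inputs, which is the property that makes very deep stacks easier to optimize.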

Figure 6: Identity Block - When input and output dimensions are equal, no
additional layer in skip connection path

Figure 7: Convolutional Block - When input and output dimensions are not
equal, 1 × 1 convolutional layer added in skip connection path

4.2 Data-efficient Image Transformers (DeiT)


Transformer models were initially applied in natural language (text) processing
applications using an ‘attention’ mechanism, which allowed modelling of
dependencies between elements within sentences of text without regard to their
distance in the sequence [VSP+ 17]. Vision transformers applied these concepts
to image processing and classification [DBK+ 20]. However, compared to
classical CNNs, ViT models depend on pre-training using large amounts of data.
Data-efficient image transformers aim to address the requirement for large
amounts of training data by using a knowledge distillation technique to train a
modified transformer, built on the ViT architecture proposed by [DBK+ 20],
which transfers knowledge from a larger CNN to a smaller model. This reduces
the training requirement to about 3 days and also requires less infrastructure
[TCD+ 20].
DeiTs are image transformers that use a novel distillation procedure built upon
the transformer block proposed by [DBK+ 20] and are trained on the ImageNet
dataset only [TCD+ 20]. This contrasts with ViT, which needs pre-training on
hundreds of millions of curated images from many datasets to reach high
performance [DBK+ 20]. DeiTs are an effective method as they require a lower
volume of data and a smaller memory footprint for a given accuracy [AKCS23],
making them a less computationally expensive architecture.
The knowledge distillation procedure using attention is the central principle
of DeiT, where knowledge is transferred from the ‘teacher’ model to the
‘student’ model [TCD+ 20]. The ‘teacher’ is RegNetY-16GF, a CNN pre-trained on
ImageNet. The ‘student’ is a modified ViT architecture to which the output of
the ‘teacher’ is passed as an input.
In DeiT, a new distillation procedure is introduced where the teacher’s hard
decision is taken as a true label. DeiT decomposes each RGB image into a
series of N patches of 16 × 16 pixels each and projects each patch with a
linear layer into a 16 × 16 × 3 = 768 dimensional representation, giving N
patch tokens. A new distillation token is included, which interacts with the
class and patch tokens through the stack of transformer encoder layers
(Fig. 8). The encoder layers contain Multi-head Self Attention (MSA) and Feed
Forward Network (FFN) modules [TCD+ 20] [DBK+ 20]. The hard decision is
defined below.
Let $y_t = \mathrm{argmax}_c Z_t(c)$ be the hard decision of the teacher. The
task of the distillation token is to reproduce the hard decision $y_t$
predicted by the ‘teacher’, and the class token has to reproduce the true label
$y$. DeiT’s loss function is given by:

$$\mathcal{L}_{\mathrm{global}}^{\mathrm{hardDistill}} = \frac{1}{2} \mathcal{L}_{CE}(\psi(Z_s), y) + \frac{1}{2} \mathcal{L}_{CE}(\psi(Z_s), y_t)$$

$Z_s$ and $Z_t$ are the logits of the student and teacher models, $\psi$ is the
softmax function, and $\mathcal{L}_{CE}$ is the cross-entropy loss. The
distillation token and the class token learn by back propagation, and the
distillation allows the model to learn from the teacher output [TCD+ 20]. This
process is more efficient and requires less computational power than other
vision transformers.
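To make the loss concrete, the sketch below evaluates the hard-distillation objective exactly as written in the formula, for a single example and with plain Python lists. It is an illustration under stated assumptions rather than the DeiT implementation: the function names are hypothetical, a single student logit vector is used for both terms as in the formula, and the logit values are made up.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities (psi in the text)."""
    peak = max(logits)                           # subtract the max for stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, label):
    """Cross-entropy of a probability vector against an integer class label."""
    return -math.log(probabilities[label])


def hard_distillation_loss(student_logits, teacher_logits, true_label):
    """0.5 * CE(softmax(Z_s), y) + 0.5 * CE(softmax(Z_s), y_t)."""
    # Hard decision of the teacher: the class with the largest teacher logit.
    teacher_label = max(range(len(teacher_logits)), key=lambda c: teacher_logits[c])
    student_probs = softmax(student_logits)
    return (0.5 * cross_entropy(student_probs, true_label)
            + 0.5 * cross_entropy(student_probs, teacher_label))


student = [2.0, 0.5]   # made-up student logits for a two-class problem
loss_agree = hard_distillation_loss(student, [1.5, -0.3], true_label=0)
loss_disagree = hard_distillation_loss(student, [-1.0, 3.0], true_label=0)
```

When the teacher's hard decision matches the true label, the objective reduces to the ordinary cross-entropy; when they differ, the student is pulled towards both targets.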
The trained and tested DeiT model classifies diabetic retinopathy fundus images
efficiently, and it therefore holds promise for application in telemedical or
national screening programs to screen for severity of DR.

Figure 8: DeiT distillation procedure - The distillation token interacts with
the class and patch token through the transformer encoders. The encoders in
DeiT consist of repeated layers of self-attention and feed-forward network (FFN)
blocks. The objective of the distillation token is to reproduce the teacher’s
prediction instead of the true label. The distillation and class tokens learn by
back propagation.

4.3 Performance Metrics


The F1 score is a measure of a model’s accuracy which takes into account both
precision and recall. Precision is the percentage of DR diagnoses made by the
model which are correct, and recall is the percentage of DR patients who are
diagnosed. We calculate the F1 metric per outcome category and report each
independently. The purpose of utilizing the F1 metric is to take into account
true positive, false positive and false negative results by the model.

$$F_1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
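A short worked example of the per-class calculation is sketched below. The labels are invented for illustration and the function name is hypothetical; the zero-division handling is a convention chosen for the example, not something specified in the paper.

```python
def precision_recall_f1(true_labels, predicted_labels, positive_class):
    """Return (precision, recall, F1) treating one class as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive_class and p == positive_class)
    fp = sum(1 for t, p in pairs if t != positive_class and p == positive_class)
    fn = sum(1 for t, p in pairs if t == positive_class and p != positive_class)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented labels: 0 = no or mild NPDR, 1 = moderate NPDR or more severe DR.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

scores_class_0 = precision_recall_f1(y_true, y_pred, positive_class=0)
scores_class_1 = precision_recall_f1(y_true, y_pred, positive_class=1)
```

Here class 1 has precision 2/3 and recall 1/2, giving an F1 of 4/7, while class 0 has precision 3/5 and recall 3/4, giving an F1 of 2/3.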

5 Conclusion
The incidence of diabetes in the global population has reached 10%. Patients
with diabetes are at high risk for diabetic retinopathy and should be tested
for DR every one or two years according to standard clinical guidelines. The
lack of rapid and cost-effective methods for diagnosis of DR is a major
limiting factor for providing appropriate patient care [WWC+ 22]. Our study has
shown that recent vision transformer methods have superior performance to a CNN
model for the classification of DR. We have identified optimal model training
parameters for the DeiT architecture. Our work demonstrates the ability of
ViTs to improve the accuracy of automated DR classification.

References
[AAKJTT+ 21] Nouar AlDahoul, Hezerul Abdul Karim, Myles Joshua
Toledo Tan, Mhd Adel Momo, and Jamie Ledesma Fermin. Encoding
retina image to words using ensemble of vision transformers
for diabetic retinopathy grading. F1000Research, 10,
September 2021.

[AKCS23] Chandranath Adak, Tejas Karkera, Soumi Chattopadhyay, and
Muhammad Saqib. Detecting severity of diabetic retinopathy
from fundus images using ensembled transformers. arXiv, 2023.

[BR18] Mihalj Bakator and Dragica Radosav. Deep learning and med-
ical diagnosis: A review of literature. Multimodal Technologies
and Interaction, 2(3):47, August 2018.

[CSC+ 14] Ramon Casanova, Santiago Saldana, Emily Y Chew, Ronald P
Danis, Craig M Greven, and Walter T Ambrosius. Application
of random forests methods to diabetic retinopathy classification
analyses. PLoS One, 9(6), June 2014.

[DBK+ 20] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk
Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani,
Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob
Uszkoreit, and Neil Houlsby. An image is worth 16x16 words:
Transformers for image recognition at scale. arXiv, October
2020.

[DDD20] Chitra A. Dhawale, Kritika Dhawale, and Rajesh Dubey. A
review on deep learning applications. In Advances in Systems
Analysis, Software Engineering, and High Performance Computing,
pages 21–31. IGI Global, 2020.

[DSS17] Elia J Duh, Jennifer K Sun, and Alan W Stitt. Diabetic
retinopathy: current understanding, mechanisms, and treatment
strategies. JCI Insight, 2(14), July 2017.

[Fed21] International Diabetes Federation. Idf diabetes atlas, 2021.

[FFAH22] Mohamed M Farag, Mariam Fouad, and Amr T Abdel-Hamid.
Automatic severity classification of diabetic retinopathy based
on densenet and convolutional block attention module. IEEE
Access, 10, 2022.

[GL17] Rishab Gargeya and Theodore Leng. Automated identification
of diabetic retinopathy using deep learning. Ophthalmology,
124(7):962–969, July 2017.

[GPC+ 16] Varun Gulshan, Lily Peng, Marc Coram, Martin C Stumpe,
Derek Wu, Arunachalam Narayanaswamy, Subhashini Venu-
gopalan, Kasumi Widner, Tom Madams, Jorge Cuadros, Ra-
masamy Kim, Rajiv Raman, Philip C Nelson, Jessica L Mega,
and Dale R Webster. Development and validation of a deep
learning algorithm for detection of diabetic retinopathy in reti-
nal fundus photographs. JAMA, 316(22), December 2016.

[HGL+ 23] Kelei He, Chen Gan, Zhuoyuan Li, Islem Rekik, Zihao Yin, Wen
Ji, Yang Gao, Qian Wang, Junfeng Zhang, and Dinggang Shen.
Transformers in medical image analysis. Intelligent Medicine,
3(1):59–78, February 2023.

[HJS22] Abid Haleem, Mohd Javaid, and Ravi Pratap Singh. An era of
ChatGPT as a significant futuristic support tool: A study on
features, abilities, and challenges. BenchCouncil Transactions
on Benchmarks, Standards and Evaluations, 2(4), October 2022.

[HSCA16] Mark B Horton, Paolo S Silva, Jerry D Cavallerano, and
Lloyd Paul Aiello. Clinical components of telemedicine programs
for diabetic retinopathy. Current Diabetes Reports,
16(12):129, December 2016.

[HST+ 22] Steven A Hicks, Inga Strümke, Vajira Thambawita, Malek
Hammou, Michael A Riegler, Pål Halvorsen, and Sravanthi
Parasa. On evaluation metrics for medical applications of
artificial intelligence. Scientific Reports, 12(1), April 2022.

[HZRS15] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
Deep residual learning for image recognition. arXiv, 2015.

[HZRS16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
Identity mappings in deep residual networks. arXiv, March
2016.

[IEE+ 23] Saidul Islam, Hanae Elmekki, Ahmed Elsebai, Jamal Bentahar,
Najat Drawel, Gaith Rjoub, and Witold Pedrycz. A comprehensive
survey on applications of transformers for deep learning
tasks. arXiv, June 2023.

[KOM+ 20] Yusaku Katada, Nobuhiro Ozawa, Kanato Masayoshi, Yoshiko
Ofuji, Kazuo Tsubota, and Toshihide Kurihara. Automatic
screening for diabetic retinopathy in interracial fundus images
using artificial intelligence. Intelligence-Based Medicine,
3-4:100024, December 2020.

[MDB23] José Maurício, Inês Domingues, and Jorge Bernardino. Comparing
vision transformers and convolutional neural networks
for image classification: A literature review. Applied
Sciences, 13(9), April 2023.

[NG18] Mads Fonager Nørgaard and Jakob Grauslund. Automated
screening for diabetic retinopathy - a systematic review.
Ophthalmic Research, 60(1), January 2018.

[NLA+ 19] Katrine B Nielsen, Mie L Lautrup, Jakob K H Andersen,
Thiusius R Savarimuthu, and Jakob Grauslund. Deep Learning-Based
algorithms in screening of diabetic retinopathy: A
systematic review of diagnostic performance. Ophthalmology
Retina, 3(4), 2019.

[NPM+ 22] Dimple Nagpal, S N Panda, Muthukumaran Malarvel,
Priyadarshini A Pattanaik, and Mohammad Zubair Khan. A review
of diabetic retinopathy: Datasets, approaches, evaluation
metrics and future trends. Journal of King Saud University -
Computer and Information Sciences, 34(9):7138–7152, October
2022.

[NPS22] Onnisa Nanegrungsunk, Direk Patikulsila, and Srinivas R
Sadda. Ophthalmic imaging in diabetic retinopathy: A review.
Clin. Experiment. Ophthalmol., 50(9), 2022.

[NvGRA07] Meindert Niemeijer, Bram van Ginneken, Maria S A Russell,
Stephen Rand Suttorp-Schulten, and Michael D Abràmoff. Automated
detection and differentiation of drusen, exudates, and
cotton-wool spots in digital color fundus photographs for diabetic
retinopathy diagnosis. Investigative Ophthalmology & Visual
Science, 48(5):2260–2267, May 2007.

[Ope23] OpenAI. GPT-4 technical report. arXiv, March 2023.

[PDKB22] Brittney J Palermo, Samantha L D’Amico, Brian Y Kim, and
Christopher J Brady. Sensitivity and specificity of handheld
fundus cameras for eye disease: A systematic review and pooled
analysis. Surv. Ophthalmol., 67(5), 2022.

[RLW+ 20] Serge Resnikoff, Van Charles Lansingh, Lindsey Washburn,
William Felch, Tina-Marie Gauthier, Hugh R Taylor, Kristen
Eckert, David Parke, and Peter Wiedemann. Estimated number
of ophthalmologists worldwide (international council of
ophthalmology update): will we meet the needs? The British Journal
of Ophthalmology, 104(4):588–592, April 2020.

[RPC+ 20] Hamza Riaz, Jisu Park, Hojong Choi, Hyunchul Kim, and Jung-
suk Kim. Deep and densely connected networks for classifica-
tion of diabetic retinopathy. Diagnostics (Basel), 10(1), January
2020.

[SAFL10] Nathan Silberman, Kristy Ahlrich, Rob Fergus, and
Subramanian Lakshminarayanan. Case for automated detection of
diabetic retinopathy, 2010.

[SCD+ 17] Sharon D Solomon, Emily Chew, Elia J Duh, Lucia Sobrin,
Jennifer K Sun, Brian L VanderBeek, Charles C Wykoff, and
Thomas W Gardner. Diabetic retinopathy: A position statement
by the american diabetes association. Diabetes Care,
40(3):412–418, March 2017.

[SKZ+ 23] Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muham-
mad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, and
Huazhu Fu. Transformers in medical imaging: A survey. Medical
Image Analysis, 88, August 2023.

[TCD+ 20] Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco
Massa, Alexandre Sablayrolles, and Hervé Jégou. Training
data-efficient image transformers & distillation through attention,
2020.

[TMS20] Borys Tymchenko, Philip Marchenko, and Dmitry Spodarets.
Deep learning approach to diabetic retinopathy detection. 9th
International Conference on Pattern Recognition Applications
and Methods, pages 501–509, 01 2020.

[TPM23] Maria Tariq, Vasile Palade, and Yingliang Ma. Transfer learning
based classification of diabetic retinopathy on the kaggle
EyePACS dataset, 2023.

[TTY+ 21] Zhen Ling Teo, Yih-Chung Tham, Marco Yu, Miao Li Chee,
Tyler Hyungtaek Rim, Ning Cheung, Mukharram M Bikbov,
Ya Xing Wang, Yating Tang, Yi Lu, Ian Y Wong, Daniel
Shu Wei Ting, Gavin Siew Wei Tan, Jost B Jonas, Charumathi
Sabanayagam, Tien Yin Wong, and Ching-Yu Cheng.
Global prevalence of diabetic retinopathy and projection of burden
through 2045: Systematic review and meta-analysis.
Ophthalmology, 128(11), November 2021.

[TW22] Tien-En Tan and Tien Yin Wong. Diabetic retinopathy: Look-
ing forward to 2030. Frontiers in Endocrinology, 13:1077669,
2022.

[UDH+ 04] D Usher, M Dumskyj, M Himaga, T H Williamson, S Nussey,
and J Boyce. Automated detection of diabetic retinopathy in
digital retinal images: a tool for diabetic retinopathy screening.
Diabet. Med., 21(1):84–90, January 2004.

[VSP+ 17] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit,
Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin.
Attention is all you need. arXiv, June 2017.

[WHX+ 21] Jianfang Wu, Ruo Hu, Zhenghong Xiao, Jiaxu Chen, and
Jingwei Liu. Vision transformer-based recognition of diabetic
retinopathy grade. Medical Physics, 48(12), December 2021.

[WLZ18] Shaohua Wan, Yan Liang, and Yin Zhang. Deep convolutional
neural networks for diabetic retinopathy detection by image
classification. Computers & Electrical Engineering, 72:274–282,
November 2018.

[WSK+ 18] Tien Y Wong, Jennifer Sun, Ryo Kawasaki, Paisan
Ruamviboonsuk, Neeru Gupta, Van Charles Lansingh, Mauricio Maia,
Wanjiku Mathenge, Sunil Moreker, Mahi M K Muqit, Serge
Resnikoff, Juan Verdaguer, Peiquan Zhao, Frederick Ferris,
Lloyd P Aiello, and Hugh R Taylor. Guidelines on diabetic eye
care: The international council of ophthalmology recommendations
for screening, follow-up, referral, and treatment based on
resource settings. Ophthalmology, 125(10), October 2018.

[WWC+ 22] Andrew M Williams, Jared M Weed, Patrick W Commiskey,
Gagan Kalra, and Evan L Waxman. Prevalence of diabetic
retinopathy and self-reported barriers to eye care among patients
with diabetes in the emergency department: the diabetic
retinopathy screening in the emergency department (DRS-ED)
study. BMC Ophthalmology, 22(1):237, May 2022.

[YRK+ 12] Joanne W Y Yau, Sophie L Rogers, Ryo Kawasaki, Ecosse L
Lamoureux, Jonathan W Kowalski, Toke Bek, Shih-Jen Chen,
Jacqueline M Dekker, Astrid Fletcher, Jakob Grauslund, Steven
Haffner, Richard F Hamman, M Kamran Ikram, Takamasa
Kayama, Barbara E K Klein, Ronald Klein, Sannapaneni
Krishnaiah, Korapat Mayurasakorn, Joseph P O’Hare, Trevor J
Orchard, Massimo Porta, Mohan Rema, Monique S Roy, Tarun
Sharma, Jonathan Shaw, Hugh Taylor, James M Tielsch, Rohit
Varma, Jie Jin Wang, Ningli Wang, Sheila West, Liang
Xu, Miho Yasuda, Xinzhi Zhang, Paul Mitchell, Tien Y Wong,
and Meta-Analysis for Eye Disease (META-EYE) Study Group.
Global prevalence and major risk factors of diabetic retinopathy.
Diabetes Care, 35(3):556–564, March 2012.

[YXK17] Shuang Yu, Di Xiao, and Yogesan Kanagasingam. Exudate
detection for diabetic retinopathy with convolutional neural
networks. 2017 39th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (EMBC), pages
1744–1747, 2017.

[ZLLS21] Aston Zhang, Zachary C Lipton, Mu Li, and Alexander J Smola.
Dive into deep learning. arXiv, June 2021.

Using Behavioral Economics Insights to
Determine the Likely Causes of the High Rate of
Unemployment in Refugee Camps and What Can
Be Done to Alleviate It

Baraka Muhoza
October 17, 2023
∗ Advised by: Dr. Edoardo Gallo, University of Cambridge

Abstract
Leaving one country and relocating to another because of war, conflict,
or natural disaster has an impact on many different areas, including
the labor market. Despite the difficulties, people strive
to adjust to their new surroundings. This study focuses on the high
unemployment rate in refugee camps, which has a wide-ranging influence on
refugees. It applies behavioral economics to investigate the likely causes of
this problem and to propose solutions that can assist in mitigating it.
In this work, we look at the role of biases in the unemployment crisis, such
as status quo bias, anchoring bias, conformity bias, and implicit
discrimination, all of which are rarely considered in the context of refugee
camps. For example, refugees tend to rely on donations as a default rather
than examining other choices, which has contributed to the problem’s rise.
Taking these biases and heuristics into account, as well as applying
behavioral economics insights to the design of prospective solutions, would
help to reduce the unemployment problem.

1 Introduction
According to Emanuel Cleaver, “Hope is the motivation that empowers the
unemployed enabling them to get out of bed every single morning with unbounded
enthusiasm as they look.” [Cle] Unemployment is one of the economic concerns
that various countries are seeking to address. It occurs when people who
want to work are unable to find work, which has a detrimental influence on the
nation’s economy. When deciding where to work or whether to look for work,
people use various approaches and are subject to numerous heuristics and
biases. When a country has a high rate of unemployment, there are a variety of
repercussions, including a slowdown in economic activity and a fall in economic
production, which promote dependence on government spending and influence how
people make decisions. According to United Nations High Commissioner for
Refugees (UNHCR) estimates, the number of displaced people has topped 60
million for the first time in history [FRU16], disrupting labor markets in both
host cities and refugee camps. According to NISR and a survey [MS17], the
percentage of unemployed refugees living in the Kiziba, Gihembe, and Kigeme
refugee camps was 51.15%.
In response to the effects of unemployment in refugee camps described above,
several policies have been implemented to counteract the high level of
unemployment in the camps and the nation. First, UNHCR has provided various
alternatives, such as loans to refugees to help them gain funds to
establish their enterprises, and it has also enhanced education by providing
greater support in helping refugees seek higher education and find jobs. The
extent to which society exhibits behavioral economics phenomena such as status
quo bias, conformity bias, and anchoring bias affects whether unemployment
rises or falls. Using behavioral economics knowledge and skills will be a
significant step in resolving this issue, since insights from behavioral
economics allow deviations from standard economic assumptions and have
implications for policy design.
Behavioral economics integrates elements of economics and psychology to
understand how and why individuals behave the way they do and is concerned with
the rationality of economic agents [MT00]. Inconsistency in decision-making is
caused by cognitive constraints, and as mentioned in Kahneman’s Maps of Bounded
Rationality: Psychology for Behavioral Economics,
“heuristics are cognitive shortcuts that the human brain develops to cope with
complex problems without calculation to make decisions easier” [Dan03].
Behavioral economics is vital in refugee camps because people are met with
problems that need to be solved and that require decision-making. In refugee
camps there is work available, but people are unwilling to look for work and
still wish to rely on UNHCR funding, which reflects biased decision-making and
causes problems for the system. Furthermore, behavioral economics is crucial in
this case since accounting for different heuristics and biases can lead to
better decision-making, which benefits both the community and the labor market.
The body in charge of refugees (UNHCR) and other partners (CARITAS,
INKOMOKO) working in the camps have faced several challenges while attempting
to find solutions to the high rate of unemployment. First, they have failed to
determine why people continue to rely on donations and why few people want to
take loans even though they are available. This shows that status quo bias and
anchoring bias have not been taken into account, which leads to the design
of poor policies that do not counteract the effects of these biases. Therefore,
using behavioral economics in the labor market can lead not only to a change
in job policy that benefits the entire community by lowering unemployment,
but also to positive outcomes in decision-making that aid in the
development of self-reliance and the economy in general.

This paper examines the following biases and heuristics: status quo bias,
anchoring bias, conformity bias, and implicit discrimination, which contribute
to the high unemployment rate among refugees. Status quo bias is important in
this setting since people are confronted with a variety of options but tend
to stick with the default. In the refugee camp, there are many
alternatives to relying on donations, for instance, seeking work in
the camp or starting a business. People nevertheless continue
to rely on donations, which affects outcomes and increases dependency.
Additionally, anchoring bias is where people rely on the first piece of
information they receive, which affects the outcomes. In refugee camps, people
are anchored by their refugee status, which limits their job searches and
affects their exam results when they are selected, resulting in an increase in
unemployment.
Moreover, conformity bias is where people conform to what other people
are doing, which affects personal decisions and outcomes. In the refugee camp,
there are many things to do, such as starting a business, but people
conform to what others are doing, such as relying on donations,
which affects their job search as well as their willingness to start a
business. Lastly, implicit discrimination matters in the refugee camp, where
there are many opportunities that people cannot access due to discrimination.
Refugees are discriminated against for being refugees and are considered
low-skilled labor, which demotivates them from searching and applying for jobs
and leads to increases in unemployment, as discussed in Section 6. This paper
also makes recommendations for effective actions that should be taken to
address this economic issue while minimizing the effects of biases and
heuristics on the local community.
This essay is structured as follows. Section 2 provides background
information on Congolese refugees in Rwanda and their labor-market situation.
Sections 3 to 6 explain how status quo bias, anchoring bias, conformity bias,
and implicit discrimination, respectively, affect unemployment in refugee
camps. Section 7 proposes solutions to these biases. Finally, Section 8
concludes with policies that have been put in place, as well as
recommendations that policymakers can use to minimize or eliminate the
effects of biases and heuristics on unemployment in refugee camps.

2 Background and context about refugees


A refugee is a person who has left their home country during a conflict in
pursuit of safety in another. In the late 1990s, violence in the DRC led many
Congolese to seek safety in Rwanda, where they live in camps including
Kiziba, Mahama, Mugombwa, Nyabiheke, and Kigeme. Establishing oneself and
earning a living in the destination country is a challenging process for many
refugees [Yak08]. These refugees abandoned homes where they could meet their
fundamental needs. As of 1 September 2016, UNHCR's Rwanda office was
assisting almost 75000 Congolese refugees [FRU16], a figure that reflects a
dramatic increase in the number of refugees. After arriving, they begin to
adjust to the new situation of receiving UNHCR aid to meet their basic needs.
In addition to losing their homes, they have encountered several problems,
including overcoming social and economic challenges and trauma, finding work,
and managing careers after leaving their home country [CPT06].
People try numerous ways to battle unemployment during these difficult
economic times, but their efforts are often unsuccessful because insights
from behavioral economics are not applied. For example, instead of applying
for loans that could provide the capital they need to start a business, they
may treat their refugee status as an anchor that prevents them from doing so.
Starting a business may be one of the best ways to combat unemployment, so
this illustrates the effect of anchoring bias on refugee unemployment.
Furthermore, rather than seeking work as fresh graduates or coming up with
new ideas, they follow in the footsteps of many unemployed individuals by
relying only on donations [HBL92], which inhibits job searching and
independent decision-making and diminishes the likelihood of being hired.
While all of these factors contribute to unemployment in refugee camps,
applying insights from behavioral economics could reduce unemployment in our
community.

3 Status quo bias


Status quo bias is the tendency of individuals to remain at the status quo,
or to stick with defaults, because the disadvantages of leaving it loom
larger than the advantages [KT91]. This was demonstrated by Samuelson and
Zeckhauser's experiment, in which they estimated the likelihood of an option
being chosen when it is the status quo, or when it competes as an alternative
to the status quo, as a function of how frequently it is chosen in a neutral
setting. Their findings revealed that an option became significantly more
popular when it was described as the status quo. Furthermore, the advantage
of the status quo grows as the number of alternatives increases. Hartman,
Doane, and Woo tested status quo bias in a field setting using a survey of
California electric power consumers. The consumers were asked about their
preferences regarding service reliability and rates, and were told that their
answers would help determine company policy in the future. The respondents
fell into two groups, one with much more reliable service than the other.
Each group was asked to state a preference among six combinations of service
reliability and rates, with one of the combinations designated as the status
quo. The results demonstrated a pronounced status quo bias. In the
high-reliability group, 60.2 percent selected their status quo as their first
choice, while only 5.7 percent expressed a preference for the low-reliability
option currently being experienced by the other group, though it came with a
30 percent reduction in rates. The low-reliability group, however, quite
liked their status quo, with 58.3 percent of them ranking it first. Only 5.8
percent of this group selected the high-reliability option at a proposed 30
percent increase in rates [KT91].
Status quo bias has been a recurring issue throughout history, and it has had
an impact on both economic texts and daily life. The term "status quo" is
widely used when people choose to do nothing or to continue with their
previous decision [SZ88]. This means that, when given multiple options,
people often end up choosing nothing and remaining with the defaults, even
though they are advised to make better decisions and to try out different
options in order to pursue change more effectively. Each person in the camp
has a variety of options and chooses the ones that suit them best. However,
decisions made by locals differ from those made by people in refugee camps.
For example, people receive monthly donations from UNHCR to meet their basic
needs, but various partners also work around the camp to help refugees meet
their needs and offer them several opportunities, mostly volunteering, which
can provide a small income. In this case, people opt to rely heavily on
donations, demonstrating a bias in favor of the status quo that denies them
access to alternative opportunities, such as working with a partner. A strong
tendency to rely on donations, in turn, affects the entire community as well
as the next generation, because it develops dependency and discourages people
from looking for work or carrying out the ideas they have.
Furthermore, better decision-making informed by perspectives from behavioral
economics and psychology can help to overcome status quo bias. Status quo
bias is significant for people in the camps because it discourages them from
accepting or pursuing new employment. The donation serves as the status quo
in the camps, where a donation can be defined as the monthly amount of money
that each individual receives to help meet necessities [HBL92]. It is no
surprise that many refugees depend on humanitarian aid for everyday survival
[Hov11]. There is a relationship between this donation and unemployment.
Because the donation serves as income [HBL92], an increase in donations leads
to an increase in unemployment: when people receive large sums of money, they
can meet all of their basic needs and cover other expenses as well, which
discourages them from looking for or doing a job. Because income from
donations deters them from looking for a job, doing a job, or exploring other
options that would allow them to earn more, relying on donations becomes the
status quo. People are also influenced by a variety of other factors,
including the labor market's complexity, the number of available options, and
how well working conditions match workers' preferences and knowledge. People
have limited cognitive capacity when presented with a large number of options
[LBI98]. The refugee camp is different, because opportunities are scarce
relative to the number of individuals who want them, which discourages some
applicants. Nonetheless, instead of waiting for the few opportunities to
present themselves, refugees could create opportunities by starting their own
businesses and seeking other assistance.
There are numerous strategies to reduce bias induced by complexity and the
misapplication of knowledge, including providing people with complete
information. As noted above, few options are available in the camp and little
information is offered, so finding work requires considerable cognitive
effort, which increases the likelihood that individuals adhere to the
defaults. Furthermore, the low level of education in the camp leads to a
weaker understanding of the labor market, of application processes, and of
how different companies operate, which prevents refugees from being hired or
seeking better jobs.
A better-designed solution to support employment services would be improved
use of technology in the camp. A lot of information is available online, such
as job openings and job requirements, but refugees do not access it and rely
solely on the information presented in the camp, which affects their job
search. Additionally, employers should be more succinct when presenting
information to workers, which could also help. Therefore, advancing
technology and improving the way information is presented in the camp may
help lessen status quo bias, widening access to jobs and thereby reducing
unemployment.

4 Anchoring bias
The anchoring effect was defined by Tversky and Kahneman (1974) as the
disproportionate influence on decision-makers to produce judgments biased
toward an initially supplied value. They demonstrated it in a study in which
participants were asked to estimate the percentage of African countries in
the United Nations after a wheel of fortune numbered from 0 to 100 was spun
to generate a random reference value. Before making the absolute judgment,
participants were asked whether the actual answer was higher or lower than
the reference value supplied (comparative judgment) [TK74]. People frequently
estimate by starting from the first information they acquire, the initial
reference, and adjusting to arrive at a final answer. This also occurs in
refugee camps, where a variety of information about acquiring jobs and loans
spreads through the group, but people rely on the first information they
receive to make the ultimate decision.
The anchor may be qualitative or quantitative. In the qualitative case,
people anchor on the image of a refugee as someone who is underprivileged or
who will experience discrimination. This affects their decisions, since it
deters them from looking for work or taking an exam when the application asks
them to identify as a refugee or otherwise. As a result, one way to change
things is to set a default under which applicants are not identified as
refugees. This could motivate them and help them compete in the job market,
leading to a drop in unemployment.
The anchor can also be quantitative, as people tend to base their reservation
wage on the pay of locals, which they use as a reference point. For example,
if a local teacher earns 130 per month, this could become his initial value
while looking for work abroad. An anchor is created when two people consider
doing the same work but earning different wages. As a result, they anchor on
different values, and because anchoring bias has such a large impact on
decision-making, fewer people take up the available occupations or
participate in the labor force.
Because this anchor influences people's decisions, one possible solution is
to establish a new anchor, which may entail collaboration among government,
citizens, and non-governmental organizations. As noted above, the anchor may
be qualitative or quantitative, depending on how it influences the community.
First, these actors could disseminate information or launch a campaign to
nudge people to make decisions based on the new reference point. Concerning
career anchors, UNHCR and other partners could encourage refugees to start
their own businesses by sharing success stories of those who pursued
unrelated careers and by assisting them in obtaining loans that allow them to
put their business ideas into action and thus move away from their anchors.

5 Conformity bias
Conformity bias is the tendency to change one's thoughts or behavior to fit
in with others [Nik23]. One study examined this in the wave of frenzied
buying of culture-related goods in Korea, where brands like Canada Goose and
The North Face became essential commodities that were consumed en masse. This
appeared unreasonable because the goods were costly and imposed a financial
strain on some consumers. However, a person who did not own one might face
discrimination from those around them. Following a study of Korean consumers,
researchers discovered that the desire for these brands was not exclusively
due to a liking for the brand or the culture that these companies reflected.
It was prompted by a severe "fear of missing out" (FOMO). The desire to
belong to the mainstream group (or the fear of being excluded from it) was a
crucial factor in the consumption of these very popular brands [KS19].
There are various reasons why people conform. First, there is informational
conformity, which occurs when we seek guidance and knowledge from a group,
such as a class. Second, there is normative conformity, where individuals
conform in order to align with the public [Nik23]. This applies in refugee
camps. A UNHCR survey found that when people were considering taking out
loans, many of them held off because they had heard that doing so might cause
them to lose their refugee status or interfere with other services they were
receiving. This deters individuals who want to take loans to implement
business ideas that could benefit the labor market, and so leads to an
increase in unemployment.
This conformity is linked to unemployment in several ways. UNHCR has
strengthened education in the camp, and each year the number of graduates
increases but the number of available jobs does not. Overcoming this requires
innovation, creativity, and putting into practice what graduates have
studied, but because young people want to conform, they are drawn to what
others are doing, which discourages job searching and increases unemployment.
In addition, because the unemployment rate in the camps is high, young
graduates are discouraged from looking for work or starting their own
businesses because they want to behave like their peers, which contributes
further to unemployment. There are several ways to combat conformity bias,
including education and training that equip people with the skills needed in
the labor market and that challenge them, which could increase self-awareness
and thus improve economic outcomes, as people would make decisions based on
what affects them. Furthermore, encouraging independence in decision-making
would encourage people to start their own businesses, resulting in a
reduction in unemployment in the refugee camp and an increase in
self-reliance.
Furthermore, insights from behavioral economics could help combat these
biases. Institutions can change the way they present information to large
groups of people, since presentation affects decision-making. Teaching
refugees and collaborating with different institutions could also produce
better outcomes, because people would then pass reliable information on to
one another, which can reduce conformity.

6 Implicit discrimination
People may hold skewed opinions and unconsciously discriminate for a variety
of reasons, for example the characteristics of a specific group, status, and
productivity. Implicit discrimination can be quantified with the implicit
association test, which measures the test taker's reaction speed when linking
names, words, and images in order to reflect the strength of unconscious
mental associations [GS98].
Implicit discrimination has an impact on unemployment in refugee camps. To
begin with, refugees are unlikely to be hired outside of the camps because
they lack Rwandan identification, which affects the job market and
discourages individuals who wish to apply for jobs advertised there. Second,
because the level of education in a refugee camp is low and refugees are
unable to acquire a university-level education, employers regard them as
low-skilled labor, which deters employers from hiring them and increases
unemployment.
Implicit discrimination is a significant driver of discriminatory behavior in
the job market. There is significant racial inequality in the countries that
host refugee camps, where refugees are unlikely to be hired because they lack
the identification card required of citizens. This affects the job market
because refugees are demotivated by the discrimination, which discourages
their job search. Refugee employees struggle to obtain work and perform
poorly in the labor market as a result of implicit bias. Implicit
discrimination arises from ambiguity caused by faulty information or by how
an employer perceives an individual, as well as from the expectation that
immigrant workers are low-skilled, have a poor work ethic, and have low
productivity. According to Bertrand and Duflo [BD16], when individual
information is scarce, group membership might provide useful information
about expected productivity.
To address this issue, various criteria should be considered in the hiring
process. To begin with, technology could be used in recruiting systems:
candidate addresses and qualifications would be entered into a computer,
which would then determine whom to offer a job based on abilities and
capability, which might lead to a reduction in discrimination.
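As a rough illustration of the recruiting system described above, the
following Python sketch ranks candidates using only skill-related fields and
never reads identifying fields such as refugee status. The field names, the
weights, and the sample candidates are hypothetical and are not taken from any
real recruiting system; this is a minimal sketch under those assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str               # identifying field, never used for scoring
    is_refugee: bool        # identifying field, never used for scoring
    years_experience: int   # skill-related field
    test_score: float       # skill-related field, assumed to range 0-100


def blind_score(candidate: Candidate) -> float:
    """Score a candidate from skill-related fields only (hypothetical weights)."""
    return 0.6 * candidate.test_score + 4.0 * candidate.years_experience


def shortlist(candidates: list[Candidate], k: int) -> list[Candidate]:
    """Return the k highest-scoring candidates under the blind score."""
    return sorted(candidates, key=blind_score, reverse=True)[:k]


# Hypothetical sample data, for illustration only.
candidates = [
    Candidate("A", is_refugee=True, years_experience=3, test_score=82.0),
    Candidate("B", is_refugee=False, years_experience=1, test_score=85.0),
    Candidate("C", is_refugee=True, years_experience=0, test_score=70.0),
]
top = shortlist(candidates, k=2)
print([c.name for c in top])
```

Because `blind_score` never reads `name` or `is_refugee`, changing those
fields cannot change the ranking.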
Finally, there is unfairness in the financial sector [KT86], since refugees
are unlikely to be able to receive loans, which reduces the availability of
capital to build their own firms and so raises the level of unemployment.
Using behavioral economics insights can lead to better handling of
discrimination in labor economics, which can lead to improved economic
outcomes. Implicit discrimination affects outcomes when numerous
opportunities are available but people are unable to take advantage of them
because of bias in the labor market. Advocacy and awareness-raising may aid
in the elimination of certain biases.

7 Proposed solutions to the mentioned biases and heuristics
7.1 Job search assistance and employment service
There are numerous solutions to the problem of unemployment in refugee camps
described above. One of them is job search support, which helps people look
for and find jobs. This could be supported by various interconnected programs
that operate both inside and outside the camp. For example, UNHCR, in
collaboration with Kepler, began assisting university and high school
graduates to find jobs by creating a platform for refugee alumni that informs
them of available job vacancies. In addition, CARITAS, a partner working with
UNHCR, assists refugees with counseling and job search assistance, with
workers gaining access to services through various points.
There are various barriers to job searching, such as setting pay expectations
and procrastinating in looking for work, so these measures will help to
overcome the problems of job search and thereby reduce unemployment.
According to behavioral economics, individuals have limited attention and
cognitive capacity, which causes a variety of issues ([TS92], [TK74],
[IL00]). As a result, this support may help individuals handle complexity,
increasing their chances of finding work. Designing a peer-to-peer solution
in which refugees discuss existing vacancies, as well as a well-designed job
search process, could help to overcome these obstacles. Furthermore, people
in the camp are not good at estimating the value of a job search, even though
this job assistance is worthwhile. Evidence suggests that people are weak at
recognizing whether a search is effective, or that they underestimate the
value of a search [Spi10].
Because it helps individuals understand their options before entering the job
market, job search assistance is one of the better ways to overcome biases
such as status quo bias and anchoring, which are obstacles to employment. In
addition, officials can innovate in how information is delivered to citizens.
One behavioral obstacle to job search and employment is that individuals may
have biased salary expectations, which can be debiased by carefully designed
interventions [BI97].
Thus, incorporating insights from behavioral economics is important, because
it suggests that the way job options are framed can affect how individuals
respond to the choices.

7.2 Job training


Job training, which is crucial in combating unemployment, is one strategy
that can be used to offset these biases. In job training, people are taught
how to create their own jobs. UNHCR has demonstrated this by collaborating
with other groups to provide training on how to establish a business and how
to manage money properly. In addition, they provide loans, which aid in the
implementation of business ideas, resulting in a reduction in unemployment.
Furthermore, job training enables refugees to acquire the skills that firms
demand by teaching them how to build experience and how to develop different
skills through online courses or sessions, thus increasing their chances of
being hired.
According to behavioral economics, the failure or disappointing results of
some job training programs may be due to a failure to select people who could
benefit from training [LBI98]. This is also seen in the camp, where some
people enroll in training only for the money that is offered, so the training
does not benefit them and they do not put what they learned into practice. A
successful program, by contrast, lowers complexity and the demands on
participants' willpower. Furthermore, improving counseling services that
accompany the training provided in the camp could be beneficial in reducing
unemployment. Finally, an emphasis on simplifying the user experience in job
training may increase people's participation in the training.
In sum, when insights from behavioral economics are applied to job training,
people are more likely to apply for and work in unrelated careers or
unfamiliar jobs, which can lead to a decrease in unemployment as well as an
improvement in workers' understanding of the labor market, both of which are
important.

7.3 Framing
Framing is defined as the underlying or combined effect of both status quo
bias and anchoring bias on how individuals see information. Framing shifts a
decision either positively or negatively, and multiple options can be framed
in different ways. This also applies in refugee camps. As mentioned earlier,
people frame loans differently, which causes them to react differently and
affects the implementation of their business ideas. The same holds for
refugee status: people do not apply for jobs because they know a refugee is
less likely to be hired, which can be considered a loss. On the other hand,
refugee status may be presented as a benefit if, in a given situation,
employers offer refugees extra employment opportunities while also
encouraging them to pursue employment, in which case it would be a gain.
Framing influences choices by working within the constraints of biases and
behavioral habits rather than overcoming them [LBM12]. One area of study
examines how job search assistance is structured. For example, framing
outcomes as losses rather than gains has been found to alter behavior in a
variety of circumstances [Rot06]. As noted above, framing influences an
individual's propensity to take risks, such as applying for loans, starting a
new career, or doing unrelated work.
Several approaches may be taken to address this, including modifying the way
information is presented and offering employment counseling services in which
salary information might drive people to apply or to seek work elsewhere,
which can lead to a drop in unemployment. Furthermore, the context and
language in which information is presented should be taken into account,
because framing can reverse choice preferences. Information presented in the
right place and in a better way can move people away from their anchor. For
example, a different approach could frame remaining unemployed as a loss and
reduce the donation that people receive, which could drive individuals to
start new occupations, either by starting their own business or by looking
for a job, and thus develop a positive attitude toward job prospects while
also promoting job search.

7.4 Peer-to-peer solution


Peer-to-peer solutions entail one refugee assisting another. This can be
applied in different ways. First, defaults can change. Refugees have
abilities beyond relying on donations, and some of them carry out economic
activities, including trade, which have shifted their default away from
donations. This can motivate other refugees to do the same and, as a result,
reduce unemployment. In addition, because of the high rate of unemployment,
people are demotivated from searching for jobs because they conform to what
other people are doing. However, there are other people who have started
businesses from small savings, which has had a good influence on their lives,
and who are collaborating. This creates conformity in a positive direction,
as many people imitate them by starting their own enterprises.
Furthermore, [Ami06] and [Gra05] found that networking with refugees from the
same country increases refugees' employment opportunities and access to
credit. For example, refugees from Sudan in Cairo frequently find employment
with Egyptian-Sudanese business owners who prefer to hire them [Gra06]. This
also applies to the peer-to-peer solution, in which refugees network with
refugees in other countries to share opportunities and experiences. Such
networking moves refugees away from the anchor that refugees cannot access
loans: by networking, refugees will be able to lend money among themselves,
easing the implementation of their business ideas, and as a result many
businesses will be opened.

8 Conclusion
A review of insights from behavioral economics and labor market policies
suggests various policies that can be used to combat unemployment in refugee
camps. It identified solutions such as job search assistance, job training,
and framing that could help overcome the biases discussed, namely implicit
discrimination, anchoring, conformity, and status quo bias, which contribute
to the high unemployment rate. Several frameworks have been put in place to
reform existing policies for reducing unemployment and to establish new ones.
This study collected and applied experimental investigations and concepts
from previous literature and practical research. They all demonstrate how
heuristics influence decision-making as well as economic outcomes. The
proposed reforms and adjustments in job training, job search assistance, and
framing are limited, and there may be other behavioral approaches that can be
used to improve employment and the labor market in refugee camps. My future
research will focus on evaluating existing policies and new solutions to
unemployment.

References
[Ami06] Baruti Amisi. An exploration of the livelihood strategies of durban
congolese refugees. Geneva: UNHCR, 2006.
[BD16] Marianne Bertrand and Esther Duflo. Field experiments on discrimi-
nation. Handbook of economic field experiments 1, 2016.

[BI97] Linda Babcock, George Loewenstein, and Samuel Issacharoff. Creating
convergence: Debiasing biased litigants. Law and Social Inquiry 22.4,
1997.
[Cle] Emanuel Cleaver. www.brainyquote.com/authors/emanuelcleaverquotes: :text=hope
BrainyQuote.
[CPT06] Val Colic-Peisker and Farida Tilbury. Employment niches for recent
refugees: Segmented labor market in twenty-first century australia.
Journal of refugee studies 19.2, 2006.

[Dan03] Daniel Kahneman. Maps of bounded rationality: Psychology for
behavioral economics. American economic review 93.5, 2003.

[FRU16] United Nations High Commissioner for Refugees (UNHCR). Global forced
displacement hits record high. 2016.
[Gra05] Katarzyna Grabska. The analysis of the livelihood strategies of su-
danese refugees with closed files in egypt. cairo, egypt. American
University in Cairo, 2005.
[Gra06] Katarzyna Grabska. Marginalization in urban spaces of the global
south: Urban refugees in cairo. Journal of refugee studies 19, 2006.
[GS98] Anthony G. Greenwald, Debbie E. McGhee, and Jordan L. K. Schwartz.
Measuring individual differences in implicit cognition: the implicit
association test. Journal of personality and social psychology 74.6,
1998.
[HBL92] Barbara Harrell-Bond, Eftihia Voutira, and Mark Leopold. Counting
the refugees: gifts, givers, patrons, and clients. Journal of Refugee
Studies, 1992.
[HD18] Sameena Hameed, Asad Sadiq, and Amad U. Din. The increased
vulnerability of refugee population to mental health disorders. Kansas
journal of medicine 11.1, 2018.
[Hov11] Hovil. The dilemmas of congolese refugees in rwanda. citizenship and
displacement in the great lakes region. International Refugee Rights
Initiative, 2011.
[IL00] Sheena S. Iyengar and Mark R. Lepper. When choice is demotivating:
Can one desire too much of a good thing? Journal of personality and
social psychology, 2000.
[KS19] Inwon Kang, Haixin Cui, and Jeyoung Son. Conformity consumption
behavior and fomo. Sustainability 11.17, 2019.
[KT79] Daniel Kahneman and Amos Tversky. Prospect theory: An analysis
of decision under risk. Econometrica, 1979.
[KT86] Daniel Kahneman, Jack L. Knetsch, and Richard Thaler. Fairness
as a constraint on profit seeking: Entitlements in the market. The
American economic review, 1986.
[KT91] Daniel Kahneman, Jack L. Knetsch, and Richard H. Thaler. Anomalies:
the endowment effect, loss aversion, and status quo bias. Journal
of Economic Perspectives 5.1, 1991.
[LBI98] Linda Babcock, George Loewenstein, and Samuel Issacharoff. Creating
convergence: Debiasing biased litigants. Law and Social Inquiry, 1998.
[LBM12] Linda Babcock, William J. Congdon, Lawrence F. Katz, and Sendhil
Mullainathan. Notes on behavioral economics and labor market policy.
IZA Journal of Labor Policy, 2012.

[MS17] Katrin Marchand, Craig Loschmann, and Melissa Siegel. Forced
migration and labor market outcomes. The Case of Congolese Refugees
in Rwanda, 2017.
[MT00] Sendhil Mullainathan and Richard H. Thaler. Behavioral economics.
2000.
[Nik23] Kassiani Nikolopoulou. What is conformity bias? Definition and
examples. Scribbr, 2023.
[Rot06] Alexander J. Rothman et al. The strategic use of gain- and
loss-framed messages to promote healthy behavior: How theory can inform
practice. Journal of communication 56.suppl1, 2006.
[Ruz21] Yvette Ruzibiza. 'They are a shame to the community...' stigma,
school attendance, solitude and resilience among pregnant teenagers
and teenage mothers in mahama refugee camp, rwanda. Global public
health 16.5, 2021.
[Spi10] Johannes Spinnewijn. Employment and social protection. 2010.
[SZ88] William Samuelson and Richard Zeckhauser. Status quo bias in decision
making. Journal of Risk and Uncertainty, 1988.
[TK74] Amos Tversky and Daniel Kahneman. Judgment under uncertainty:
Heuristics and biases: Biases in judgments reveal some heuristics of
thinking under uncertainty. Science, 1974.
[TS92] Amos Tversky and Eldar Shafir. Choice under conflict: The dynamics
of deferred decision. Psychological science 3.6, 1992.
[Yak08] O. Yakushko, A. Backhaus, M. Watson, K. Ngaruiya, and J. Gonzalez.
Career development concerns of recent immigrants and refugees. Journal of
Career Development, 2008.

14

376
Are Champions Born Or Made?

Yashvendra Singh
October 18, 2023

Abstract
Champions hold the world’s attention, and their performances both
inspire and generate curiosity. Whether they are born champions or are
the product of scientific training and tremendous hard work is a debate
that resurfaces with every convincing victory that produces a seemingly
invincible winner. Sporting history is replete with examples where the
sporting fraternity was prompted to research the characteristics and
traits behind such invincibility. Some studies suggested that the
domination of Kenyan and Ethiopian runners in the middle- and
long-distance events and Usain Bolt’s phenomenal success could be
attributed to physiological traits such as higher haemoglobin levels and
a muscle fibre composition suited to endurance running or to speed. Many
believed that Michael Phelps’s wide wingspan and his genetic disposition
to produce less lactic acid gave him an unfair advantage over his
competition. There are many such examples that keep bringing back the
question: are champions made or born? Even the more pragmatic
researchers, who emphasize scientific training, hard work and personal
motivation, have not been able to dismiss the role of genetic
predisposition. The level of competition and hard work that these
champions endure to become winners makes this an interesting case study.
This paper analyses the complex interplay between the roles played by
genetic disposition and training in an athlete’s performance.

1 Introduction
The impact of genetics on sports performance is a hugely contentious topic in
the sporting fraternity. Some athletes, like Michael Phelps, were hailed as
supernatural and genetically blessed: his unusually wide wingspan,
double-jointed ankles and a body that apparently produced half the lactic acid
of his competitors gave him a huge biological advantage over his fellow
athletes. Others, like Caster Semenya, the two-time Olympic champion from South
Africa, became the subject of controversy. Her body allegedly produced higher
testosterone levels than most women, a finding that led the Court of
Arbitration for Sport to rule that she would have to lower her testosterone
levels through medication to compete in the women’s category [Ing19,SEM16].
The British magazine New Statesman included her in the 2010 edition of its
annual list of “50 People That Matter” for unintentionally instigating “an
international and often ill-tempered debate on gender politics, feminism, and
race, becoming an inspiration to gender campaigners around the world”.

∗ Advised by: Bridget Callaghan

The absolute domination of Kenyan long-distance runners is another trigger
that sparked the debate on genetic endowment. The physiological advantages of
African runners were studied by Weston et al., who reported that “Africans had
elevated citrate synthase and 3-hydroxyacyl CoA dehydrogenase activity and
enhanced resistance to fatigue in a treadmill trial designed to imitate the
stresses involved in 10 km running” [AR99]. They also showed lower blood
lactate concentrations at higher speeds. Another study reported that they had
relatively higher haemoglobin and haematocrit, greater metabolic efficiency,
and a favourable skeletal-muscle-fibre composition and oxidative enzyme
profile, which gave them an advantage over equally motivated and trained
athletes [WR12]. Research has also shown that one of the main contributors to
strength and power, both essential for a sports champion, is biomechanical,
which highlights genetics once again. Since most sports demand agility and
brute force, “joint torque – this is how fast and/or powerfully a joint can
move based on the force that a muscle applies to it” is important [Coy07].
Greater joint torque helps an athlete generate more power and speed in
rotational movements, maintain better balance, move precisely towards a given
target even at odd angles, sustain effort for longer and recover faster, all
of which are crucial for an athlete’s performance [Mus23]. Interestingly,
orthopaedic research on the reconstruction of joints and/or soft tissue
attachments has shown that the attachment site of a tendon is a crucial
determinant of the range of motion of a joint and of joint torque at various
positions [Yam07,Miz23]. These muscle attachment mechanisms and positions are
all genetically determined. Strength and endurance also depend on muscle fibre
type [Tes85]. Fast twitch muscle fibres produce more force and power than slow
twitch fibres, primarily because they are larger, which gives athletes with a
higher proportion of fast twitch fibres a genetic advantage, especially in
sprinting. This was often cited as one of the attributes that supposedly made
Usain Bolt unstoppable and matchless [DLCS76]. The type of muscle fibre may
therefore have a direct bearing on an athlete’s performance: slow-twitch fibres
support long-distance running, while fast-twitch fibres support the quick,
powerful movements needed for sports like sprinting or weightlifting
[TP85,ME19].
It has also been suggested that “professional bodybuilders more than likely
have some sort of myostatin mutation that allows them to build and maintain
such muscle mass” [Sch04]. Furthermore, research findings that elite marathon
runners are simply better at dissipating heat than other runners, owing to
efficient tendon hysteresis, and have a higher maximum oxygen capacity again
take us back to genetic predisposition [FEMNes, MD94].
The question of the genetically blessed has always remained enigmatic. Let’s
use India as an example, as it has the largest population of young people in
the world. Are Indians genetically better suited to chess than to soccer and
basketball? The query assumes increased pertinence as an 18-year-old Indian
chess grandmaster, Rameshbabu Praggnanandhaa, takes on Magnus Carlsen in the
final of the FIDE World Cup at Baku, Azerbaijan. He is an exceptional talent,
motivated by fellow Indians like Viswanathan Anand, himself a former champion.
Some might wonder why the most populous country, with a population of 1.4
billion, has never played at a soccer or basketball World Cup and has only a
handful of Olympic medals in athletics. Others might point out that India’s
cricketing prowess sends confusing signals: two of its most renowned players,
Sachin Tendulkar and Sunil Gavaskar, became legends in their craft despite
very small physical frames. They were known to take on the might of some of
the fastest, physically well-endowed bowlers from other cricketing nations.
The world is replete with such examples, with footballers like Lionel Messi
and Diego Maradona making it to the very top despite a shorter frame, belying
the genetics argument to some extent.
In the same vein, while we see a distinctive edge enjoyed by black athletes in
many sports requiring high speed and force, and their domination of the
popular NBA, we also see Asians and Caucasians dominate racket sports like
tennis and badminton, which also require high levels of agility and brute
force. Such contradictions prompt the counter-argument that genetic
predisposition is a significant but not the sole prerequisite for excelling in
a particular sport.
With the advent of technical tools that make sports training more scientific,
this debate leads to the larger debate on sports genomics. We attempt to
analyse whether champions can be trained and made or whether they need a
certain genetic predisposition for training to yield the desired results. The
role played by hard work, motivation and a supportive team in an athlete’s
success, as elaborated in Ericsson’s theory of deliberate practice and its
significance in champion development, adds an interesting dimension that
cannot be ignored [Eri93].

2 The Key Fundamentals of Sporting Excellence

2.1 Physical
The physical qualities of athletes and players are among the most frequently
studied contributing factors of performance [Fal04]. Sport is built on a
strong physical foundation; this is an unavoidable principle. Unfortunately,
sports training has become a lucrative industry worth trillions of dollars, in
which parents and athletes are given false assurances and introduced to a
tough regimen that they cannot keep up with. The parameters of physical
fitness vary with the sport. For example, a diminutive player is at a clear
disadvantage in basketball, but the same player might turn out to be lethal in
sports like soccer, cricket or golf, where height is not a decisive advantage.
Recreational sport may be for everyone, but competitive sport at any level
involves fierce competition, and one must have the physical endurance to take
on the challenge. This prerequisite can be set aside only for sports like
chess and other board games that do not involve physical activity. The picture
is not as simple and straightforward as it seems, because different sports
have different physical requirements. As Vaeyens et al. (2008) argued, the
nature of the sport discipline itself defines to what extent the
unidimensional components intervene [Vae08]. Moreover, even within a specific
sport discipline, the physical requirements vary greatly depending on the
position of the players on the field. This position-specific adaptation has
been observed in various sports, including volleyball [She09], handball
[Zap11] and rugby [Del13]. For decades, coaches have obsessed over “the tale
of the tape”, measuring height, weight and reach to determine a player’s
suitability at a competitive level. More recent research out of UC Berkeley
suggests that the length of an athlete’s arms relative to their height might
be even more important than previously believed in sports like basketball
[Bah18], making “wingspan” a key term in the NBA. The same advantage was
exploited to perfection by Michael Phelps in his sporting career. Despite
variations by sport and role, the one thing most studies agree on is that
physicality matters in sports.

2.2 Technical
Technique in any sport is important. Grosser (1982) defines technique as the
ideal model of a movement relative to a specific sport activity [Roc86]. It
refers to the movements and postures adopted to maximise impact, optimize
performance, prevent injuries and ensure consistency under pressure with
minimal wastage of effort and force. Technique is a key to success in sports.
For champions, techniques are worked on scientifically and personalised after
careful analysis of their physical attributes, natural abilities and
strengths. Michael Phelps perfecting the deep catch or sculling technique to
propel himself faster, and Michael Jordan using biomechanics to perfect his
famous fadeaway, are both examples of athletes perfecting techniques to gain a
competitive advantage.
Michael Jordan’s accomplishments are attributed to his brilliant athleticism
and superior technique. “His planted foot was attached to the floor, making it
easier for him to explode away from his defender at just the right moment” is
a small example of a technique used to perfection. “While others rely on
instinct or muscle memory to make their shots in moments of white-hot
pressure, Federer can delay the moment when he must commit to a shot until
impossibly late” [Fyl09], which is another example of a technique used to
perfection, in this case by the legendary tennis player Roger Federer.

2.3 Tactics and Strategy
Brute force and physical attributes mean little without a refined skill set.
That is why athletes add tactical elements to their training and work on a
winning strategy. Athletes need a comprehensive understanding of the strategic
aspects of the game, and their strategies must withstand the test of a
real-time game or competition rather than hold up only within the precincts of
the training arena. For this, they not only plan and strategize for themselves
but also carefully analyse the strategy of their competitors, so that they
keep the winning edge by anticipating all contingencies and the counters to
them. This is crucial as it “requires players to maintain high quality of
perception, concentration and decision-making for a long time, even when the
player is physically and psychologically overloaded” [PP18]. “Tactics
therefore elaborates the strategic intention of preparing the player or team
in real conditions of a match and solving situations in match. Tactics point
to the possibilities of solving certain sub-situations within the strategy. It
focuses on the practical implementation of such situations in the match”
[SO18]. Tactical preparation is the process of equipping an athlete with the
knowledge, practical learning and skills that enable the player to choose the
optimal solution in each game situation and apply it effectively. This is
crucial for success.

2.4 Psychological Power


Winning is also a mind game. It takes a strong mind and personality to
withstand the pressure of competitive sports and to remain focussed and
determined in a real-time competition that has its fair share of unpredictable
variables. “The psychological factor is usually the determinant that
differentiates a winner and a loser in sports” [Bre09]. In a study by Gould,
Dieffenbach, & Moffett (2002) [Gou02] involving ten Olympians, it was reported
that “mental toughness is one of the highest ranked psychological
characteristics that determine a successful performance”. Given the intensity
of competition at the highest level, it comes as no surprise that athletes,
coaches, and applied sports psychologists have consistently referred to mental
toughness as one of the most important psychological characteristics related
to outcomes and success in elite sport.
Grit, optimism, resilience, and perseverance are traits that set champions
apart from the others. They need to be mentally tough to sustain the pressure
of competition and to keep their self-determined motivation and optimism
regardless of the variables and changing dynamics of their environment.

3 Impact of Genetics on the Key Fundamentals
of Sporting Excellence
3.1 Impact of Genetics on Physical Force and Strength
It is widely acknowledged that a favourable genetic profile, when combined with
an optimal training environment, is important for elite athletic performance
[GL13]. As of 2009, more than 200 genetic variants had been associated with
physical performance, with more than 20 variants associated with elite athlete
status [BM09]. Given the extremely slim margins between victory and defeat,
such a profile is undoubtedly a substantial advantage to possess, ensuring a
favourable head start.
A basic physical trait like height, which is critical for success in some
sports, is highly heritable, with about 80
The tremendous success of many Kenyan athletes has brought back the
focus to the role of genetics in a sportsman’s success. Studies have shown
that African distance runners have reduced lactic acid accumulation in muscles,
increased resistance to fatigue, and increased oxidative enzyme activity, which
gives them the advantage of high levels of aerobic energy production [WAik].
Larsen et al. (2015) studied the anthropometric characteristics of elite Kenyan
distance runners and reported that they had longer legs (5
The cyclist Miguel “Big Mig” Induráin, who won five Tours de France from 1991
to 1995 and the Giro d’Italia twice, was known to have a remarkably large lung
capacity and an exceptional heart that allowed his blood to transport 7 liters
of oxygen around his body per minute, compared to the 3 to 4 liters
transported in an average individual.
Basketball greats have taken the use of reach to a completely different level.
Michael Jordan’s frame of 6 feet 6 inches was bestowed with a wingspan of 6
feet 11 inches. Dwight Howard’s wingspan of 7 feet 5 inches on a frame of 6
feet 11 inches made him formidable. At 7 feet 1 inch tall and 325 pounds, with
size 22 feet, Shaquille O’Neal used his overpowering physical assets to
dominate the court, and so does LeBron James, who stands 6 feet 8 inches tall
and weighs 250 pounds. His massive legs allow him to generate 700 pounds of
pressure per leap, making him faster than most point guards [Hay17].
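
As a rough illustration of the arm-length-relative-to-height idea discussed above, the short Python sketch below computes a wingspan-to-height ratio from the measurements quoted in this section. The function names and the choice of a simple ratio are assumptions made for illustration; they are not taken from the cited studies, which may define the measure differently.

```python
def to_inches(feet: int, inches: int) -> int:
    """Convert a feet-and-inches measurement to total inches."""
    return feet * 12 + inches


def wingspan_ratio(height_in: int, wingspan_in: int) -> float:
    """Wingspan divided by height; values above 1.0 mean arms longer than height."""
    return wingspan_in / height_in


# Heights and wingspans exactly as quoted in the text (not independently verified).
players = {
    "Michael Jordan": (to_inches(6, 6), to_inches(6, 11)),
    "Dwight Howard": (to_inches(6, 11), to_inches(7, 5)),
}

for name, (height, wingspan) in players.items():
    ratio = wingspan_ratio(height, wingspan)
    extra = wingspan - height
    print(f"{name}: ratio {ratio:.3f}, wingspan exceeds height by {extra} in")
```

On these figures both players have a wingspan several inches greater than their height, which is the kind of relative advantage the text describes.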

3.2 Impact of Genetics on the Technical, Tactics and
Strategy Aspects of Sports
As mentioned earlier, Grosser (1982) defines technique as the ideal model of a
movement relative to a specific sport activity [Roc86]; it refers to the
movements and postures adopted to maximise impact, optimize performance,
prevent injuries and ensure consistency under pressure with minimal wastage of
effort and force.
Agility, speed, force and endurance are key to perfecting technique, and it is
apparent from the above that genetics has a crucial role to play in all of
them. Athletic performance is a complex mix of both genetic and environmental
factors. Since every movement and technique is greatly affected by physical
traits, by the strength of the muscles used for movement (skeletal muscles)
and by the predominant type of fibres that compose them, genetics again
becomes a focus. These muscle fibres cannot be created artificially and are
nature’s gift. They are primarily of two types. Slow-twitch muscle fibres
contract slowly and can work for a longer duration without tiring, and hence
are an asset for any sport that needs endurance. Fast-twitch muscle fibres
contract quickly but tire rapidly; these fibres are good for sprinting and
other activities that require power or strength. Whether a trained athlete can
stick to technique in a high-pressure competitive environment also depends on
aerobic capacity, muscle mass, height, flexibility, coordination, intellectual
ability, and personality, all of which have a direct genetic connection.
Basketball and soccer are two of several combined anaerobic and aerobic
sports in which athletes need power, speed, quickness, agility, and strength
[NSC17], and studies have revealed how genetic composition can have a direct
bearing on these attributes. There is no doubt that a motivated athlete could
train harder to overcome the odds and defy the genetic advantage of an
opponent, but champions sometimes need only a fraction of an advantage to take
the lead, and that minuscule advantage might be the defining difference.
The ability to generate maximal power during complex motor skills is of
paramount importance to successful athletic performance across many sports
[Cor11] and has a direct bearing on an athlete’s ability to implement
technique to perfection. This is why power training emphasises maximal power
production in dynamic, multi-joint movements. Muscle strength is directly
related to fibre composition, which is where genetics comes in. Studies have
consistently indicated that genes play a significant role in the way an
individual’s body responds to exercise and strength training, which has a
direct bearing on whether an athlete can execute a given technique to
perfection. A recent study found that up to 72
Technique, tactics, and strategy are perfected through training and other
factors like diet and nutrition. Research on aerobic endurance shows that some
people respond more to training than others. Genetically gifted athletes are
likely to respond better to training than equally motivated but less
genetically blessed athletes, and their cells are likely to have an increased
number of mitochondria, which produce adenosine triphosphate (ATP), the
molecule used to store and supply energy at the cellular level [MJ19a].
Tactics and strategy are another key pillar of sporting success. A powerful
athlete can implement a strategy with brute force and impeccable precision and
is also capable of destroying that of an opponent, however well prepared. Any
race or match is won because the winner has the capacity to outperform the
tactics and strategy of his or her opponents. All athletes at the highest
level arrive with the best training and motivation, as one cannot compete at
that level without them. In the face of this intense competition, studies of
similarities and differences in athletic performance within families,
including between twins, suggest that genetic factors underlie 30 to 80
percent of the differences among individuals in traits related to athletic
performance [AI15,AI16,WN15,YX16], a proportion that cannot be ignored.

3.3 Impact of Genes On Psychological Power


An athlete’s success or failure is not unidimensional; rather, it is
multifactorial, a combination of physical, tactical, technical, and
psychological factors. The psychological factor is usually the determinant
that differentiates a winner and a loser in sports [Bre09a]. Studies by
Weinberg and Gould (2003) [Wep03] indicated that mental ability contributed
over 50
Clough et al. described mentally tough individuals “as tending to be sociable
and outgoing as they can remain calm and relaxed, they are competitive in many
situations and have lower anxiety levels than others. With a high sense of
self-belief and an unshakable faith that they control their own destiny, these
individuals can remain relatively unaffected by competition or adversity”
[CP12a,CP02,CP12b]. Mental health, like physical health, is a complex
interplay of genetics, epigenetics, and behaviour. Scientists estimate that 20
to 60 percent of our temperament is determined by genetics and its complex
variations (polymorphisms). For example, variants in the DRD2 and DRD4 genes
have been linked to a desire to seek out new experiences, and KATNAL2 gene
variants are associated with self-discipline and carefulness [BD17,PR15].
Genes such as SLC6A4, AGBL2, BAIAP2, CELF4, L3MBTL2, LINGO2, XKR6, ZC3H7B,
OLFM4, MEF2C, and TMEM161B are known to contribute to anxiousness or
depression. Researchers also point to the genetic variation called ADRA2b,
which influences the neurotransmitter norepinephrine and is linked to intense
emotional responses and sensitivity. Given the significance of psychological
factors in sports, one cannot ignore this crucial genetic factor.

4 Deliberate Practice and Expertise Acquisition
There is a counter-belief, encouraged by the increasing application of science
to the training of elite athletes, that ace athletes can be nurtured even if
they do not have special natural advantages. It challenges the relatively
widespread belief that innately talented individuals can easily and rapidly
achieve an exceptional level of performance once they have acquired basic
skills and knowledge.
Schulz and Curnow (1988) drew attention to the fact that the performance
of players over the Olympic history timeline had only improved in some cases
by more than 50
Drawing a distinction between regular performers and champions, these studies
show that with more practice and experience, salient mistakes become
increasingly rare, and “everyone’s performance eventually reaches an
acceptable standard where the need for effortful concentration is minimised”.
If the individual persists and learns to adapt to situational demands, a stage
may come when the tasks become increasingly automated and the individual stops
making intentional adjustments. This is where expert performers differ: they
do not let the learning curve flatten. “Expert performance continues to
improve as a function of more deliberate practice” [Eri03a]. “The challenge
for aspiring expert performers is to avoid the arrested development associated
with automaticity and to acquire new cognitive skills through their continued
learning and improvement”. For those who persistently practise and harness
their unique talents, “this modification of complex cognitive mechanisms
demands problem-solving skills and undivided concentration” [Eri02]. The key
challenge is to persist with deliberate practice and to continue to pursue
perfection in all eventualities with a focussed cognitive approach. Ericsson
believed that “As a result of deliberate practice, many biological
characteristics, such as width of bones, flexibility of joints, size of heart,
metabolic characteristics of muscle fibers, and so forth, can be changed after
years of intense and carefully designed training. Biochemical processes that
preserve equilibrium during intense training influence these anatomical
changes” [Eri03b]. Deliberate practice, he added, also helps expert performers
sharpen the “mental representations that allow the expert performer to bypass
the information-processing constraints imposed by basic capacities”. Taking
this cue, it could be inferred that the exemplary reactions and superior force
and speed exhibited by elite athletes, for example when returning an
opponent’s tennis ball, can be attributed to skilled anticipation of events
through the identification of early predictive cues, and not to superior
perceptual acuity or faster cognitive speed alone [ea08].
This theory of deliberate practice also stresses starting early, to give
potential players the years of practice needed to become an ace athlete. “In
many domains, such as music and sports, parents arrange for their children to
start practice at very young ages, sometimes as young as 3–4 years of age”.
This early start gives them a huge advantage over late starters, as they have
more time to sharpen their skill sets with deliberate practice. Studies have
shown that beginning deliberate training early in life yields more refined and
accurate adaptive responses and greater cognitive and neurological
development.
“The foundations of brain architecture are established early in life through
a continuous series of dynamic interactions between genetic influences, environ-
mental conditions, and experiences” [Fri06,MM06]. This phase has a significant
impact on the brain architecture and “each one of our perceptual, cognitive,
and emotional capabilities is built upon the scaffolding provided by early life
experiences” [GL15].
According to Benjamin Bloom, a professor of education at the University of
Chicago and author of the book “Developing Talent in Young People”, which
examined the critical factors that contribute to talent, “all brilliant
performers had practiced intensively, had studied with devoted teachers, and
had been supported enthusiastically by their families throughout their
developing years”. His study took a retrospective look at the childhoods of
120 elite performers who had won international competitions or awards in
fields ranging from music and the arts to mathematics and neurology [Blo85b].
It focussed on deliberate practice and motivated training and leaned
overwhelmingly towards the concept that experts are always made, not born.
These studies make a clear distinction between regular practice and deliberate
practice, with the latter focussed on refining the practice to cover all
shortcomings and on reactions to unpredictable variables that may stand in the
way of an ultimate victory. “Not all practice makes perfect. You need a
particular kind of practice—deliberate practice—to develop expertise. When
most people practice, they focus on the things they already know how to do.
Deliberate practice is different. It entails considerable, specific, and
sustained efforts to do something you can’t do well—or even at all. Research
across domains shows that it is only by working at what you can’t do that you
turn into the expert you want to become” [KAEC07]. Deliberate practice
involves two kinds of learning: improving the skills you already have and
extending the reach and range of your skills. The enormous concentration
required to undertake these twin tasks limits the amount of time you can spend
doing them. The famous violinist Nathan Milstein wrote: “Practice as much as
you feel you can accomplish with concentration.” The general belief of most
experts is that even the most gifted performers need a minimum of ten years
(or 10,000 hours) of intense training before they win international
competitions. This makes it difficult, and sometimes impossible, for late
starters to catch up with competitors who started earlier and maintained
maximal levels of deliberate practice, as rushing through the same volume of
deliberate practice can lead to exhaustion and injuries. Ace golfer Tiger
Woods, who started deliberate practice very young, is an example of this
approach. Tennis great Roger Federer himself confessed that he did not see
himself as a genius but worked hard at it.
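
To make the “ten years (or 10,000 hours)” figure concrete, the short Python sketch below converts it into a daily practice load and shows how that load grows when the same volume is compressed into fewer years. The function name, the assumption of 365 practice days per year and the five-year comparison are illustrative choices, not figures from the cited studies.

```python
TOTAL_HOURS = 10_000  # total deliberate practice quoted in the text
YEARS = 10            # minimum duration quoted in the text


def hours_per_day(total_hours: float, years: float, days_per_year: int = 365) -> float:
    """Average daily practice needed to reach total_hours within the given years."""
    return total_hours / (years * days_per_year)


# Early starter spreading the work over the full ten years.
early = hours_per_day(TOTAL_HOURS, YEARS)

# Hypothetical late starter trying to fit the same volume into five years.
late = hours_per_day(TOTAL_HOURS, 5)

print(f"Ten-year schedule:  {early:.2f} hours per day")
print(f"Five-year schedule: {late:.2f} hours per day")
```

Under these assumptions the ten-year schedule works out to under three hours a day, while halving the time doubles the daily load, which illustrates why rushing the same volume of practice risks exhaustion and injury.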

5 The Interplay of Genetics and Training - Epigenetics Context
Epigenetics explains how a gene’s expression can be turned on and off by
external environmental influences, giving athletes an opportunity to work on
the genes that favour their performance. These external influences include
lifestyle choices, diet and nutrition, environmental pollution, stress and
anxiety, and the quality and quantity of sleep. Of particular interest is the
role epigenetics plays in activating key SNPs that could eventually impact the
various physical and mental parameters that are crucial for an athlete to
perform optimally.
A fine example of this is the increasing emphasis on nutrigenomics research,
which aims to provide each athlete with a personalised nutrition plan based on
their genetic makeup to ensure favourable epigenetic changes. This is key, as
athletic performance depends a great deal on the athlete’s nutrition. As
different bodies and genetic makeups respond differently to the same diet
regimen, it is essential that athletes be given the nutrition that works well
for their body type. The importance of a personalized sports nutrition plan
was highlighted by the American College of Sports Medicine, which stated that
“Nutrition plans need to be personalized to the individual athlete. . . and
take into account specificity and uniqueness of responses to various
strategies” [TD16]. These strategies encompass overall dietary patterns,
macronutrient ratios, micronutrient requirements, eating behaviours (e.g.,
nutrient timing), and the judicious use of supplements and ergogenic aids.
Given the high stakes of building champions for the world stage, these studies
also examine the genetic variants that have a direct bearing on the way
athletes absorb, metabolize, utilize, and excrete nutrients [ND14]. It has
been found that individuals given actionable gene-based dietary advice are
more likely to change their health behaviors, including their dietary choices
and intakes [HJ18], which is a welcome change. The positive outcomes in terms
of muscle power, endurance, agility, speed and strength have made this field
extremely popular, and an increasing number of athletes depend on individually
tailored dietary and other performance-related information based on their DNA
to stay competitive.

A fine example of this approach is the use of caffeine in relation to the
CYP1A2 rs762551 SNP: individuals with the AA genotype (fast metabolizers) tend
to show a positive or “improved” performance response to caffeine. By contrast,
individuals with the CYP1A2 AC or CC genotype may show either no effect or an
impaired response to caffeine intake [GN18]. Another example is the use of diet
and supplements to bring athletes’ haemoglobin to an optimal level. Low
haemoglobin production decreases the oxygen-carrying capacity of the blood,
leading to a lack of oxygen in working muscles and resulting in impaired
muscle contraction and aerobic endurance [HJ01].
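The genotype-to-response relationship described above can be written down as a
simple lookup. The following Python sketch is illustrative only: the enum and
function names are hypothetical, the mapping merely restates the association
reported in [GN18] as summarised in this section, and it is not intended as a
clinical or coaching tool.

```python
from enum import Enum


class CaffeineResponse(Enum):
    """Expected direction of the performance response to caffeine."""
    IMPROVED = "improved performance expected"
    NONE_OR_IMPAIRED = "no effect or impaired performance possible"


# Assumption: this table only restates the association summarised in the text
# (AA = fast metabolizer; AC and CC = no effect or impaired response).
_RESPONSE_BY_GENOTYPE = {
    "AA": CaffeineResponse.IMPROVED,
    "AC": CaffeineResponse.NONE_OR_IMPAIRED,
    "CC": CaffeineResponse.NONE_OR_IMPAIRED,
}


def expected_caffeine_response(genotype: str) -> CaffeineResponse:
    """Return the expected caffeine response for a CYP1A2 genotype string."""
    # Normalise case, whitespace, and allele order, so "ca" is treated as "AC".
    key = "".join(sorted(genotype.strip().upper()))
    if key not in _RESPONSE_BY_GENOTYPE:
        raise ValueError(f"Unrecognised CYP1A2 genotype: {genotype!r}")
    return _RESPONSE_BY_GENOTYPE[key]


if __name__ == "__main__":
    for g in ("AA", "AC", "CC"):
        print(g, "->", expected_caffeine_response(g).value)
```

A lookup of this kind captures only the direction of the reported association;
any real recommendation would also have to account for dose, timing, and the
athlete's individual tolerance.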
Increasing interest in epigenetics is also directing considerable research
towards individual variation in response to exercise training and sport, and
towards exercise genomics, a key factor in athletic training and performance
enhancement [VNipAAcFjs17].

6 Conclusion
Champions are a class apart and have always been a subject of fascination.
It comes as no surprise that their ‘making’ has intrigued many, leading to some
interesting debates on whether they are a product of nature or can be nurtured.
This debate is as old as sport itself. Its theoretical context can be traced
back to Hippocrates (460–370 BC), the father of medicine, who in his Book 1
(Dietetics) addressed the relative contributions of nature and nurture,
highlighting the importance of health and the role of a diet and exercise
“regimen” in maintaining it: “Eating alone will not keep a man well; he must
also take exercise. For food and exercise, while possessing opposite
properties, yet contribute mutually to maintain health. For it is the nature of
exercise to use up material, while of food and drink to restore them. And it is
necessary, as it appears, to determine exactly the powers of various exercises,
both natural and artificial exercises, and which of them contribute to the
development of muscle”. However, even though he seems to refer to ‘nurture’ as
a requisite for good health, in the same book he also talks about heritability,
which immediately takes us to an individual’s “genetic predisposition”.
Centuries later, Galton became the first academic to advocate a hereditary
ceiling to physical and mental capacities [F69, FS92] and formally objected to
“pretensions of natural equality” in his landmark paper “The history of twins
as a criterion of the relative powers of nature and nurture”.
Ericsson et al. questioned this notion of inherited talent and exceptional
abilities with their theoretical framework of “deliberate practice”, an
alternative route to expert performance that limits the role of innate or
inherited characteristics [EK93]. Elite performance, they claimed, is the
“product of a decade or more of maximal efforts to improve performance in
a domain through an optimal distribution of deliberate practice”, thus
rejecting the Galtonian model of innate ability in the making of champions.
Ericsson proposed a very structured approach, claiming that a specific volume
of 10,000 h of training, accumulated over a period of approximately 10 years,
is necessary for achieving expert levels [KA.06]. His theory caught the
imagination of a wider audience, leading to popular and motivating publications
such as Outliers [M08] and The Genius in All of Us [D.10]. These books fuelled
a billion-dollar industry of nutrition and guided fitness.
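To put the 10,000 h figure in perspective, the arithmetic below converts it
into an average yearly, weekly, and daily training load. This is a rough sketch
that assumes practice is spread evenly across the whole period, which the
deliberate practice framework itself does not claim; the function name and its
defaults are illustrative.

```python
def practice_load(total_hours: float = 10_000, years: float = 10):
    """Average training load implied by a total volume spread evenly over time.

    Returns a tuple of (hours per year, hours per week, hours per day).
    Assumption: 52 weeks and 365 days per year, with no rest periods.
    """
    per_year = total_hours / years
    per_week = per_year / 52
    per_day = per_year / 365
    return per_year, per_week, per_day


if __name__ == "__main__":
    per_year, per_week, per_day = practice_load()
    print(f"{per_year:.0f} h/year, {per_week:.1f} h/week, {per_day:.1f} h/day")
```

Under these assumptions the load works out to 1,000 h per year, or a little
over 19 h per week, sustained for a full decade.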
However, all these theories were called into question by the ground realities
of emerging champions, each more gifted than the last. Despite the widespread
appeal and popularity of Ericsson’s framework of deliberate practice, more
careful analysis of champions did not show an overwhelming impact of
deliberate practice time. Studies show that only 28
Considering the number of body systems that must interact (musculoskele-
tal, cardiovascular, respiratory, nervous, etc.), athletic performance is one of
the most complex human traits. Perhaps the first noticeable difference between
athletes of different specialties is in body morphology (i.e., height and body
composition), with specific body types naturally suited to specific sports.
Beyond body morphology, endurance, strength, and power are primary factors
underlying athletic performance.
It is also important to recognise that strategy and tactics are not a paper or
board exercise; they must be implemented in a sporting arena that has its fair
share of real-time dynamics. Athletes must be at peak competency in their
mental, cardiovascular, respiratory, neuromuscular, metabolic, hormonal, and
thermoregulatory systems, each of which has an undeniable genetic influence.
While deliberate practice and the role of the environment and a support system
are indeed critically important in the development of elite athletic abilities,
dismissing altogether the innate abilities that result from genetic composition
may not be correct. In fact, heritability studies on physical performance and
functional adaptability have provided strong evidence of a significant genetic
component in various parameters that ultimately determine elite performance.
Heritability estimates linked to sporting performance, such
as 99
Ironically, these counter-findings came mostly at a time when CRISPR gene
editing began to make headlines, leading to heightened interest in gene doping
and genetic editing to make champions who would be no different from the
natural ones, and might even surpass them with an inserted genetic advantage.
This turn of events has lent yet another scientific dimension to the nature vs
nurture debate, one that is worrying sporting bodies because these genetic
interventions are difficult to trace [dA08]. The concerns over genetic
modification or “gene doping” for enhanced performance arise from impressive
studies in genetically modified rodents, where manipulation of individual genes
has increased muscle mass, muscle strength, or running endurance, depending
on the gene that was manipulated. Reviews of these animal studies conclude
that such genetic manipulations could also improve human athletic performance
[HL04, SA06, BA07], inadvertently tilting the debate in favour of the
advantages offered by nature. The irony is that nature’s advantage could soon
be nurtured.
Evidence from sporting champions clearly shows that both nature and nurture
are critical to their success. However, the studies do show the slight
advantage that heritability offers, one that could make all the difference at
the highest sporting levels. It would be safe to conclude that champions can be
built, but only from among those who are favoured by nature, bringing us back
to the prophetic statement of Galton: “there is nothing in what I am about to
say that shall underrate the sterling value of nurture, including all kinds of
sanitary improvements; nay, I wish to claim them as powerful auxiliaries to my
cause; nevertheless, I look upon race as far more important than nurture.”
[100,101] He clearly implied that deliberate practice and environmental factors
are both critical to sporting excellence, but that they do not in themselves
produce elite athletes. The future debate may not be as simple, as the
scientific community looks to create nature in laboratories.

References
[AI15] Ahmetov II, Fedotovskaya ON. Current progress in sports genomics.
Adv Clin Chem. Review. PubMed: 26231489, 2015.

[AI16] Ahmetov II, Egorova ES, Gabdrakhmanova LJ, Fedotovskaya ON. Genes and
athletic performance: An update. Med Sport Sci. 2016;61:41-54.
doi: 10.1159/000445240. Epub 2016 Jun 10. Review. PubMed: 27287076, 2016.

[AR99] Weston AR. African runners exhibit greater fatigue resistance, lower
lactate accumulation, and higher oxidative enzyme activity. J Appl Physiol,
1999.

[BA07] Baoutina A, Alexander IE, Rasko JE, Emslie KR. Potential use of gene
transfer in athletic performance enhancement. Mol Ther 2007;15:1751-66, 2007.

[Bah18] Joel Bahr. Study shows wingspan has a correlation to athletic prowess
in the NBA, MMA. Berkeley Research, 2018.

[BD17] Bratko D, Butković A, Vukasović T. Heritability of personality.
Psychological Topics, 26 (2017), 1, 1-24, 2017.

[BJ05] Baker J, Côte J, Deakin J. Expertise in ultra-endurance triathletes
early sport involvement, training structure, and the theory of deliberate
practice. J Appl Sport Psychol. 2005;17:64–78.
doi: 10.1080/10413200590907577, 2005.

[Blo85a] B. S. Bloom. Generalizations about talent development. In B. S. Bloom
(Ed.), Developing talent in young people (pp. 507–549). New York: Ballantine
Books, 1985.

[Blo85b] Benjamin Bloom. Developing talent in young people. New York:
Ballantine Books, 1985.
[BM09] Bray MS, Hagberg JM, Pérusse L, et al. The human gene map for
performance and health-related fitness phenotypes: the 2006–2007 update.
Med Sci Sports Exerc;41(1):35–73, 2009.
[Bre09a] B.W. Brewer. Handbook of sports medicine and science, sport
psychology. Chichester: John Wiley & Sons Ltd, 2009.

[Bre09b] B.W. Brewer. Handbook of sports medicine and science, sport
psychology. Chichester: John Wiley & Sons Ltd, 2009.

[Bry99] Bryan, W. L., Harter N. Studies on the telegraphic language: The
acquisition of a hierarchy of habits. Psychological Review, 6, 345-375, 1899.

[CA12] Costa AM, Breitenfeld L, Silva AJ, et al. Genetic inheritance effects
on endurance and muscle strength: an update. Sports Med. 2012 Jun
1;42(6):449–58, 2012.
[CKT19] Czarnik-Kwaśniak, J., Kwaśniak, K., and Tabarkiewicz, J. How genetic
predispositions may have impact on injury and success in sport. Eur J Clin Exp
Med. 16, 366–375. doi: 10.15584/ejcem.2018.4.16, 2019.
[Con09] Connaughton, D., Hanton S. Mental toughness in sport: Conceptual and
practical issues. In S. Mellalieu and S. Hanton (Eds.), Advances in applied
sport psychology: A review (pp. 317–346). London: Routledge, 2009.
[Con10] Connaughton, D., Hanton S., Jones G. The development and maintenance
of mental toughness in the world’s best performers. The Sport Psychologist,
24(2), 168–193. https://doi.org/10.1123/tsp.24.2.168, 2010.
[Cor11] Cormie, P., McGuigan M.R., Newton R.U. Developing maximal
neuromuscular power. Sports Med 41, 125–146 (2011), 2011.
[Coy07] E. F. Coyle. Physiological regulation of marathon performance.
Sports Medicine, 37(4-5), 306-311, 2007.
[CP02] Clough P., Earle K., Sewell D. Mental toughness: the concept and its
measurement. Solutions in Sport Psychology, ed. Cockerill I. M. (Boston, MA:
Cengage Learning) 32–43, 2002.
[CP12a] Clough P., Earle K., Perry J. L., Crust L. Comment on “progressing
measurement in mental toughness: a case example of the Mental Toughness
Questionnaire 48” by Gucciardi, Hanton, and Mallett. Sport Exerc. Perform.
Psychol. 1 283–287. 10.1037/a0029771, 2012.
[CP12b] Clough P., Strycharczyk D. Developing mental toughness: Improving
performance, wellbeing and positive behaviour in others. London: Kogan Page
Publishers, 2012.
[Cru11] Lee Crust. Mental toughness in sport. International Journal
of Sport and Exercise Psychology Volume 5, Issue 3, 2011.
[D.10] Shenk D. The genius in all of us: new insights into genetics, talent,
and IQ. New York: Knopf Doubleday Publishing Group; 2010, 2010.
[dA08] World Anti-Doping Agency. WADA gene doping symposium calls for greater
awareness, strengthened action against potential gene transfer misuse in
sport. WADA, 2008.

[Del13] Delahunt, E., Byrne R., Doolin R., McInerney R., Ruddock C., Green B.
Anthropometric profile and body composition of Irish adolescent rugby union
players. Journal of Strength and Conditioning Research, 27, 3252–3258, 2013.

[DL04] Duffy L, Baluch B. Dart performance as a function of facets of practice
amongst professional and amateur men and women players. Int J Sport Psychol.
2004;35:232–245, 2004.
[DLCS76] D. L. Costill, J. Daniels, W. Evans, W. Fink, G. Krahenbuhl, and
B. Saltin. Skeletal muscle enzymes and fiber composition in male and female
track athletes. Journal of Applied Physiology, vol. 40, no. 2, 1976.

[DMM07] De Moor MHM, Spector TD, Cherkas LF, et al. Genome-wide linkage scan
for athlete status in 700 British female DZ twin pairs. Twin Res Hum Genet.
2007 Dec;10(6):812–20, 2007.
[ea08] Overney et al. Enhanced temporal but not attentional pro-
cessing in expert tennis players. PLoS One, 2008; 3 (6): e2380
DOI: 10.1371/journal.pone.0002380, 2008.

[EK93] Ericsson KA, Krampe RT, Tesch-Römer C. The role of deliberate practice
in the acquisition of expert performance. Psychol Rev. 1993;100(3):363–406.
doi: 10.1037/0033-295X.100.3.363, 1993.

[EN16] Eynon N, Ruiz JR, Femia P, et al. The ACTN3 R577X polymorphism across
three groups of elite male European athletes. In: Garatachea N, editor.
PLoS ONE. 8. Vol. 7. 2012. Aug 16, p. e43132, 2016.

[] r96]XA85 Ericsson, K. A., Lehmann A. C. Expert and exceptional performance:
Evidence on maximal adaptations on task constraints. Annual Review of
Psychology, 47, 273–305, 1996.

[Eri93a] Ericsson, K. A., Krampe R. Th., Tesch-Romer C. The role of deliberate
practice in the acquisition of expert performance. Psychological Review, 100,
363–406, 1993.

[Eri93b] Ericsson, K. A., Krampe R. Th., Heizmann S. Can we create gifted
people? In The origin and development of high ability: Ciba Foundation
Symposium 178. Chichester, England: Wiley, 1993.

[Eri96] K. A. Ericsson. The acquisition of expert performance: An introduction
to some of the issues. In K. A. Ericsson (Ed.), The road to excellence: The
acquisition of expert performance in the arts and sciences, sports, and games
(pp. 1–50). Mahwah, NJ: Erlbaum, 1996.

[Eri98] K. A. Ericsson. Basic capacities can be modified or circumvented by
deliberate practice: A rejection of talent accounts of expert performance.
A commentary on M. J. A. Howe, J. W. Davidson, and J. A. Sloboda “Innate
Talents: Reality or Myth?” Behavioral and Brain Sciences, 21, 413–414, 1998.
[Eri99] K. A. Ericsson. The scientific study of expert levels of perfor-
mance: General implications for optimal learning and creativ-
ity, creative expertise as superior reproducible performance:
Innovative and flexible aspects of expert performance. Psy-
chological Inquiry, 10, 329–333., 1999.
[Eri02] K. A. Ericsson. Attaining excellence through deliberate prac-
tice: Insights from the study of expert performance. In M. Fer-
rari (Ed.), The pursuit of excellence in education (pp. 21–55).
Hillsdale, NJ: Erlbaum., 2002.

[Eri03a] K. A. Ericsson. The development of elite performance and deliberate
practice: An update from the perspective of the expert-performance approach.
In J. Starkes and K. A. Ericsson (Eds.), Expert performance in sport: Recent
advances in research on sport expertise (pp. 49–81). Champaign, IL: Human
Kinetics, 2003.
[Eri03b] K. A. Ericsson. The search for general abilities and basic
capacities: Theoretical implications from the modifiability and complexity of
mechanisms mediating expert performance. In R. J. Sternberg and
E. L. Grigorenko (Eds.), Perspectives on the psychology of abilities,
competencies, and expertise (pp. 93–125). Cambridge, England: Cambridge
University Press, 2003.

[Eyn13] Eynon, N., Hanson E. D., Lucia A., Houweling P. J., Garton F.,
North K. N., et al. Genes for elite power and sprint performance: ACTN3 leads
the way. Sports Med. 43, 803–817. doi: 10.1007/s40279-013-0059-4, 2013.
[F69] Galton F. Hereditary genius: an inquiry into its laws and consequences.
London and New York: Macmillan and Co; 1869, 1869.
[Fal04] Falk, B., Lidor R., Lander Y., Lang B. Talent identification and early
development of elite water-polo players: a 2-year follow-up study. Journal of
Sport Sciences, 22, 347–355, 2004.

[FEMNes] Frank E. Marino, Mike I. Lambert, and Timothy D. Noakes. Superior
performance of African runners in warm humid but not in cool environmental
conditions. Journal of Applied Physiology, Vol. 96, No. 1.

[Fri06] A.D. Friederici. The neural basis of language development and its
impairment. Neuron, 52 (2006), pp. 941-952, 2006.
[FS92] Galton FS. Hereditary genius: an inquiry into its laws and
consequences. New York: MacMillan Co; 1892, 1892.
[Fyl09] Kevin Fylan. Federer redefines notion of greatness. Reuters Sports
News, 2009.

[GL13] Guth LM, Roth SM. Genetic influence on athletic performance. Curr Opin
Pediatr. PubMed, 2013.

[GL15] Gerry Leisman, Raed Mualem, Safa Khayat Mughrabi. The neurological
development of the child with the educational enrichment in mind. Psicología
Educativa, 2015.

[GN18] Guest N, Corey P, Vescovi J, El-Sohemy A. Caffeine, CYP1A2 genotype,
and endurance performance in athletes. Med Sci Sports Exerc. (2018) 50:1570–8.
10.1249/MSS.0000000000001596, 2018.

[Gou02] Gould, D., Dieffenbach K., Moffett A. Psychological characteristics
and their development in Olympic champions. Journal of Applied Sport
Psychology, 14(3), 172–204, 2002.

[GR13] L. M. Guth and S. M. Roth. Genetic influence on athletic performance.
Curr. Opin. Pediatr. 25, 653–658. doi: 10.1097/MOP.0b013e3283659087, 2013.

[GT98] Gibbons T, Hill R, McConnell A, et al. The path to excellence: Initial
survey. A comprehensive view of development of U.S. Olympians who competed
from 1984–1998, 1984-1998.

[Had08] Adam Hadhazy. What makes Michael Phelps so good? Do Phelps’s body
shape and flexibility give the eight-gold medal winner a physical edge in
swimming? Scientific American, 2008.
[Hay17] Jessa Hay. A gene ahead of the game: A look at sports genetics.
Eastern Kentucky University Encompass, 2017.

[HJ01] Haas JD, Brownlie TIV. Iron deficiency and reduced work capacity:
a critical review of the research to determine a causal relationship. J Nutr.
(2001) 131:676S–688S; discussion 688S–690S. 10.1093/jn/131.2.676S, 2001.

[HJ18] Horne J, Madill J, O’Connor C, Shelley J, Gilliland J. A systematic
review of genetic testing and lifestyle behaviour change: are we using
high-quality genetic interventions and considering behaviour change theory?
Lifestyle Genom. (2018) 11:49–63. 10.1159/000488086, 2018.

[HL04] Sweeney HL. Gene doping. Sci Am 2004;291:62-9, 2004.

[Ing19] Sean Ingle. Caster Semenya accuses IAAF of using her as a ‘guinea pig
experiment’. The Guardian, 2019.

[JE.70] Carter JE. The somatotypes of athletes–a review. Hum Biol. 1970
Dec;42(4):535–69, 1970.

[KA.06] Ericsson KA. The Cambridge handbook of expertise and expert
performance. New York: Cambridge University Press; 2006, 2006.

[KAEC07] K. Anders Ericsson, Michael J. Prietula, and Edward T. Cokely. The
making of an expert. Harvard Business Review, 2007.

[Kel58] F. S. Keller. The phantom plateau. Journal of the Experimental
Analysis of Behavior, 1, 1-13, 1958.
[KP73] Komi PV, Klissouras V, Karvinen E. Genetic variation in neuromuscular
performance. Eur J Appl Physiol. 1973;31(4):289–304.
doi: 10.1007/BF00693714, 1973.

[KP77] Komi PV, Viitasalo JH, Havu M, Thorstensson A, Sjödin B, Karlsson J.
Skeletal muscle fibres and muscle enzyme activities in monozygous and dizygous
twins of both sexes. Acta Physiol Scand. 1977;100(4):385–392, 1977.
[Lar03] Henrik Larsen. Kenyan dominance in distance running. Comparative
Biochemistry and Physiology Part A: Molecular & Integrative Physiology. 136.
161-70. 10.1016/S1095-6433(03)00227-7, 2003.

[LP13] G. Löffler and P. Petrides. Biochemie und Pathobiochemie. Springer,
2013.
[M08] Gladwell M. Outliers: the story of success. USA: Little, Brown and
Company; 2008, 2008.

[McP11] McPhee, J. S., Perez-Schindler J., Degens H., Tomlinson D., Hennis P.,
Baar K., et al. HIF1A P582S gene association with endurance training responses
in young women. Eur. J. Appl. Physiol. 111, 2339–2347.
doi: 10.1007/s00421-011-1869-4, 2011.

[MD94] Morgan DW, Daniels JT. Relationship between VO2max and the aerobic
demand of running in elite distance runners. International Journal of Sports
Medicine, 1994.

[ME19] McGill E., Montel I. NASM’s essentials of sports performance training
(2nd ed.). Burlington, MA: Jones & Bartlett Publishing, 2019.

[Miz23] Yasuhiro Mizuki. Extreme medialized repair for challenging large and
massive rotator cuff tears reveals healing and significant functional
improvement. Arthroscopy - Journal of Arthroscopic and Related Surgery, 2023.

[MJ04] Missitzi J, Geladas N, Klissouras V. Heritability in neuromuscular
coordination: implications for motor control strategies. Med Sci Sports Exerc.
2004;36(2):233–234. doi: 10.1249/01.MSS.0000113479.98631.C4, 2004.

[MJ11] Missitzi J, Gentner R, Geladas N, Politis P, Karandreas N, Classen J.
Plasticity in human motor cortex is in part genetically determined. J Physiol.
2011;589:297–306. doi: 10.1113/jphysiol.2010.200600, 2011.

[MJ13] Missitzi J, Gentner R, Misitzi A, Geladas N, Politis P, Klissouras V,
Classen J. Heritability of motor control and motor learning. Physiol Rep.
2013;1(7):1–10. doi: 10.1002/phy2.188, 2013.

[MJ17] Joyner MJ. Exercise and trainability: contexts and consequences.
J Physiol (Lond). 2017;595(11):3239-3240. doi:10.1113/JP274031, 2017.

[MJ19a] Joyner MJ. Genetic approaches for sports performance: How far away
are we? Sports Med. 2019;49(Suppl 2):199-204.
doi:10.1007/s40279-019-01164-z, 2019.

[MJ19b] Joyner MJ. Genetic approaches for sports performance: How far away
are we? Sports Med. 2019;49(Suppl 2):199-204.
doi:10.1007/s40279-019-01164-z, 2019.

[MM06] M. Majdan, C.J. Shatz. Effects of visual experience on
activity-dependent gene regulation in cortex. Nature Neuroscience, 9 (2006),
pp. 650-659, 2006.

[Mon98] Montgomery, H. E., Marshall R., Hemingway H., Myerson S., Clarkson P.,
Dollery C., et al. Human gene for physical performance. Nature 393, 221–222.
doi: 10.1038/30374, 1998.

[Mus23] Ossian Muscad. Unraveling the physics of sports: Why athletes need to
understand the concept of torque. Datamyte, 2023.

[ND14] Nielsen DE, El-Sohemy A. Disclosure of genetic information and change
in dietary intake: a randomized controlled trial. PLoS ONE (2014) 9:e112665.
10.1371/journal.pone.0112665, 2014.

[NSC17] NSCA. Sport performance and body composition by NSCA’s guide to tests
and assessments. Kinetic Select, 2017.

[PM07] Peeters MW, Thomis MA, Loos RJF, et al. Heritability of somatotype
components: a multivariate analysis. Int J Obes Relat Metab Disord, 2007.
[PM09] Pellicciari MC, Veniero D, Marzano C, Moroni F, Pirulli C, Curcio G,
Ferrara M, Miniussi C, Rossini PM, De Gennaro L. Heritability of intracortical
inhibition and facilitation. J. Neuroscience. 2009;29(28):8897–8900.
doi: 10.1523/JNEUROSCI.2112-09.2009, 2009.
[PP18] Pavol Peráček and Janka Peráčková. Tactical preparation in
sport games and motivational teaching of sport games tactics
in physical education lessons and training units. Intechopen,
2018.

[PR04] Plomin R, Happe F, Caspi A. Psychiatric genetics and genomics.
New York, NY: Oxford University Press Inc; 2004. Personality and cognitive
abilities; pp. 77–112, 2004.

[PR15] Power RA, Pluess M. Heritability estimates of the Big Five personality
traits based on common genetic variants. Translational Psychiatry (2015) 5,
e604; doi:10.1038/tp.2015.96; published online 14 July 2015.
PubMed: 26171985 PubMed Central: PMC5068715, 2015.

[Pre08] Associated Press. Phenomenal Phelps wins 7th gold by 0.01 seconds to
tie Spitz. ESPN, 2008.
[PZ11] Puthucheary Z, Skipworth JRA, Rawal J, et al. Genetic influences in
sport and physical performance. Sports Med. 2011 Oct 1;41(10):845–59, 2011.

[PZ3] Pokrywka, A., Kaliszewski P., Majorczyk E., and Zembroń-Łacny A. Genes
in sport and doping. Biol. Sport. 30, 155–161.
doi: 10.5604/20831862.1059606, 2013.

[Roc86] Martines Roca. Técnicas de entrenamiento. Semantic Scholar, 1986.

[SA06] Schneider AJ, Friedmann T. Gene doping in sports: the science and
ethics of genetically modified athletes. Adv Genet 2006;51:1-110, 2006.

[Sch88] Schulz, R., Curnow C. Peak performance and age among super-athletes:
Track and field, swimming, baseball, tennis, and golf. Journal of Gerontology:
Psychological Sciences, 43, 113-120, 1988.

[Sch04] Schuelke, M., Wagner K. R., Stolz L. E., Hübner C., Riebel T.,
Kömen W., ... Lee S. J. Myostatin mutation associated with gross muscle
hypertrophy in a child. New England Journal of Medicine, 350(26), 2682-2688,
2004.
[SEM16] What is an intersex athlete? Explaining the case of Caster Semenya.
The Guardian, 2016.
[She09] Sheppard, J., Gabbett T., Reeberg Stanganelli L-C. An analysis of
playing positions in elite men’s volleyball: Considerations for competition
demands and physiological characteristics. Journal of Strength and
Conditioning Research, 23, 1858–1866, 2009.
[SK08] Silventoinen K, Magnusson PKE, Tynelius P, et al. Heritability of body
size and muscle strength in young adulthood: a study of one million Swedish
men. Genet Epidemiol., 2008.

[SO18] Jaime Serra-Olivares. Sport pedagogy - recent approach to
technical-tactical alphabetization. IntechOpen, 2018.

[SS13] Stepien-Slodkowska, M., Ficek K., Eider J., Leonska-Duniec A.,
Maciejewska-Karlowska A., Sawczuk M., et al. The +1245G/T polymorphisms in the
collagen type I alpha 1 (COL1A1) gene in Polish skiers with anterior cruciate
ligament injury. Biol. Sport 30, 57–60. doi: 10.5604/20831862.1029823, 2013.

[TD16] Thomas DT, Erdman KA, Burke LM. American College of Sports Medicine
joint position statement. Nutrition and athletic performance. Med Sci Sports
Exerc. (2016) 48:543–68. 10.1249/MSS.0000000000000852, 2016.

[Tes85] Tesch, P.A., Wright J.E., Vogel J.A., et al. The influence of muscle
metabolic characteristics on physical performance. Europ. J. Appl. Physiol.
54, 237–243, 1985.

[The05] Thelwell, R., Weston N., Greenlees I. Defining and understanding
mental toughness within soccer. Journal of Applied Sport Psychology, 17(4),
326–332, 2005.

[TM97] Thomis MA, Van Leemputte M, Maes HH, Blimkie CJR, Claessens AL,
Marchal G, Willems E, Vlietinck RF, Beunen GP. Multivariate genetic analysis
of maximal isometric muscle force at different elbow angles. J Appl Physiol.
1997;82:959–967, 1997.

[TP85] Tesch P.A., Karlsson J. Muscle fiber types and size in trained and
untrained muscles of elite athletes. Journal of Applied Physiology, 1985.
[Uni21] Anglia Ruskin University. Genes play key role in exercise out-
comes. ScienceDaily, 2021.
[Vae08] Vaeyens, R., Lenoir M., Williams A.M., Philippaerts R. Talent
identification and development programs in sport. Sports Medicine, 38 (9),
703–714, 2008.

[VNipAAcFjs17] Vlahovich N, Hughes DC, Griffiths LR, Wang G, Pitsiladis YP,
Pigozzi F, et al. Genetic testing for exercise prescription and injury
prevention: AIS-Athlome consortium-FIMS joint statement. BMC Genomics (2017)
18(Suppl. 8):818. 10.1186/s12864-017-4185-5, 2017.

[WAik] Weston AR, Karamizrak O, Smith A, Noakes TD, Myburgh KH. African
runners exhibit greater fatigue resistance, lower lactate accumulation, and
higher oxidative enzyme activity. J Appl Physiol (1985). 1999
Mar;86(3):915-23. doi: 10.1152/jappl.1999.86.3.915. PMID: 10066705.

[Wep03] Weinberg, R.S., Gould D. Foundations of sport and exercise psychology.
Champaign: Human Kinetics, 2003.

[WN15] Webborn N, Williams A, McNamee M, Bouchard C, Pitsiladis Y, Ahmetov I,
Ashley E, Byrne N, Camporesi S, Collins M, Dijkstra P, Eynon N, Fuku N,
Garton FC, Hoppe N, Holm S, Kaye J, Klissouras V, Lucia A, Maase K, Moran C,
North KN, Pigozzi F, Wang G. Direct-to-consumer genetic testing for predicting
sports performance and talent identification: Consensus statement. Br J Sports
Med. 2015 Dec;49(23):1486-91. doi: 10.1136/bjsports-2015-095343.
PubMed: 26582191. Free full-text available from PubMed Central: PMC4680136,
2015.

[WR12] Wilber RL, Pitsiladis YP. Kenyan and Ethiopian distance runners: what
makes them so good? International Journal of Sports Physiology and
Performance, 2012.

[XA6] The mental game plan: Getting psyched for sport.

[XA7] How do your genes influence levels of emotional sensitivity? A specific
gene variation can make your brain more emotionally sensitive.

[Yam07] Yamamoto, N., Itoi E., Tuoheti Y., Seki N., Abe H., Minagawa H., ...
Okada K. Glenohumeral joint motion after medial shift of the attachment site
of the supraspinatus tendon: a cadaveric study. Journal of Shoulder and Elbow
Surgery, 16(3), 373-378, 2007.

[YX16] Yan X, Papadimitriou I, Lidor R, Eynon N. Nature versus nurture in
determining athletic ability. Med Sport Sci. 2016;61:15-28.
doi: 10.1159/000445238. Epub 2016 Jun 10. Review. PubMed: 27287074, 2016.

[Zap11] Zapartidis, I., Kororos P., Christodoulidis T., Skoufas D., Bayios I.
Profile of young handball players by playing position and determinants of ball
throwing velocity. Journal of Human Kinetics, 27, 17–30, 2011.
