Navya Nori
between the ages of 12 and 25, primarily between 16 and 17 [1]. Anorexia Nervosa is a mental
condition in which people starve themselves to achieve unhealthy weight loss. One out of every five
Anorexia deaths is by suicide, and hospitalizations for this condition increased by 119% in the last
decade [2]. Between 35% and 55% of adolescent girls engage in fasting, self-induced vomiting, or diet pill use [3].
Eating disorders as a whole have the highest mortality rate of any mental illness, and
characterizing a patient’s biological and psychological manifestations is the key to reducing this
statistic [4]. Due to the sudden onset of the Covid-19 pandemic, adolescents all over the world
have spent more time alone behind closed doors criticizing their eating habits and bodies,
especially due to the heavy influence of social media and lack of positive interactions [5]. These
habits increase eating disorder (ED) risk and severity and worsen mental health. Eating disorders can be triggered in a
patient due to a multitude of factors, including negative media exposure, social isolation, irregular
eating patterns, and more. Anorexia Nervosa, a type of eating disorder, is the third most prevalent
chronic illness among adolescents. Current anorexia prevention efforts target mental health
and emotional responses and provide a trusted individual for the patient to communicate with [6].
These options include teaching users to be mindful and reduce stress levels, offering various
therapy methods, and recognizing negative thoughts. However, these methods do not consider the
recommendation.
disorder and tracks both their psychological and biological manifestations, provides a treatment
plan tailored to their needs, and has the potential to reduce the staggering mortality rate of Anorexia
Nervosa in adolescents. There are two main components to this project: AI/ML models based on
social media interactions (computational component) and bacterial growth analysis (laboratory
component). The research question for the overall project is “Can a combination of mathematical
modeling and bacterial growth analysis be used to predict future mental illness severity for
Anorexia Nervosa?” and for the lab component is “What is the effect of carbohydrate starvation
(g/mL) on the gut microbiome (cells/mL)?” The hypothesis for the overall project was that a
mathematical model can be combined with lab testing to predict future mental illness severity
(MIS) in Anorexia Nervosa patients. The lab component hypothesis predicted that if the glucose
concentration reaches 60%, the gut bacteria population will begin to decline significantly. The
independent variable used in the laboratory component of this study was the carbohydrate
concentration (g/mL), and the dependent variable was the bacterial absorbance measured by a
spectrophotometer and later converted to cell count using the Beer-Lambert law. Controlled variables
in this experiment included the volume of each trial (6 mL), the amount of bacteria introduced (1
inoculating loop’s worth), the wavelength used to measure absorbance (600 nm), and the temperature
of the incubator (60 degrees Celsius). Two notable constants were the room temperature, and
The items used in this study fall into two categories: materials and equipment. Three
pieces of equipment were used in this experiment: an incubator, an autoclave, and a spectrophotometer.
To conduct the experiment, the materials included volumetric pipettes, 16 cuvettes, parafilm, a
cuvette rack, thioglycolate medium, Clostridium sporogenes in its liquid tube form, sterile
Computational methods
The first step was to obtain a Twitter developer account. The Python library Tweepy was used to
collect streaming tweets from Twitter over two months. Then, the tweets were sorted into cases,
which were indicative of Anorexia, or healthy control tweets. These decisions were made using
the DSM-5 clinical criteria for Anorexia Nervosa. The data was then split into training, validation,
and test sets, and stop words, punctuation, and extra characters were removed. This step ensures that the
models analyze only word choice. Term frequencies were calculated from the tweets. A
regularized regression was built and statistics were computed, and then a gradient boosted machine (GBM)
and generalized linear model (GLM) were built using the term-frequency matrices. The resulting coefficients were analyzed
using the shap package for interpretability. Finally, receiver operating characteristic (ROC) and precision-recall
charts were created using the scores from the models. The coefficients, or values of
importance for each word, were implemented into the Android application, SupportED.
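The cleaning and term-frequency steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's actual code: the stop-word list, the function names, and the sample posts are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; the study's actual list is not given.
STOP_WORDS = {"a", "an", "and", "the", "i", "to", "so", "is", "my", "of", "in", "it"}

def clean_tokens(text):
    """Lowercase the text, drop links, mentions, punctuation and digits, then remove stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # strip links and @mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # keep letters and whitespace only
    return [tok for tok in text.split() if tok not in STOP_WORDS]

def term_frequencies(texts):
    """Count how often each remaining word occurs across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(clean_tokens(text))
    return counts

# Made-up example posts, not data from the study.
sample = [
    "I want to be so skinny!!! #thinspo",
    "Eat healthy, stay optimistic :) http://example.com",
]
tf = term_frequencies(sample)
```

After this step each post is reduced to word counts, so a model fitted on them can only respond to word choice, which is the property the methods section asks for.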
Laboratory methods
The purpose of the laboratory experiment was to determine the percentage of the full carb content
at which the gut microbiome’s health is compromised. Sixteen cuvettes and a cuvette rack were
sterilized in an autoclave before use to eliminate the possibility of extraneous bacterial growth. Fifteen
sterile cuvettes were placed on the cuvette rack: five staggered concentrations of glucose
with three trials of each. They were labelled according to concentration number and trial number.
For example, concentration one had 1500 g/mL of glucose and concentration 5 had 5500 g/mL
(the maximum). 0.54 mL of thioglycolate medium was pipetted into the first cuvette using a sterile
volumetric pipette, and this action was repeated 2 more times for the different trials. Then, 1.09
mL of the medium was pipetted into each cuvette for concentration 2, and so on. The final
concentration, concentration 5, had 6 mL of the medium. Then, the complementary amount of
distilled water was pipetted into each cuvette making the volume of every cuvette 6 mL. Each
cuvette was sealed with parafilm and covered with an opaque box to prevent any light degradation.
The next day, sterile inoculating loops were used to inoculate each cuvette with Clostridium sporogenes
from a liquid tube culture. This particular species of bacteria was chosen because it is
native to the gut microbiome. The tubes were re-sealed with parafilm and not opened again in
the study, making this a BSL 1 experiment. The cuvettes were then incubated at 40 degrees Celsius
overnight. The next day, the cuvettes were removed from the incubator. The spectrophotometer
was calibrated, and the wavelength was set to 600 nm in the absorbance setting. Blanks were
created for each concentration, mirroring the process from day one, and run before each
experimental cuvette. Creating five blanks accounted for the color gradient caused by the dilutions and
established that changes in absorbance value were due to bacterial growth only and not any other
factors. Each absorbance value was recorded in a data table and the cuvettes were placed back in
the incubator. Over the next two days, the same data collection process was repeated. Finally, the
bacteria and growth medium were disposed of properly and the facility was cleaned thoroughly.
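The blank correction and the absorbance-to-cell-count conversion can be sketched as below. The sketch assumes that cell density is proportional to blank-corrected absorbance, which is the Beer-Lambert relationship and generally holds only at low absorbance values. The calibration factor, function names, and readings are hypothetical: the source does not report a conversion factor, and one would have to be measured for the organism and instrument.

```python
from statistics import mean

# Hypothetical calibration factor (cells/mL per absorbance unit at 600 nm).
# Not reported in the source; shown only to make the arithmetic concrete.
CELLS_PER_ML_PER_ABS = 5.0e8

def blank_corrected(sample_abs, blank_abs):
    """Subtract the matching blank so only turbidity from growth remains."""
    return max(sample_abs - blank_abs, 0.0)

def estimated_cells_per_ml(sample_abs, blank_abs, factor=CELLS_PER_ML_PER_ABS):
    """Assume cell density is proportional to blank-corrected absorbance."""
    return blank_corrected(sample_abs, blank_abs) * factor

# Made-up readings for three trials of one concentration, plus its blank.
trials = [0.42, 0.45, 0.40]
blank = 0.05
mean_density = mean(estimated_cells_per_ml(a, blank) for a in trials)
```

Using one blank per concentration, as the procedure describes, keeps the color of the diluted medium from being counted as growth.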
The final stage of this study was the app creation, or the bridge between the computational and lab
components. The user opens the app and completes the initial setup process, and then the emotional
monitoring begins. The models score Instagram post captions for words that are indicative of
Anorexia. If the post is highly indicative, the user is sent a survey asking what they consumed in
the past 24 hours, and the carbohydrate total is divided by the recommended number of carbs they should have per
day. The app compares this value to the threshold calculated by the lab experiment, which was
recommendation and sends that as well as an alert to the supportER, or linked account interface
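The comparison step can be sketched as follows. This is an illustrative outline rather than the app's implementation: the function names and the example intake values are assumptions, and the threshold is the approximate 50.13% figure reported in this study's conclusion. What the app does after the comparison is only partly legible here, so the sketch simply returns a flag.

```python
# Threshold reported in the conclusion of this study (percent of recommended carbohydrates).
LAB_THRESHOLD_PERCENT = 50.13

def carb_intake_percent(consumed_grams, recommended_grams):
    """Return the day's carbohydrate intake as a percentage of the recommended amount."""
    if recommended_grams <= 0:
        raise ValueError("recommended_grams must be positive")
    return 100.0 * consumed_grams / recommended_grams

def below_lab_threshold(consumed_grams, recommended_grams, threshold=LAB_THRESHOLD_PERCENT):
    """True when intake falls under the lab-derived threshold."""
    return carb_intake_percent(consumed_grams, recommended_grams) < threshold

# Hypothetical survey total and recommended value, for illustration only.
flag = below_lab_threshold(consumed_grams=90, recommended_grams=250)
```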
A logo was created using paint software. In Android Studio, 8 screens and wireframes were
created and designed using XML layout editing. To store data, two options were evaluated: the
Firebase Realtime Database and the Cloud Datastore. Each user interface screen is an Android activity,
and the MainActivity is the starting point of the application. For user authentication, Google Sign-In
was integrated, and event listeners were added to the screens for UI elements such as buttons and
dropdowns. Finally, the app was connected to the Google Firebase backend and a cohesive color
Results
The glucose concentrations for the lab were numbered as concentration 1-5, where 1 is the most
diluted and 5 is the undiluted broth. A t-test was used to compare each of the concentrations (1-4)
against concentration 5. On day 1, the t value was the highest because the least growth
had occurred, demonstrating the larger difference between concentrations 1-4 and 5. The threshold
calculated by this experiment is the percentage of the undiluted medium at which the bacterial
population begins to decline significantly. On day 2, the t value decreased to a greater extent
and similar changes occurred in the p value and confidence interval. Between concentration 2
and 3, the p value increased to above 0.05, so the difference from concentration 5 was no longer significant. Over the same
interval, the confidence interval began to include 0, so the null hypothesis could not be rejected from
concentration 3 onward because no significant decline in bacterial growth occurs. Thus, the threshold lies
This table shows the t values, p values, and confidence intervals by day (1-3) and by
concentration (1-5), where the threshold occurs between concentration 2 and 3 at the red line.
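The comparison of one concentration against concentration 5 can be sketched as a two-sample t statistic. The source does not say which form of the t-test was used, so Welch's unequal-variance version is an assumption here, and the absorbance readings are made up for illustration. The sketch computes only the t statistic and degrees of freedom; a p value additionally needs the t distribution, for example through scipy.stats.ttest_ind with equal_var=False.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom for two samples."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Made-up absorbance readings (three trials each), not the study's data.
concentration_1 = [0.10, 0.12, 0.11]
concentration_5 = [0.40, 0.43, 0.41]
t_stat, dof = welch_t(concentration_1, concentration_5)
```

A large absolute t value, as in this made-up case, corresponds to a clear difference from the undiluted medium; as the diluted concentration approaches concentration 5 the statistic shrinks toward zero.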
Computation results: the GLM and GBM models produced coefficients, or values of importance,
Word         Importance
Skinny        2.02
Weight        1.94
Eat           1.66
Thinspo       1.54
Stop          1.33
Look          0.9
Lose          0.46
Healthy      -0.75
Optimistic   -1.27
Living       -2.09
Amazing      -2.57
Kindness     -3.22
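As an illustration of how word coefficients like these could be turned into a score for a caption, the sketch below sums the coefficients of the words that appear and passes the total through the logistic function. The coefficients are copied from the table above. The zero intercept, the logistic link, and the function name are assumptions, since the source does not describe how the app combines the values.

```python
import math
import re

# Coefficients copied from the table above.
COEFFICIENTS = {
    "skinny": 2.02, "weight": 1.94, "eat": 1.66, "thinspo": 1.54, "stop": 1.33,
    "look": 0.9, "lose": 0.46, "healthy": -0.75, "optimistic": -1.27,
    "living": -2.09, "amazing": -2.57, "kindness": -3.22,
}
INTERCEPT = 0.0  # assumption: the fitted intercept is not reported in the source

def caption_score(caption):
    """Sum the coefficients of known words and squash the total with the logistic function."""
    words = re.findall(r"[a-z]+", caption.lower())
    linear = INTERCEPT + sum(COEFFICIENTS.get(w, 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-linear))

# Made-up captions for illustration.
high = caption_score("Need to lose weight and be skinny")
low = caption_score("Amazing day, living with kindness")
```

Words outside the table contribute nothing, so a caption with no listed words scores exactly 0.5 under these assumptions.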
The word most indicative of Anorexia is “skinny”, with a coefficient of 2.02. Negative
coefficients in this table indicate words associated with healthy control tweets rather than with anorexia. More neutral words are found in
Shapley values computed using the GBM model, which show the relative importance of each word, are shown below.
Variables are ranked in descending order of feature importance (y-axis). Each dot
represents an observation (row). The position along the x-axis is the SHAP value of that feature for that
observation. The color shows whether the value of that feature was high or low for that row
of the dataset.
A confusion matrix shows how well a model recognizes actual cases as cases and actual controls
as controls. The confusion matrix from the GLM model is shown below.
Most observations fall in the predicted control, actual control and predicted case, actual case
cells. This indicates that the GLM model classifies most observations correctly at a threshold of 50%.
The confusion matrix for the GBM model is shown below.
Most observations again fall in the predicted control, actual control and predicted case, actual case
cells, indicating that the GBM model also classifies most observations correctly at a threshold of 50%.
These graphs show the accuracy of the GLM and GBM models on the training data.
The sensitivity, specificity, and accuracy of the GLM model at various thresholds are shown.
At approximately the threshold of 0.67, all three measures are optimal. From 0.3 to 0.64, the
At approximately the threshold of 0.5, all three measures are optimal. At a similar threshold, the
accuracy is optimized.
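The relationship between the decision threshold and the three measures can be sketched as follows. The scores and labels are made up for illustration and the function name is hypothetical; the sketch only shows how sensitivity, specificity, and accuracy are derived from confusion-matrix counts at a chosen threshold.

```python
def metrics_at_threshold(scores, labels, threshold):
    """Compute sensitivity, specificity and accuracy; labels are 1 for case, 0 for control."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1  # case correctly flagged
        elif predicted == 1 and label == 0:
            fp += 1  # control wrongly flagged
        elif predicted == 0 and label == 0:
            tn += 1  # control correctly passed
        else:
            fn += 1  # case missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(labels) if labels else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "accuracy": accuracy}

# Made-up model scores and true labels, not results from the study.
scores = [0.92, 0.81, 0.55, 0.40, 0.35, 0.20, 0.10, 0.62]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
by_threshold = {t: metrics_at_threshold(scores, labels, t) for t in (0.3, 0.5, 0.67)}
```

Raising the threshold trades sensitivity for specificity, which is why the three curves cross near a single value in plots of this kind.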
A combination of mathematical modeling and bacterial growth analysis can be used to predict
future MIS and monitor progression of Anorexia Nervosa. The carbohydrate starvation threshold
at which the growth of gut bacteria begins to drop significantly is approximately 50.13%. This
system is not costly and acts in real time, making it a viable solution in a variety of settings. Future
(CNN) model for images, and sequential model (RNN) for text. Another enhancement possibility
is the use of more data, which would improve the vocabulary and potentially the accuracy of the
models. Finally, the app could be connected to a database that contains the carbohydrate
values for various food and produce items. Currently, the carb values for several common food
items have been manually entered into the app, and automating this step would increase the solution’s
versatility.
Acknowledgements
I would like to thank Ms. Martinez and Ms. Riley for their support, and Milton High School for
References
[1] Anorexia Nervosa in Teens: The Two-Year Window. Integrated Care Clinic, 2017 [URL]
[3] Child Eating Disorders on the Rise. Cable News Network (CNN), 2012 [URL]
[4] Eating Disorders: New Solutions. American Psychological Association, 2009 [URL]
[5] South Carolina Department of Mental Health. Eating Disorder Statistics, 2015 [URL]
[6] Study Shows Covid-19 Has Wide-Ranging Effects on Eating Disorder Concerns. Spectrum
News 1, 2021 [URL]