
Autism Spectrum Disorder

Through Gut Microbiota

PAWAR AYUSH BABASAHEB


(120AD0026)

under the guidance of

Dr. K. Nagaraju

Department of Computer Science and Engineering
Indian Institute of Information Technology Design and Manufacturing Kurnool

(IIITDM, Kurnool) Design Project - 2023 November 30th, 2023


Outline

● Introduction
● Problem Statement
● Literature Survey
● Dataset
● Data Pre-processing
● Data Visualization
● Training
● Results
● References
Introduction

Figure 1 : Gut Brain Axis in ASD



Introduction

● Individuals with ASD may exhibit differences in the composition of
their gut microbiota compared to neurotypical individuals.
● The gut and brain are connected through the gut-brain axis,
which allows bidirectional communication. Changes in the gut
microbiota could potentially influence brain function and
behavior.
● Machine learning algorithms can analyze large datasets, including
those related to the gut microbiota, to identify patterns or
associations that may be difficult to detect with traditional methods.
● Using machine learning, we will develop predictive models that can
help identify individuals at risk of ASD based on their gut
microbiota profile. These models can potentially contribute to
early diagnosis and intervention.



Problem Statement

Let X be a matrix representing the abundance or presence of thousands of microbial
species in the gut of each individual.

Let Y be a binary vector indicating the presence (1) or absence (0) of Autism Spectrum
Disorder (ASD) for each corresponding individual.

The goal is to learn a function f that predicts Y from X, where:

● f(X) represents a measure indicating the relevance or association of the
microbial species with ASD.

● f(X) greater than the threshold indicates the presence of ASD.

● f(X) less than or equal to the threshold indicates the absence of ASD.
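
The thresholding rule above can be sketched in a few lines of Python. This is only an illustration: the logistic form of f, the weights `w`, the bias `b`, the threshold of 0.5, and the toy abundances are all assumptions made for the example, not values from this project.

```python
import math

def f(x, w, b):
    """Hypothetical scoring function: logistic score of one abundance profile."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def predict(X, w, b, threshold=0.5):
    """Return 1 (ASD) where f(x) > threshold, else 0 (no ASD)."""
    return [1 if f(x, w, b) > threshold else 0 for x in X]

# Toy data: 3 individuals x 2 microbial species (made-up abundances).
X = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
w = [2.0, -2.0]  # assumed weights, one per species
b = 0.0          # assumed bias

y_pred = predict(X, w, b)
print(y_pred)
```

A score exactly equal to the threshold falls on the "absence" side, matching the "less than or equal to" case in the problem statement.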



Literature Survey

Machine Learning Data Analysis Highlights the Role of
Parasutterella and Alloprevotella in Autism Spectrum
Disorders
Biomedicines, 2022.

● The paper investigates the role of the gut microbiota in Autism
Spectrum Disorder (ASD) and the challenges in identifying a typical
dysbiosis profile in ASD patients.
● The study collected 959 samples from eight publicly available
projects, including 540 ASD patients and 419 Healthy Controls (HC).
● Machine Learning (ML) algorithms, including Random Forest,
Support Vector Machine, and Gradient Boosting Machine, were
applied to create a predictor to discriminate between ASD and HC.
● ML algorithms identified five different genera, including
Parasutterella and Alloprevotella, as important in discriminating
between ASD and HC.



Literature Survey

Potential of gut microbiome for detection of autism spectrum
disorder
Microbial Pathogenesis 149 (2020) 104568

● The paper explores the potential of the gut microbiome for
detecting autism spectrum disorder (ASD), a neurodevelopmental
disorder characterized by abnormal social behaviors.
● The authors used principal component analysis (PCA) and random
forest analysis for biomarker identification.
● Several genera, including Prevotella, Roseburia, Ruminococcus,
Megasphaera, and Catenibacterium, were identified as potential
biomarkers of ASD.
● The random forest model showed the highest performance, with an
F1 score of 0.74 and an area under the curve of 0.827, indicating the
reliability and generalizability of the predictive model.



Literature Survey

Systematic review: Autism spectrum disorder and the gut
microbiota
Acta Psychiatr Scand, 2023.

● The paper is a systematic review of the interrelations between ASD
and the gut microbiota in children. It aims to elucidate the current
knowledge of the composition of gut microbiota in children with
ASD and explore potential biomarkers and therapeutic
interventions.
● The review included studies focusing on the gut microbiota
composition in children aged between 2 and 18 years with ASD.
● The articles were reviewed by a group of researchers who
held consensus meetings on article inclusion and exclusion.
● Higher levels of Proteobacteria, Actinobacteria, and Sutterella were
consistently observed in the gut microbiota of children with ASD
compared to healthy controls.



Datasets

Table 1 : Summary of Datasets

Dataset Name             ASD count   Neurotypical count   Total   Labeled Data
Meta Abundance Dataset   30          30                   60      yes
16S rRNA Dataset         143         111                  254     yes

Shape of meta_abundance dataset is: (5619, 61)
Shape of 16S rRNA dataset is: (1322, 256)



Data Pre-processing

Need for Pre-processing

● The dataset contains abundance data for thousands of microbial
taxa. Taxa with low abundance need to be removed, since such
redundant features can lead to inefficiencies and
inconsistencies.
● Abundance values can span very different ranges, so
normalization is used to bring them onto a common scale. This
keeps taxa with large raw values from dominating scale-sensitive
models such as Logistic Regression and SVM.
● Sample IDs are converted into a target vector, with 1 for ASD
and 0 for neurotypical.
● Categorical variables need to be converted into a numerical
format.



Data Pre-processing

Steps for Pre-processing of the dataset

● Filtering out low-abundance taxa: any taxon whose mean value
across all samples is less than 1 is dropped.
● Normalizing all abundance values to the range 0-1.
● After removing the target variable from the dataset, using the
taxonomy of the microbes as the index.
● Transposing the dataset.
● Creating the target vector from the sample IDs: IDs that start
with A are labeled ASD and IDs that start with B neurotypical.
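
The steps above could look roughly like the following pandas sketch. The column name `Taxonomy`, the helper name `preprocess`, the toy values, and the choice of per-taxon min-max scaling are assumptions for illustration; the slides state only that values are normalized between 0 and 1.

```python
import pandas as pd

def preprocess(raw: pd.DataFrame, taxonomy_col: str = "Taxonomy"):
    """Hypothetical helper: filter, scale, transpose, and label an abundance table."""
    # Use the taxonomy string as the index so only sample columns remain.
    abundance = raw.set_index(taxonomy_col).astype(float)
    # Drop taxa whose mean abundance across all samples is less than 1.
    abundance = abundance.loc[abundance.mean(axis=1) >= 1]
    # Min-max scale each taxon (row) to [0, 1]; constant rows become 0 (assumption).
    row_min = abundance.min(axis=1)
    row_range = abundance.max(axis=1) - row_min
    row_range = row_range.mask(row_range == 0, 1.0)
    scaled = abundance.sub(row_min, axis=0).div(row_range, axis=0)
    # Transpose so that rows are samples and columns are taxa.
    X = scaled.T
    # Sample IDs starting with "A" are ASD (1); the rest (here "B") are 0.
    y = pd.Series([1 if str(s).startswith("A") else 0 for s in X.index],
                  index=X.index, name="ASD")
    return X, y

# Made-up table: 3 taxa x 4 samples.
raw = pd.DataFrame({
    "Taxonomy": ["g__P", "g__Q", "g__R"],
    "A1": [10, 0.1, 4],
    "A2": [20, 0.2, 4],
    "B1": [0, 0.0, 4],
    "B2": [30, 0.1, 4],
})
X, y = preprocess(raw)
print(X)
print(y.tolist())
```

In this toy table `g__Q` is dropped because its mean is below 1, and the constant taxon `g__R` scales to 0 for every sample.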



Data Pre-processing

Figure 2.1 : Raw ASD Meta Abundance dataset

Figure 2.2 : After Pre-processing dataset



Data Visualization

Figure 3.1 : PCA of meta abundance dataset

Figure 3.2 : PCA of 16S rRNA dataset
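
For readers unfamiliar with PCA, a two-component projection like the ones plotted in these figures can be computed as sketched below. The function name `pca_project` and the random stand-in data are assumptions; the slides do not show the code used to produce the figures.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    # Center each feature so the components describe variance around the mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    coords = X_centered @ Vt[:n_components].T
    # Fraction of total variance explained by each component.
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, explained[:n_components]

rng = np.random.default_rng(0)
X_demo = rng.random((60, 20))  # synthetic stand-in: 60 samples x 20 taxa
coords, explained = pca_project(X_demo)
print(coords.shape, explained)
```

Plotting the two columns of `coords` against each other, colored by the ASD label, gives a scatter plot of the kind described by the captions.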



Data Visualization

Figure 4.1 : UMAP of 16S rRNA dataset

Figure 4.2 : UMAP of meta abundance dataset

UMAP: Uniform Manifold Approximation and Projection


Training

● Models were trained on the Meta Abundance dataset.
● We used k-fold cross-validation so that the whole dataset is
used for training.
● The algorithms trained were Logistic Regression, Random Forest,
Gradient Boost, and SVM.
● Random Forest and Gradient Boost were trained over an array
of candidate values for the number of trees.
● The number of trees with the highest accuracy was chosen for
training.
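
The procedure above (k-fold cross-validation plus a search over the number of trees) might be sketched with scikit-learn as follows. The synthetic data, the value k = 5, and the candidate tree counts are assumptions for illustration, not the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the pre-processed data: 60 samples x 30 features.
X, y = make_classification(n_samples=60, n_features=30,
                           n_informative=5, random_state=0)

cv = KFold(n_splits=5, shuffle=True, random_state=0)  # k = 5 is assumed
tree_counts = [10, 50, 100]                           # assumed candidates

mean_scores = {}
for n_trees in tree_counts:
    model = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    # Mean validation accuracy across the k folds for this tree count.
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    mean_scores[n_trees] = scores.mean()

# Keep the tree count with the highest mean cross-validated accuracy.
best_n_trees = max(mean_scores, key=mean_scores.get)
print(mean_scores, best_n_trees)
```

The same loop works for Gradient Boost by swapping in `GradientBoostingClassifier`, which also takes an `n_estimators` parameter.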

Figure 5.1 : Scores of models on the Meta Abundance dataset
Training

Figure 5.2 : Plot of model scores on the Meta Abundance dataset
(models shown: Logistic Regression, SVM, Random Forest, Gradient Boost)



Training

● Similarly, models were trained on the 16S rRNA dataset.
● 5-fold cross-validation was used.
● Random Forest and Gradient Boost were trained over an array of
candidate values for the number of trees.
● The number of trees with the highest accuracy was chosen for training.

Figure 6.1 : Scores of models on the 16S rRNA dataset
Training

Figure 6.2 : Plot of model scores on the 16S rRNA dataset
(models shown: Logistic Regression, SVM, Random Forest, Gradient Boost)
Results

● After the training phase, we found that the Random Forest
and Gradient Boost algorithms perform well on both
datasets.
● For further testing, we split each pre-processed dataset
into train and test sets, with 80% of the data for training
and 20% for testing.
● We trained Random Forest and Gradient Boost models on
the training sets of the Meta Abundance and 16S rRNA
datasets.
● The trained models were then used to predict on the test sets.
● The best-performing model achieved an F1 score of 0.93.
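
The hold-out evaluation described above can be sketched with scikit-learn as shown here. The synthetic data, the random seed, and the 100-tree setting are assumptions; only the 80/20 split and the use of the F1 score come from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same sample count as the 16S rRNA dataset.
X, y = make_classification(n_samples=254, n_features=30,
                           n_informative=5, random_state=0)

# 80% of the samples for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)  # assumed setting
model.fit(X_train, y_train)

# Predict on the held-out samples and score with F1.
y_pred = model.predict(X_test)
test_f1 = f1_score(y_test, y_pred)
print(len(X_train), len(X_test), round(test_f1, 2))
```

With 254 samples, a 20% test fraction gives 203 training and 51 test samples. The F1 value printed here reflects the synthetic data only and says nothing about the project's results.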



Results
Table 2 : Train-test split of datasets

                 Meta Abundance dataset   16S rRNA dataset   Total
Train Samples    48                       203                251
Test Samples     12                       51                 63

Table 3 : Accuracy of Gradient Boost and Random Forest

                 Meta Abundance dataset   16S rRNA dataset
Gradient Boost   0.75                     0.80
Random Forest    0.83                     0.90



Results

Table 4 : ASD and HC discriminating genera in Meta abundance dataset


g__Prevotella;s__Prevotella sp. CAG:1320

g__Bacteroides;s__Bacteroides sp. CAG:633

g__Prevotella;s__Prevotella amnii

g__Bacteroides;s__Bacteroides sp. Marseille-P2824

g__Prevotella;s__Prevotella sp. P5-125

Table 5 : ASD and HC discriminating genera in 16s RNA dataset


_g__Ruminiclostridium__s__[Eubacterium]_siraeum

_g__[Eubacterium]_xylanophilum_group

g__Ruminiclostridium

_g__Blautia;_s__Ruminococcus_sp

_g__Ruminococcaceae_
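
The slides do not state how the discriminating taxa in Tables 4 and 5 were selected. One common approach, shown here purely as an assumption, is to rank features by the impurity-based importances of a fitted Random Forest. The taxon names and data below are made up.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_samples = 100
y = rng.integers(0, 2, size=n_samples)  # made-up ASD (1) / HC (0) labels

# Hypothetical taxa: one tracks the label closely, the others are noise.
X = pd.DataFrame({
    "g__signal": y + rng.normal(0, 0.1, n_samples),
    "g__noise_1": rng.normal(0, 1, n_samples),
    "g__noise_2": rng.normal(0, 1, n_samples),
})

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank taxa by importance and keep the top five (or fewer, if fewer exist).
importances = pd.Series(model.feature_importances_, index=X.columns)
top_taxa = importances.sort_values(ascending=False).head(5)
print(top_taxa)
```

Impurity-based importances can favor features with many distinct values, so a ranking like this is best treated as a starting point rather than a definitive list of biomarkers.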



Results

Figure 7.1 : Gradient Boost confusion matrix on Meta Abundance dataset

Figure 7.2 : Random Forest confusion matrix on Meta Abundance dataset



Results

Figure 8.1 : Gradient Boost confusion matrix on 16S rRNA dataset

Figure 8.2 : Random Forest confusion matrix on 16S rRNA dataset
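
Accuracy and F1 follow directly from the four cells of a binary confusion matrix. The short sketch below shows the arithmetic; the helper name `metrics_from_confusion` and the counts are invented for illustration and are not read from the figures.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts for a 20-sample test set (not the project's results).
m = metrics_from_confusion(tn=8, fp=2, fn=1, tp=9)
print(m)
```

Here ASD is treated as the positive class, so `tp` counts ASD samples correctly predicted as ASD.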



References

❖ Daniele Pietrucci, Adelaide Teofani, Marco Milanesi, Bruno Fosso, Lorenza
Putignani, Francesco Messina, Graziano Pesole, Alessandro Desideri, and
Giovanni Chillemi. “Machine Learning Data Analysis Highlights the Role of
Parasutterella and Alloprevotella in Autism Spectrum Disorders.”
Biomedicines, 2022.

❖ Tong Wu, Hongchao Wang, Wenwei Lu, Qixiao Zhai, Qiuxiang Zhang,
Weiwei Yuan, Zhennan Gu, Jianxin Zhao, Hao Zhang, and Wei Chen. “Potential
of gut microbiome for detection of autism spectrum disorder.” Microbial
Pathogenesis 149 (2020) 104568.

❖ Jenni Korteniemi, Linnea Karlsson, and Anna Aatsinki. “Systematic review:
Autism spectrum disorder and the gut microbiota.” Acta Psychiatrica
Scandinavica, published by John Wiley & Sons Ltd, 2023.



References

❖ Mingbang Wang, Ceymi Doenyas, Jing Wan, Shujuan Zeng, Chunquan Cai,
Jiaxiu Zhou, Yanqing Liu, Zhaoqing Yin, and Wenhao Zhou. “Virulence
factor-related gut microbiota genes and immunoglobulin A levels as novel
markers for machine learning-based classification of autism spectrum
disorder.” Published by Elsevier B.V. on behalf of Research Network of
Computational and Structural Biotechnology, 2020.



THANK YOU
