
This full text paper was peer-reviewed at the direction of IEEE Instrumentation and Measurement Society prior to the acceptance and publication.

A Machine Learning Approach for Grading Autism Severity Levels Using Task-based Functional MRI
Reem Haweel¹,², Omar Dekhil¹, Ahmed Shalaby¹, Ali Mahmoud¹, Mohammed Ghazal¹,³, Robert Keynton¹, Gregory Barnes⁴, and Ayman El-Baz¹
¹ Bioengineering Department, University of Louisville, Louisville, KY, USA.
² Faculty of Computer and Information Sciences, University of Ain Shams, Cairo, Egypt.
³ Department of Electrical and Computer Engineering, Abu Dhabi University, Abu Dhabi.
⁴ Department of Neurology, University of Louisville, Louisville, KY, USA.

Abstract—Autism is a developmental disorder associated with difficulties in communication and social interaction. The autism diagnostic observation schedule (ADOS) is considered the gold standard in autism diagnosis; it estimates a score describing the severity level for each individual. Currently, brain imaging modalities are being investigated for the development of objective technologies to diagnose autism spectrum disorder (ASD). Alterations in functional activity are believed to be important in explaining autism causative factors. This paper presents a machine learning approach for grading the severity level of autistic subjects using task-based functional MRI data. The local features related to the functional activity of the brain are obtained from a speech experiment. According to ADOS reports, the adopted dataset is classified into three groups: mild, moderate, and severe. Our analysis is divided into two parts: (i) individual subject analysis and (ii) higher-level group analysis. We use the individual analysis to extract the features used in classification, while the higher-level analysis is used to infer the statistical differences between groups. The obtained classification accuracy is 78% using the random forest classifier.

Index Terms—Autism; Task-based functional MRI; Random Forest

I. INTRODUCTION

Autism spectrum disorder (ASD) is a neurodevelopmental disorder associated with a range of symptoms that differ in severity, such as social, sensorimotor, and communicative deficits [1]–[3]. ASD is diagnosed at the age of three, but some characteristics can be noticed as early as 12 months, especially with the progress of medical imaging and the latest state-of-the-art machine learning approaches [4], [5]. Different modalities, such as structural magnetic resonance imaging (sMRI), functional magnetic resonance imaging (fMRI), and diffusion tensor imaging (DTI), are widely used for analysing the brain's structural and functional characteristics [6], [7].

Task-based fMRI is commonly used in all brain regions to capture evoked blood-oxygen-level-dependent (BOLD) signals in response to certain tasks within a range of different task domains [8]. Basic fMRI tasks include motor tasks, visual processing tasks, auditory and language tasks, and fundamental social processing tasks [9].

Prior studies used the general linear model (GLM) [10] to systematically analyze the hemodynamic effects associated with task-control. In recent literature, group analysis has been applied to responses to auditory and language tasks to identify frequently activated brain areas in both ASD subjects and healthy controls. To analyze extreme repetitive behaviour, the canonical HRF and the SPM2 package are used in [11]. The authors showed that the prefrontal cortex (specifically, the left inferior parietal regions and the right prefrontal-premotor) is hyper-activated in autism subjects. Lombardo et al. [12] conducted a study on three classes: typically developing infants, ASD infants with good language outcome, and ASD infants with relatively poor language outcome. Results showed similar activation in the first two classes in the language-sensitive superior temporal cortices, whereas less activation than in neurotypical subjects was observed for those with poorer language outcome.

Nowadays, brain imaging modalities are a powerful tool for diagnosing autism. Structural MRI and resting-state fMRI are the imaging modalities most commonly adopted for classification systems. While task fMRI offers predictive data on brain functional impairment and biomarkers, there have been few attempts in the literature to identify ASD using task fMRI. Chanel et al. applied multivariate pattern analysis (MVPA) on two different fMRI experiments with social stimuli [13]. They adopted support vector machines (SVMs) and recursive feature elimination (RFE) to classify between 15 ASD and 14 control subjects, and accuracy ranged from 69% to 92.3%. In [14], another trial demonstrated the efficacy of various methods to train generalizable recurrent neural networks from small datasets to classify children with ASD versus typical control subjects from task-based fMRI scans. The accuracy of the classification ranged from 51.8% to 69.8%.

Previous studies focused on analyzing or diagnosing autism with two groups, ASD or normal, which is not sufficient to address differences between subjects across the wide autism spectrum. While [13] applied experiments on adults between the ages of 19 and 53 and the subject age in [14] is 6.05±1.24 years, we include children between the ages of 12 and 27 months in our study for the sake of early detection of ASD.

The objective of this work is to use brain processing and analysis tools in addition to modern machine learning algorithms to build a more objective computer-aided diagnosis (CAD) system, resulting in a robust early treatment plan for

978-1-7281-3868-8/19/$31.00 ©2019 IEEE


each ASD patient individually according to their level of autism severity, i.e., the concept of personalized medicine.

II. MATERIALS AND METHODS

Fig. 1 illustrates the general machine learning framework used to grade the autism severity level of each subject as mild, moderate, or severe through the analysis of task-based fMRI. The calibrated severity scores (CSS) for the toddler module, obtained from the raw total domain scores of the autism diagnostic observation schedule (ADOS) [15], vary from 0 to 10. To represent three grades, the CSS are divided into three classes: (i) mild (CSS: 1-4), (ii) moderate (CSS: 5-7), and (iii) severe (CSS: 8-10). Three matched groups are included in this study, each comprising 10 subjects from the corresponding class. The task fMRI images are recorded in a speech experiment that includes three audio stimuli: simple forward speech, complex forward speech, and backward speech, over 6 minutes 20 seconds. Fig. 2 illustrates the speech experiment used to record the fMRI scans.

A. Data preprocessing

To perform the preprocessing pipeline, we have used the fMRI expert analysis tool (FEAT) [16], [17], part of the FMRIB software library (FSL) [18]. The pipeline consists of the following steps:
1) Slice timing correction with interleaved order, to restore the order of the volume slices after the effect of recording at different time points [19].
2) Motion correction using MCFLIRT [20], to remove the effect of motion during scanning by applying rigid-body transformations with 12 degrees of freedom (DOF).
3) Spatial smoothing using a Gaussian window with a full width at half maximum (FWHM) of 5 mm.
4) High-pass temporal filtering (100 s), to remove scanner drifts and low-frequency artefacts.
5) Brain extraction using BET, to remove the skull from the sMRI scan.
6) Two-step registration. First, register the functional volume to its high-resolution sMRI scan. Second, register the sMRI scan to MNI-152 space with 12 DOF [21].

B. Multi-Level General Linear Model (GLM)

Consider an experiment with $N$ subjects, where for each subject $k$ there is a vector of $T$ time points, $Y_k$. The first-level GLM is defined as:

\[ Y_k = X_k \beta_k + \epsilon_k \tag{1} \]

where $X_k$ is the design matrix, $\beta_k$ are the parameter estimates, and $\epsilon_k$ is the error term (subject residuals). The residual is assumed to have zero mean, $E(\epsilon_k) = 0$. Putting equation (1) in its matrix form, it becomes:

\[ Y = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_N \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_N \end{bmatrix} \tag{2} \]

or, compactly, $Y = X\beta + \epsilon$, where $X$ is the block-diagonal matrix of the $X_k$. By adding the second level of analysis at the group level, we have:

\[ \beta = X_G \beta_G + \eta \tag{3} \]

where $X_G$ is the group-level design matrix, which separates the two groups (controls and patients), $\beta_G$ is the vector of group-level parameters, and $\eta$ is the vector of group-level residuals. $\eta$ is also assumed to have zero mean, $E(\eta) = 0$. Substituting (3) into the matrix form (2), we get:

\[ Y = X X_G \beta_G + \gamma \tag{4} \]

where

\[ \gamma = X\eta + \epsilon \tag{5} \]

$\gamma$ also has zero mean, $E(\gamma) = 0$. The covariance of $\gamma$ is $\mathrm{cov}(\gamma) = W = X V_G X^T + V$, where $V$ is the covariance of $\epsilon$ and $V_G$ is the covariance of $\eta$.

Using the generalized least squares approach [22], the first-level parameters can be estimated as:

\[ \hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} Y \tag{6} \]

and

\[ \mathrm{cov}(\hat{\beta}) = (X^T V^{-1} X)^{-1}. \tag{7} \]

Using the same approach at the group level, i.e., applying it to model (3) with the first-level estimates $\hat{\beta}$ as the data, the group parameters are given by:

\[ \hat{\beta}_G = (X_G^T V_G^{-1} X_G)^{-1} X_G^T V_G^{-1} \hat{\beta} \tag{8} \]

and

\[ \mathrm{cov}(\hat{\beta}_G) = (X_G^T V_G^{-1} X_G)^{-1}. \tag{9} \]

After estimating the parameters for each voxel, it is important to check which voxels are significant with respect to some contrasts and conditions. The most common calculation for testing voxel significance is the paired z-test [23], [24]. The reader is referred to [23] for more details about the GLM.

In this work, we perform first-level analysis and higher-level group analysis to model the three regressors in the GLM corresponding to the three audio stimuli in the speech experiment. The first-level analysis is applied to get activation information about each brain voxel, as a preliminary step to extract our features for classification. The higher-level analysis is applied to examine the significantly activated voxels in each group, together with the overall group differences. Such analysis gives us insight into the existence of significant common activations or differences between the three groups, to support the viability of applying machine learning classification.

C. Feature Extraction and Selection

We have examined different features to test which one is more discriminating. First, we examine both the mean of all three GLM parameter estimates corresponding to the three audio stimuli and the mean z-stat of these regressors. The Brainnetome atlas (BNT) [25] is used to map the brain to 246 areas. Second, in order to represent these features for each brain area, we have applied different calculations across all voxels in each brain area. To represent the mean of the
parameter estimates for each area, we have tested: the mean and standard deviation, the average positive and negative values, and the count of scaled values above a suitably defined threshold. To represent the z-stat for each area, a histogram with 6 bins is calculated to define the percentage of z-stat intensity values within each interval. Given that the z-stat map varies between 0 and 6, the intervals used are (0 <= |z| < 1, 1 <= |z| < 2, 2 <= |z| < 3, 3 <= |z| < 4, 4 <= |z| < 5, |z| >= 5). As a result, we have created different feature vectors of the 246 areas, each with 6 or 2 features. Each feature vector is to be examined for feature selection and classification.

To reduce the number of features and the dimensionality, and to detect the significant features, we have applied a feature selection algorithm prior to classification. Recursive feature elimination (RFE) [26] is the algorithm applied for feature selection in our study. Different versions of the random forest classifier, together with the support vector machine, are chosen as the classifiers to fit the selection model to the data. To avoid overfitting on the small dataset, the RFE is run in a 4-fold cross-validation experiment to extract and select important features across the training sets, validated with the test sets randomly. Features are sorted, then the least important features are removed recursively. The RFE is run over one hundred iterations and the frequency of selection by each classifier for each feature is calculated. Then the features with the higher selection frequency are selected for classification.

Fig. 1. Block diagram of the proposed framework for grading autistic subject's severity level using task-based fMRI images

Fig. 2. Representation of speech experiment with three audio stimuli: Complex forward speech, Simple forward speech, backward speech which are alternating repeatedly along 6 min 20 sec.

III. EXPERIMENTAL RESULTS AND DISCUSSION

The dataset included in this study is retrieved from the "Biomarkers of Autism at 12 Months: From Brain Overgrowth to Genes" dataset at the National Database for Autism Research (NDAR: http://ndar.nih.gov). The subject selection criterion is the availability of ADOS toddler module reports. 30 subjects were included (10 mild, 10 moderate, and 10 severe) with ages ranging from 12 to 27 months (mean: 20.13 months, std: 5.07 months) for early diagnosis. Each subject has a T1 structural MRI scan and speech fMRI scans acquired using a Signa HDxt 1.5T GE Healthcare scanner. Each fMRI scan consists of 154 volumes with TR = 2.5 seconds and TE = 30 milliseconds. Each volume consists of 31 slices. The slice acquisition pattern is alternating in the plus direction. The following reports the results of the
statistical inference with higher-level analysis for each group and between all three groups, as well as the classification accuracies.

A. Classification Results

In this experiment, we examine three different classifiers fed with the selected features. The classifiers used are random forest (RF), support vector machine (SVM), and multi-layer perceptron (MLP), with randomly selected hyperparameters, on multiple runs. A 4-fold cross-validation technique is used to calculate the classification accuracies. Table I shows the classification accuracy of each classifier on each feature type. The best achieved accuracy is 78%, obtained with the random forest classifier on the brain areas' mean and standard deviation of the GLM parameters.

TABLE I
THE CLASSIFICATION ACCURACIES FOR EACH FEATURE VECTOR OF THE GLM PARAMETERS MEAN (β) AND Z-STAT

Feature vector                          RF     SVM    MLP
Mean and std per area (mean(β))         0.78   0.57   0.71
Mean positive and negative (mean(β))    0.61   0.41   0.61
Scaled (mean(β))                        0.71   0.51   0.71
Histogram z-stat                        0.69   0.54   0.69

B. Higher Level Analysis

The higher-level modeling is done using FMRIB's local analysis of mixed effects (FLAME) [18]. Group analysis reveals common activation patterns for each group. Fig. 3 shows the significant activation pattern for each of the three groups. In addition, it shows the significant differences in activation between them with three contrasts (mild>moderate, mild>severe, moderate>severe). Such results reveal the difference in activation between voxels and groups, which explains the good classification results.

Fig. 3. (a): significant differences in activation between three contrasts. (b): activation pattern of mild, moderate, and severe.

IV. CONCLUSION AND FUTURE WORK

In this paper, a machine learning based framework for ASD severity grading is introduced. Several types of feature extraction and representation, feature selection using recursive feature elimination, and classification algorithms have been extensively examined. The random forest classifier outperformed the other classifiers with an accuracy of 78%. We have also applied group analysis to study common group brain activation and differences, to support the viability of using machine learning on the included dataset. Our future work will mainly include more subjects, different experiments, and different brain modalities to implement more comprehensive research on significant differences between autism subjects in correlation with ADOS reports, in order to develop personalized diagnosis and early treatment plans.

V. ACKNOWLEDGMENT

The authors would like to thank ADNOC for supporting this research.

REFERENCES

[1] D. G. Amaral, C. M. Schumann, and C. W. Nordahl, “Neuroanatomy of autism,” Trends in Neurosciences, vol. 31, no. 3, pp. 137–145, 2008.
[2] O. Dekhil, M. Ali, Y. El-Nakieb, A. Shalaby, A. Soliman, A. Switala, A. Mahmoud, M. Ghazal, H. Hajjdiab, M. F. Casanova, A. Elmaghraby, R. Keynton, A. El-Baz, and G. Barnes, “A personalized autism diagnosis CAD system using a fusion of structural MRI and resting-state functional MRI data,” Frontiers in Psychiatry, vol. 10, p. 392, 2019. [Online]. Available: https://www.frontiersin.org/article/10.3389/fpsyt.2019.00392
[3] O. Dekhil, M. Ali, A. Shalaby, A. Mahmoud, A. Switala, M. Ghazal, H. Hajidiab, B. Garcia-Zapirain, A. Elmaghraby, R. Keynton, G. Barnes, and A. El-Baz, “Identifying personalized autism related impairments using resting functional MRI and ADOS reports,” in Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-López, and G. Fichtinger, Eds. Cham: Springer International Publishing, 2018, pp. 240–248.
[4] M. F. Casanova et al., Autism Imaging and Devices. CRC Press, 2017.
[5] M. F. Casanova, A. El-Baz, and J. S. Suri, in Autism Imaging and Devices, 2017.
[6] M. M. Ismail, R. S. Keynton, M. M. Mostapha, A. H. ElTanboly, M. F. Casanova, G. L. Gimel’farb, and A. El-Baz, “Studying autism spectrum disorder with structural and diffusion magnetic resonance imaging: a survey,” Frontiers in Human Neuroscience, vol. 10, p. 211, 2016.
[7] Y. ElNakieb, M. Nitzken, A. Shalaby, O. Dekhil, A. Mahmoud, A. Switala, A. Elmaghraby, R. Keynton, M. Ghazal, A. Khalil, G. Barnes, and A. El-Baz, “Towards personalized autism diagnosis: Promising results,” in 2018 24th International Conference on Pattern Recognition (ICPR), Aug 2018, pp. 3862–3867.
[8] J. D. Van Horn, J. S. Grethe, P. Kostelec, J. B. Woodward, J. A. Aslam, D. Rus, D. Rockmore, and M. S. Gazzaniga, “The functional magnetic resonance imaging data center (fMRIDC): the challenges and rewards of large-scale databasing of neuroimaging studies,” Philosophical Transactions of the Royal Society of London B: Biological Sciences, vol. 356, no. 1412, pp. 1323–1339, 2001.
[9] M. F. Casanova, A. S. El-Baz, J. S. Suri et al., Imaging the Brain in Autism. Springer, 2013.
[10] K. J. Friston, A. P. Holmes, K. J. Worsley, J.-P. Poline, C. D. Frith, and R. S. Frackowiak, “Statistical parametric maps in functional imaging: a general linear approach,” Human Brain Mapping, vol. 2, no. 4, pp. 189–210, 1994.
[11] M. Gomot, M. K. Belmonte, E. T. Bullmore, F. A. Bernard, and S. Baron-Cohen, “Brain hyper-reactivity to auditory novel targets in children with high-functioning autism,” Brain, vol. 131, no. 9, pp. 2479–2488, 2008.
[12] M. V. Lombardo, K. Pierce, L. T. Eyler, C. C. Barnes, C. Ahrens-Barbeau, S. Solso, K. Campbell, and E. Courchesne, “Different functional neural substrates for good and poor language outcome in autism,” Neuron, vol. 86, no. 2, pp. 567–577, 2015.
[13] G. Chanel, S. Pichon, L. Conty, S. Berthoz, C. Chevallier, and J. Grèzes, “Classification of autistic individuals and controls using cross-task characterization of fMRI activity,” NeuroImage: Clinical, vol. 10, pp. 78–88, 2016.
[14] N. C. Dvornek, D. Yang, P. Ventola, and J. S. Duncan, “Learning generalizable recurrent neural networks from small task-fMRI datasets,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2018, pp. 329–337.
[15] K. Gotham, A. Pickles, and C. Lord, “Standardizing ADOS scores for a measure of severity in autism spectrum disorders,” Journal of Autism and Developmental Disorders, vol. 39, no. 5, pp. 693–705, 2009.
[16] O. Dekhil, M. Ismail, A. Shalaby, A. Switala, A. Elmaghraby, R. Keynton, G. Gimel’farb, G. Barnes, and A. El-Baz, “A novel CAD system for autism diagnosis using structural and functional MRI,” in 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), April 2017, pp. 995–998.
[17] M. W. Woolrich, B. D. Ripley, M. Brady, and S. M. Smith, “Temporal autocorrelation in univariate linear modeling of fMRI data,” Neuroimage, vol. 14, no. 6, pp. 1370–1386, 2001.
[18] M. Jenkinson, C. F. Beckmann, T. E. Behrens, M. W. Woolrich, and S. M. Smith, “FSL,” Neuroimage, vol. 62, no. 2, pp. 782–790, 2012.
[19] O. Dekhil, H. Hajjdiab, A. Shalaby, M. T. Ali, B. Ayinde, A. Switala, A. Elshamekh, M. Ghazal, R. Keynton, G. Barnes, and A. El-Baz, “Using resting state functional MRI to build a personalized autism diagnosis system,” PLOS ONE, vol. 13, no. 10, pp. 1–22, 10 2018. [Online]. Available: https://doi.org/10.1371/journal.pone.0206351
[20] M. Jenkinson, P. Bannister, M. Brady, and S. Smith, “Improved optimization for the robust and accurate linear registration and motion correction of brain images,” Neuroimage, vol. 17, no. 2, pp. 825–841, 2002.
[21] J. L. Lancaster, D. Tordesillas-Gutiérrez, M. Martinez, F. Salinas, A. Evans, K. Zilles, J. C. Mazziotta, and P. T. Fox, “Bias between MNI and Talairach coordinates analyzed using the ICBM-152 brain template,” Human Brain Mapping, vol. 28, no. 11, pp. 1194–1205, 2007. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/hbm.20345
[22] S. R. Searle, G. Casella, and C. E. McCulloch, Variance Components. John Wiley & Sons, 2009, vol. 391.
[23] C. F. Beckmann, M. Jenkinson, and S. M. Smith, “General multilevel linear modeling for group analysis in fMRI,” Neuroimage, vol. 20, no. 2, pp. 1052–1063, 2003.
[24] Y. Zang, T. Jiang, Y. Lu, Y. He, and L. Tian, “Regional homogeneity approach to fMRI data analysis,” Neuroimage, vol. 22, no. 1, pp. 394–400, 2004.
[25] L. Fan, H. Li, J. Zhuo, Y. Zhang, J. Wang, L. Chen, Z. Yang, C. Chu, S. Xie, A. R. Laird et al., “The human Brainnetome atlas: a new brain atlas based on connectional architecture,” Cerebral Cortex, vol. 26, no. 8, pp. 3508–3526, 2016.
[26] P. M. Granitto, C. Furlanello, F. Biasioli, and F. Gasperi, “Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products,” Chemometrics and Intelligent Laboratory Systems, vol. 83, no. 2, pp. 83–90, 2006.
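
The three-way severity grading described in Section II (mild for CSS 1-4, moderate for CSS 5-7, severe for CSS 8-10) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `grade_severity` is hypothetical, and rejecting scores outside 1-10 is an assumption made here, since the text states a CSS range of 0 to 10 but assigns no grade to a score of 0.

```python
from collections import Counter


def grade_severity(css: int) -> str:
    """Map an ADOS calibrated severity score (CSS) to a severity grade.

    Thresholds follow Section II: mild 1-4, moderate 5-7, severe 8-10.
    Rejecting every other value (including 0) is an assumption of this sketch.
    """
    if isinstance(css, bool) or not isinstance(css, int):
        raise TypeError("CSS must be an integer")
    if 1 <= css <= 4:
        return "mild"
    if 5 <= css <= 7:
        return "moderate"
    if 8 <= css <= 10:
        return "severe"
    raise ValueError(f"CSS {css} has no grade in this scheme")


if __name__ == "__main__":
    # Made-up scores for illustration only; these are not data from the study.
    example_scores = [2, 4, 5, 7, 8, 10]
    print(Counter(grade_severity(score) for score in example_scores))
```

Keeping the thresholds in one function makes the class boundaries explicit, which matters when the grades are later used as classification labels.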

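
The 6-bin z-stat histogram feature of Section II-C can be sketched as follows, using the intervals given in the text (0 <= |z| < 1 up to |z| >= 5). This is a hedged illustration rather than the authors' implementation: the names `zstat_histogram` and `region_features`, the region labels, and the toy values are hypothetical, and the function returns fractions in [0, 1] rather than percentages.

```python
from typing import Dict, List, Sequence

N_BINS = 6  # [0,1), [1,2), [2,3), [3,4), [4,5), [5, inf)


def zstat_histogram(z_values: Sequence[float]) -> List[float]:
    """Return the fraction of voxels whose |z| falls in each of the 6 bins."""
    if len(z_values) == 0:
        raise ValueError("region contains no voxels")
    counts = [0] * N_BINS
    for z in z_values:
        # int() truncates |z| to its unit interval; all |z| >= 5 share the last bin.
        counts[min(int(abs(z)), N_BINS - 1)] += 1
    total = len(z_values)
    return [count / total for count in counts]


def region_features(z_by_region: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """Compute one 6-value histogram per atlas region."""
    return {name: zstat_histogram(values) for name, values in z_by_region.items()}


if __name__ == "__main__":
    # Toy z-stat values for two hypothetical regions; not data from the study.
    toy = {
        "region_a": [0.2, -1.5, 2.7, 5.9, 7.1, 0.9],
        "region_b": [3.3, 4.8],
    }
    for name, feats in region_features(toy).items():
        print(name, [round(f, 3) for f in feats])
```

Applied to the 246 Brainnetome areas, this would yield 246 histograms of 6 values each, matching the "6 features per area" variant described in the text.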