Professional Documents
Culture Documents
1 Introduction
The symptoms and signs of ASD appear mainly when they are children. It is a life-long disease [1]. A study
found that other than ASD almost 33% of children with difficulties have some ASD symptoms while not
fulfilling the classification criteria[24]. With the increase in the number of ASD cases worldwide, and the time
and money involved in diagnosing a patient, ASD impacts economic growth. Early detection of ASD helps by
providing early proper therapy and/or medication needed.On the other hand, the traditional clinical methods
like, Autism Diagnostic Interview Revised (ADIR) and Autism Diagnostic Observation Schedule Revised
(ADOS-R), are time consuming and cumbersome[12,13].
The young child who has delayed speech issues roughly scores 25% of the total ADI-R items because for the
patient the verbal sections cannot be answered correctly. Besides, ADI-R takes 90-150 min for conducting the
interviews. The detection of ASD by ADOS-R depends on measurements of the scoring of the answers given
along with the disadvantages of over-classifying children having other clinical disorders [5]. So, they tried to
find a solution that is less time-consuming and allows easy detection of disorders. Now-a-days, machine
learning has been applied to detect various diseases including depression [11] and ASD [1,6,16]. The main
contributions of this paper can be summarised as follows:
1. They analyze the attributes of Child, Toddler Adolescent and Adult ASD datasets, and identify
associations between the demographic information and ASD cases.
2. They explore benchmark feature selection methods and identify the method that performs best for all
four ASD datasets.
3. They compare state-of-the-art classification techniques and identify the best performing classifier for
all four ASD datasets [1].
2 Related work
A number of researches were made that adopted machine learning techniques for the development of the
diagnosis process of ASD [1,6,16]. The fundamental purpose is to reduce the detection time and improve the
diagnosis process [18]. They can categorize the ASD detection study in two groups: Video clip-based study and
AQ-based study.
1
classification technique scores the highest accuracy among all other benchmark techniques for v1 adult ASD
dataset. Later McNamara et al. [14] also classify the same dataset by applying Decision Tree and random forest
classifiers. The comparison results between these two classifiers show that the random forest results more
accuracy for version-1 adult ASD dataset. Another author Hossain et al. [10] found that the sequential minimal
optimization (SMO) classifier performs best on detecting child ASD cases. Only Thabtah et al. [23] used v2
three ASD datasets (Child, Adolescent and Adult) and have applied a rule-based machine learning technique for
classification. In contrast, the author of this paper has used all four v2 ASD datasets (Toddler, Child, Adolescent
and Adult) and applied 27 benchmark classification techniques to identify the best classifier. To enhance
classification accuracy, they further apply a feature engineering approach to identify a set of significant
features/attributes and use them in classification (instead of using all attributes).
3 Methodology
Figure 1 depicts an overview of our methodology for detecting ASD cases, briefly described below:
- Step 1: Data preprocessing and analysis. The ASD datasets [20] contain few records with missing values.Also
it contains some information that is not related to ASD. Thus, before applying classification, the datasets need to
be cleaned/preprocessed.
- Step 2: Apply benchmark classification techniques. In this paper, they applied 27 classification techniques to
all four ASD datasets and evaluated their performance in terms of accuracy and F measure through 10-fold cross
validation. Finally,authors select the top eight classifiers for further evaluation.
2
indicates that the individual does have ASD. However, for the toddler dataset, if
the total score≥ 4, it is considered that the subject has ASD.
Table 2 reports the number of ASD and non-ASD cases in the datasets. Figure 2a shows gender-wise class
distribution of the considered four ASD datasets. Here, the authors observe that the child and adolescent datasets
are balanced but toddler and adult datasets are not balanced.
3
Fig. 2: Analysis of toddler, child, adolescent and adult ASD datasets
4
Figure 3 depicts the distribution of ASD cases by various ethnicities for all four datasets. Table 3 shows that
there is a significant association between ethnicity and ASD cases for all four datasets (p < 05). The strength of
the association was found moderate for adolescent and adult whereas it was low for toddler and child.
F-Measure:The F-score (or F-measure) considers both the precision and the recall of the test to compute the
score. The traditional or balanced F-score(F1-score) is the harmonic mean of the precision and recall:
5
F Measure(F1)= 2* Precision / (Precision+Recall) (4)
6 Feature Engineering
The authors also have applied and compared five prominent feature selection methods such Information
gain,Chi-square test,Pearson Correlation,One-R and Releif F on four ASD datasets.
6
with the increment of the number of attributes. Therefore, they show that MLP and Logistic Regression (LR)
classifiers exhibit 100% accuracy for top ten attributes (which is the minimal number of attributes), MLPs
accuracy remains same (i.e, 100%) for toddler, child and adult datasets with the increment of the attributes,
whereas LR performance drops off after minimal attribute point. Thus, we argue that MLP outperforms among
all eight classifiers including LR.
7
Fig. 4: AQ-10 questions responses for ASD datasets
8 Conclusion
The ASD datasets of toddler, child, adolescent and adult are analyzed by the Authors. They applied the most
popular five feature selection methods to derive fewer features from ASD datasets and found that Relief F
feature selection method outperforms amongst others. They also find that MLP outperforms amongst all other
classifiers using their methodology and approach.
The Main limitation of this research is the small datasets.In future work, the goal of this paper’s author is to aim
at collecting large datasets and work with deep learning methods that integrate feature assessment and
classification together for improved performance.
8
References
1. Allison,C.,Auyeung,B.,Baron-Cohen,S.:Toward Brief “Red Flags” for Autism Screening: The Short Autism
Spectrum Quotient and the Short Quantitative Checklist in 1,000 Cases and 3,000 Controls.Tech.rep.(2012)
2. Baranwal,A.,Vanitha,M.:Autistic spectrum disorder screening: Prediction With Machine Learning
Models.2020, International Conference Emerging Trends inInformation Technology and Engineering(ic-ETITE)
pp. 1 -7 (2020)
3. Baron-Cohen,S.,Wheelwright,S.,Skinner,R.,Martin, J.,Clubley, E.: The Autism-Spectrum Quotient (AQ):
Evidence from Asperger Syndrome/High-Functioning Autism ,Males and Females, Scientists and
Mathematicians. Journal of Autism and Developmental Disorders 31(1), 5-17 (2001)
4. Basu,K.:Autism Detection in Adults Dataset. URL shorturl.at/ekyTW
5. Chawla, N.V.: Data Mining for Imbalanced Datasets: An Overview. In: Data Mining and Knowledge
Discovery Handbook, pp. 875-886. Springer US, Boston, MA(2009)
6. Duda, M., Ma, R., Haber, N., Wall, D.P.:Use Machine learning for behavioral distinction of autism and
ADHD. Translational Psychiatry 6(2), e732 -e732 (2016)
7. Erkan, U., Thanh, D.: Autism spectrum disorder detection with machine learning methods. Current Psychiatry
Reviews 15, 297-308 (2019)
8. Fischbach, G., Neuron, C.L.: The Simons Simplex Collection: a resource for identification of autism genetic
risk factors. Elsevier (2010)
9. Geschwind, D., Sowinski, J., Lord, C.: Letters to the Editor 463. The American Journal, 2001
10. Hossain, M.D., Kabir, M.A.: Detecting child autism using classification techniques. In: Proceedings of the
17th World Congress on Medical and Health informatics (MedInfo 2019), vol. 264, pp. 1447-1448. IOS(2019)
11. Islam,M.R.,Kabir,M.A.,Ahhmed, A., Kamal, A.R.M., Wang, H., Ulhaq, A.: Depression Detection from
social network data using machine learning techniques. Health information science and systems 6(1), 8 (2018)
12. Le Couteur, A.L., Gottesman, I., Bolton, P., Simono , E., Yuzda, E., Rutter, M., Bailey, A.: Autism as a
strongly genetic disorder evidence from a british twin Study.Psychological Medicine 25(1), 63-77 (1995)
13. Lord, C., Rutter, M., Le Couteur, A.: Autism Diagnostic Interview-Revised: A revised version of a
diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of
Autism and Developmental Disorders 24(5), 659-685 (1994)
14.McNamara, B., Lora, C., Yang, D., Flores, F., Daly, P.: Machine Learning Classi cation of Adults with
Autism Spectrum Disorder (accessed June 25, 2020). URL shorturl.at/hEFJY
15. Parikh, M.N., Li, H., He, L.: Enhancing diagnosis of autism with optimized machine learning model and
sonal characteristic data. Frontiers in Computational Neuroscience 13 (2019)
16. Pratap, A., Kanimozhiselvi, C.S., Vijayakumar, R., Pramod,K.V.: Soft Computing Models for the Predictive
Grading Of Childhood Autism-A Comparative Study. Tech. Rep. 4 (2014)
17. Raj, S., Masood, S.: Analysis and detection of autism spectrum disorder using machine learning techniques.
Procedia Computer Science 167, 994-1004 (2020)
18. Thabtah, F.: Autism Spectrum Disorder Screening: Machine Learning Adaptation and DSM-5 Fulfillment.
dl.acm.org Part F129311, 16 (2017)
19. Thabtah, F.: ASD Tests. A mobile app for ASD screening. (accessed January 10, 2020). URL https://www.
asdtests.com/
20. Thabtah, F.: ASD Dataset (accessed January 13, 2020). URL https://fadifayez.com/autism-datasets/
21. Thabtah, F.: ASD Dataset- UCI machine learning repository, 2017 (accessed January 13, 2020). URL
https: //archive.ics.uci.edu/ml
22. Thabtah, F., Abdelhamid, N., Peebles, D.: A machine learning autism classification based on logistic
regression analysis. Health Information Science and Systems 7(1), 111 (2019)
23. Thabtah, F., Peebles, D.: A new machine learning model based on induction of rules for autism detection.
Health informatics journal 26(1), 264-286 (2020)
24. Wiggins, L.D., Reynolds, A., Rice, C.E., Moody, E.J., Bernal, P., Blaskey, L., Rosenberg, S.A., Lee, L.C.,
Levy, S.E.: Using Standardized Diagnostic Instruments to Classify Children with Autism in the Study to
Explore Early Development. Journal of Autism and Developmental Disorders 45(5), 1271-1280 (2015)
9
25. Zunino, A., Morerio, P., Cavallo, A., Ansuini, C., Podda, J., Battaglia, F., Veneselli, E., Becchio, C.,
Murino, V.: Video Gesture Analysis for Autism Spectrum Disorder Detection. In: 24th International Conference
on Pattern Recognition (ICPR), vol. 2018-August, pp. 34213426. IEEE (2018)
10