ARTICLE INFO

Article history:
Received 31 October 2018
Revised 21 December 2018
Accepted 28 December 2018

Keywords:
Fatty liver disease
Machine learning
Classification model
Random forest

ABSTRACT

Background and objective: Fatty liver disease (FLD) is a common clinical complication associated with high morbidity and mortality. Early prediction of FLD provides an opportunity to plan an appropriate strategy for prevention, early diagnosis and treatment. We aimed to develop a machine learning model to predict FLD that could assist physicians in classifying high-risk patients and in diagnosing, preventing and managing FLD.
Methods: We included all patients who had an initial fatty liver screening at the New Taipei City Hospital between 1st and 31st December 2009. Classification models, namely random forest (RF), Naïve Bayes (NB), artificial neural networks (ANN), and logistic regression (LR), were developed to predict FLD. The area under the receiver operating characteristic curve (AUROC) was used to compare the performance of the four models.
Results: A total of 577 patients were included in this study; of those, 377 had fatty liver. The AUROC of RF, NB, ANN, and LR with 10-fold cross-validation was 0.925, 0.888, 0.895, and 0.854, respectively. The accuracy of RF, NB, ANN, and LR was 87.48, 82.65, 81.85, and 76.96%, respectively.
Conclusion: In this study, we developed and compared four classification models to predict fatty liver disease. The random forest model showed higher performance than the other classification models. Implementation of a random forest model in the clinical setting could help physicians to stratify fatty liver patients for primary prevention, surveillance, early treatment, and management.
© 2018 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.cmpb.2018.12.032
0169-2607/© 2018 Elsevier B.V. All rights reserved.
24 C.-C. Wu, W.-C. Yeh and W.-D. Hsu et al. / Computer Methods and Programs in Biomedicine 170 (2019) 23–29
Fig. 2. Feature selection. Information gain ranking was used to evaluate the worth of each variable by measuring its entropy gain with respect to the outcome; attributes were then ranked by their individual evaluations (left to right).
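The information gain ranking described in the Fig. 2 caption can be illustrated with a short sketch. This is not the authors' implementation: the function names, the toy data, and the feature names `high_triglyceride` and `male` are hypothetical, and the sketch assumes features that are already discrete (continuous laboratory values would need to be binned first).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Outcome entropy minus the expected entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = fatty liver, 0 = no fatty liver.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
toy_columns = {
    "high_triglyceride": [1, 1, 1, 1, 0, 0, 0, 0],  # separates the classes perfectly
    "male": [1, 0, 1, 0, 1, 0, 1, 0],               # carries no information here
}
ranking = rank_features(toy_columns, outcome)
```

In this toy example the first feature removes all uncertainty about the outcome (a gain of 1 bit) and the second removes none, so the ranking places `high_triglyceride` first.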
… logistic regression (LR) were used to develop the prediction models. We considered these four models because of the following characteristics.

Random forest (RF) is an ensemble classification algorithm composed of a multitude of decision trees; it was developed by Leo Breiman and Adele Cutler in 1999 [12]. Each tree is built independently by applying the general technique of bootstrap aggregating (i.e. bagging), in which a randomly selected sample is used as the training set. The final result is determined by a simple majority vote of all trees. RF has proven to be a highly accurate algorithm in various fields, including medical diagnosis.

Artificial neural networks (ANN) are computational models that emulate biological neural networks. They are a very powerful non-linear modelling approach that has already been shown to give accurate predictions in many CDS [13]. The model consists of a number of artificial neural units called "perceptrons" [14]. An ANN resembles a biological neural cell, in which the signal is transmitted into the neuron through dendrites: it simulates signal transmission through an input layer to several hidden layers, and finally to an output layer. Each layer comprises many perceptrons, and the perceptrons in adjacent layers are connected by weights that are adjusted during training. Through this iterative process, the network learns automatically from the samples in the training dataset until each input maps to the correct output, in order to achieve the best prediction.

Naive Bayes (NB) is a generative model that makes dealing with missing values much easier [15]. It is a classification model that predicts a class label $y$ given a feature vector $x = [x_1, x_2, x_3, \dots, x_d]^T$, and it can make an inference on a new sample $x_{new} = [x_1, x_2, x_3, \dots, x_d]^T$ with a missing feature $x_m$. It is nevertheless a very powerful model that returns not only the prediction but also the degree of certainty. It is very easy to understand and implement.

Logistic regression (LR) is a discrete choice model that belongs to multivariate analysis. It is one of the most widely used methods of empirical analysis in sociology, biostatistics, clinical medicine, quantitative psychology, econometrics, and marketing, and it is often used as a comparator in machine learning studies [16]. It has many advantages, including high power and accuracy. The equation of logistic regression is:

$$ y = \frac{e^{b_0 + b_1 x}}{1 + e^{b_0 + b_1 x}} $$

Here $y$ is the predicted output, $b_0$ is the bias or intercept term, and $b_1$ is the coefficient for the single input value ($x$). Each column in the input data has an associated $b$ coefficient (a constant real value) that is learned from the training data.

2.3.4. Cross validation

We assessed the performance and generalization error of all classification models using stratified k-fold cross-validation (Fig. 1). This is a widely used and preferred validation technique in machine learning, and it differs from the conventional split-sample approach. This approach: 1) reduces the variance in prediction error; 2) maximizes the use of data for both training and validation, without overfitting or overlap between the test and validation data; and 3) guards against testing hypotheses suggested by arbitrarily split data. The dataset was randomly divided into k equal folds (k = 3, 5, 10) with approximately the same number of events. In each iteration, one fold was used as the validation set and the remaining folds as the training set. Each fold was therefore used exactly once for validation and k − 1 times for training. The validation results from the k (3, 5, 10) experimental models were then combined to provide a measure of the overall performance.

2.4. Statistical analysis

Continuous variables were presented as the mean ± standard deviation or median and were analyzed by unpaired t-test. Categorical variables were presented as absolute (n) and relative (%) frequencies and were analyzed by chi-square test or Fisher's exact test, as appropriate. The performance of the classification models in predicting fatty liver was measured by the receiver operating characteristic curve. We also calculated the accuracy (AC), sensitivity (SN), and specificity (SP) with 95% confidence intervals. R software (Version 3.4.2) and Weka (V.3.9) were used to construct the classification models [17]. Weka contains a collection of visualization tools and a graphical user interface for easily running algorithms.

2.5. Model assessment

The confusion matrix was used to determine the relationship between the actual values and the predicted values [18]. Table 1 shows the structure of the confusion matrix.

Accuracy: Model accuracy is defined as the number of correctly classified instances divided by the total number of instances. The accuracy parameter provides the percentage of correctly classified instances.
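Accuracy, sensitivity, and specificity can all be derived from the four cells of a confusion matrix. The following is a minimal sketch under stated assumptions: the function names and the toy labels are hypothetical, the positive class is coded as 1, and the standard definitions are used (accuracy = (TP + TN) / total, sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)). It is not the authors' R or Weka code.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity as fractions between 0 and 1."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # correctly classified / all instances
        "sensitivity": tp / (tp + fn),   # detected positives / all actual positives
        "specificity": tn / (tn + fp),   # detected negatives / all actual negatives
    }


# Hypothetical toy labels: 1 = fatty liver, 0 = no fatty liver.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(actual, predicted)
metrics = classification_metrics(*counts)
```

Multiplying each fraction by 100 gives the percentage form used for AC, SN, and SP in the results tables.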
Table 2
Demographic characteristics of participants.

Variable                           Fatty liver (n = 377)   Non-fatty liver (n = 200)   p-value
Age                                                                                    0.001
  Mean (SD)                        54.1 (12.6)             49.4 (15.2)
Gender, N (%)                                                                          <0.0001
  Male                             207 (54.9)              66 (33)
Systolic blood pressure (mmHg)     130.2 (18.8)            119.5 (17.1)                0.203
Diastolic blood pressure (mmHg)    80.1 (11.2)             74.7 (11.1)                 0.638
Abdominal Girdle                   85.8 (11.2)             73.5 (7.4)                  0.001
Triglyceride (mg/dL)               146 (83.8)              87.9 (44.8)                 <0.0001
HDL-C (mg/dL)                      50.9 (13.1)             64.7 (15.4)                 0.037
Glucose AC (mg/dL)                 105.4 (28.3)            93.9 (14.4)                 <0.0001
GOT-AST                            29.4 (15.2)             24.3 (11.2)                 0.003
GPT-ALT                            35.7 (24.6)             20.6 (14.1)                 <0.0001
Table 3
Summary of four classification models with 3-, 5-, and 10-fold cross-validation.

3-fold cross-validation
Model   AUROC   AC (%)   SN (95% CI)            SP (95% CI)
RF      0.915   84.29    85.32 (81.24–88.81)    83.41 (79.48–86.86)
LR      0.892   82.75    83.66 (79.43–87.32)    81.97 (77.93–85.55)
ANN     0.903   84.17    85.67 (81.60–89.14)    82.90 (78.95–86.37)
NB      0.856   76.70    83.79 (79.04–87.84)    72.65 (68.48–76.55)

5-fold cross-validation
Model   AUROC   AC (%)   SN (95% CI)            SP (95% CI)
RF      0.922   86.35    86.92 (83.04–90.20)    85.85 (82.10–89.08)
LR      0.888   82.28    83.10 (78.83–86.82)    81.49 (77.42–85.11)
ANN     0.881   80.30    85.44 (81.06–89.14)    76.79 (72.66–80.57)
NB      0.852   76.70    84.27 (79.52–88.29)    72.30 (68.11–76.22)

10-fold cross-validation
Model   AUROC   AC (%)   SN (95% CI)            SP (95% CI)
RF      0.925   86.48    87.16 (83.29–90.41)    85.89 (82.14–89.11)
LR      0.888   82.65    83.43 (79.19–87.11)    81.93 (77.88–85.51)
ANN     0.895   81.85    81.55 (77.24–85.35)    82.13 (78.04–85.75)
NB      0.854   76.96    84.38 (79.66–88.37)    72.60 (88.41–76.51)

Note: AC = Accuracy, SN = Sensitivity, SP = Specificity, AUROC = Area under the receiver operating characteristic curve.
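The stratified k-fold splitting that underlies results such as those in Table 3 can be sketched as follows. This is an illustrative stand-in, not the Weka procedure the authors ran: the function name `stratified_k_fold`, the round-robin assignment, and the fixed seed are assumptions, and only the class counts (377 fatty liver, 200 non-fatty liver) come from the study.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_indices, validation_indices) for k folds with similar class balance."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Shuffle within each class, then deal the indices out to the folds in turn,
    # so every fold receives roughly the same number of events.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for i in range(k):
        validation = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, validation


# Class counts taken from the study: 377 patients with fatty liver, 200 without.
labels = [1] * 377 + [0] * 200
splits = list(stratified_k_fold(labels, k=10))
```

Each patient index appears in exactly one validation fold and in the training set of the other nine, which is what allows the per-fold results to be pooled into a single performance estimate.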
Fig. 3. Receiver operating characteristic curves for prediction of fatty liver. The random forest model showed better performance than the other three classification models.
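The area under a receiver operating characteristic curve equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. A minimal sketch of that rank-based calculation is given below; the function name `auroc` and the toy scores are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities of fatty liver for six patients.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)  # 8 of the 9 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the four models are compared.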
…uals and nonalcoholic steatohepatitis (NASH) from simple steatosis. In the NAFLD discriminant score, 86.4% of the original grouped cases were correctly classified. Yip et al. [32] developed and validated a laboratory parameter-based machine learning model to detect NAFLD in the general population. They randomly divided 922 subjects from a population screening study into training and validation groups, and used 23 routine clinical and laboratory parameters after elastic net regularization. Their model achieved an AUROC of 0.87 (95% CI 0.83–0.90) and 0.88 (0.84–0.91) in the training and validation groups, respectively. The details of the parameters used and the machine learning performance of other studies are provided in Table 4.

Although many kinds of machine learning algorithms have been developed, the Bayesian algorithm being among the most popular, it is hard to choose a proper algorithm for clinical decision making and clinical practice [33]. Therefore, model performance should be considered together with interpretability for appropriate clinical decisions. Among the models included in our study, random forest in particular showed better prediction, so it could effectively identify fatty liver disease (FLD) at initial screening without abdominal ultrasonography. Additionally, this model would provide an easy, fast, low-cost, and non-invasive method to accurately diagnose FLD [34]. The ten predictors used to predict fatty liver disease might be considered robust and concise evidence.
Table 4
Performance comparison between proposed model and others.

Author     Year   Country     Source of data          Fatty/Non-fatty   Validation method   ACC (%)   SEN (%)   SPE (%)   AUC (%)
Ma         2018   China       Hospital (LT, PE, LU)   2522/7986         10-fold             82.92     67.5      87.8      N/A
Islam      2018   Taiwan      Hospital (LT)           593/401           10-fold             70.70     74.1      64.90     76.30
Birjandi   2016   Iran        Hospital (LT)           359/1241          N/A                 80        74        83        78
Jamali     2016   Iran        Hospital (LT)           54/54             N/A                 N/A       91        83        84.4
Yip        2017   Hong Kong   Hospital (LT)           264/658           N/A                 N/A       92        90        87
Proposed   2018   Taiwan      Hospital (LT)           377/200           10-fold             86.48     87.16     85.89     0.925

Note: LT = Laboratory test, PE = Physical examination, LU = Liver ultrasonography, N/A = Not applicable.
The volume of healthcare data is increasing day by day, and machine learning allows massive amounts of data to be analyzed rapidly [35]. This creates an opportunity to apply machine learning models to the care of individual patients in medical practice. Using appropriate machine learning prediction models, physicians could extract the minimum data necessary to make a therapeutic decision [36]. Our model has the potential to detect FLD early, which would help to make treatment more precise and appropriate. It is very important for physicians to know the most predictive variables in order to achieve the best treatment outcome. A patient's baseline characteristics might be the strongest predictors of FLD for evaluation at the individual patient level [37]. Therefore, we carefully adopted a feature selection strategy and used k-fold cross-validation to repeatedly screen potential variables. Data were taken from a medical center EMR without additional clinical assessments, and our high-performance prediction model could be easily integrated into the EMR to identify FLD risk. Our prediction model could help to identify FLD patients, which might significantly affect the treatment pattern. Early prediction using this model might bring benefits through reduced treatment and lower medical costs.

4.1. Limitations

The present study has several limitations that need to be addressed. First, we collected data from only one medical center; a multicenter dataset and external validation could give better and more reliable performance. Additionally, validation of the derived risk score will be required in the future. Second, we evaluated information from only 577 patients, which is a small sample size, although most of the variables were statistically significant. We also used k-fold cross-validation, which is reliable for small datasets and helps to reduce significant errors. In this method, the dataset is randomly divided into ten groups, and all groups are used for both training and validation [38]. It gives nearly unbiased estimates of the prediction error even if the data size is small [39]. Third, only nine variables were used to predict fatty liver disease, but the model could still assist physicians in making precise clinical decisions. Fourth, we could not classify patients into fatty and non-fatty liver disease patients due to data insufficiency. Patients' BMI information was also not included in our study because our electronic medical record database does not contain this information. Finally, we used a classification approach for automatic integration of ML variables, but a deep learning approach could have been used to improve prediction.

5. Conclusion

The findings of this study show that machine learning classification models, especially the random forest model, accurately predict fatty liver disease using a minimum of clinical variables. This method may lead to greater insights in real-world clinical practice and would assist physicians in effectively identifying FLD for diagnostic, preventive and therapeutic purposes, to mitigate the global burden of FLD. Future studies are needed to validate our model for predicting FLD in various types of datasets.

Compliance with ethical standards

Conflict of interest

None.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.

Informed consent

None.

Funding

This work was financially supported by the Higher Education Sprout Project of the Ministry of Education (MOE) in Taiwan (TMU DP2-107-21121-01-A06).

Acknowledgment

We would like to thank our colleague, a native English speaker, for editing our manuscript.

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.cmpb.2018.12.032.

References

[1] M. Lazo, J.M. Clark, The epidemiology of nonalcoholic fatty liver disease: a global perspective, Seminars in Liver Disease 28 (2008) 339–350.
[2] M.H. Le, P. Devaki, N.B. Ha, D.W. Jun, H.S. Te, R.C. Cheung, M.H. Nguyen, Prevalence of non-alcoholic fatty liver disease and risk factors for advanced fibrosis and mortality in the United States, PLoS One 12 (2017) e0173499.
[3] Q.M. Anstee, G. Targher, C.P. Day, Progression of NAFLD to diabetes mellitus, cardiovascular disease or cirrhosis, Nat. Rev. Gastroenterol. Hepatol. 10 (2013) 330–344.
[4] M. Motwani, D. Dey, D.S. Berman, G. Germano, S. Achenbach, M.H. Al-Mallah, D. Andreini, M.J. Budoff, F. Cademartiri, T.Q. Callister, Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis, Eur. Heart J. 38 (2016) 500–507.
[5] A. Sani, Machine Learning for Decision Making, Université de Lille 1, 2015.
[6] W. Raghupathi, V. Raghupathi, Big data analytics in healthcare: promise and potential, Health Inf. Sci. Syst. 2 (2014) 3.
[7] P. Groves, B. Kayyali, D. Knott, S.V. Kuiken, The 'Big Data' Revolution in Healthcare: Accelerating Value and Innovation, 2016.
[8] A. Andrade, J.S. Silva, J. Santos, P. Belo-Soares, Classifier approaches for liver steatosis using ultrasound images, Procedia Technol. 5 (2012) 763–770.
[9] R. Ribeiro, J. Sanches, Fatty liver characterization and classification by ultrasound, in: Iberian Conference on Pattern Recognition and Image Analysis, Springer, 2009, pp. 354–361.
[10] M. Owjimehr, H. Danyali, M.S. Helfroush, A. Shakibafard, Staging of fatty liver diseases based on hierarchical classification and feature fusion for back-scan-converted ultrasound images, Ultrason. Imaging 39 (2017) 79–95.
[11] G. Li, Y. Luo, W. Deng, X. Xu, A. Liu, E. Song, Computer aided diagnosis of fatty liver ultrasonic images based on support vector machine: engineering in medicine and biology society, in: 2008 EMBS 2008 30th Annual International Conference of the IEEE, IEEE, 2008, pp. 4768–4771.
[12] L. Breiman, Random forests, Mach. Learn. 45 (2001) 5–32.
[13] M.C. Papadopoulos, P.M. Abel, D. Agranoff, A. Stich, E. Tarelli, B.A. Bell, T. Planche, A. Loosemore, S. Saadoun, P. Wilkins, A novel and accurate diagnostic test for human African trypanosomiasis, Lancet 363 (2004) 1358–1363.
[14] F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain, Psychol. Rev. 65 (1958) 386.
[15] I. Rish, An empirical study of the naive Bayes classifier: IJCAI 2001 workshop on empirical methods in artificial intelligence, IBM 3 (2001) 41–46.
[16] S. Dreiseitl, L. Ohno-Machado, Logistic regression and artificial neural network classification models: a methodology review, J. Biomed. Inform. 35 (2002) 352–359.
[17] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I.H. Witten, The WEKA data mining software: an update, ACM SIGKDD Explor. Newslett. 11 (2009) 10–18.
[18] R. Kohavi, F. Provost, Glossary of terms, Mach. Learn. 30 (1998) 271–274.
[19] A.K. Loomis, S. Kabadi, D. Preiss, C. Hyde, V. Bonato, M. St. Louis, J. Desai, J.M. Gill, P. Welsh, D. Waterworth, Body mass index and risk of nonalcoholic fatty liver disease: two electronic health record prospective studies, J. Clin. Endocrinol. Metab. 101 (2016) 945–952.
[20] Q. Pang, J.-Y. Zhang, S.-D. Song, K. Qu, X.-S. Xu, S.-S. Liu, C. Liu, Central obesity and nonalcoholic fatty liver disease risk after adjusting for body mass index, World J. Gastroenterol. 21 (2015) 1650.
[21] Y.-C. Lin, S.-C. Chou, P.-T. Huang, H.-Y. Chiou, Risk factors and predictors of non-alcoholic fatty liver disease in Taiwan, Ann. Hepatol. 10 (2011) 125–132.
[22] G. Marchesini, S. Avagnina, E. Barantani, A. Ciccarone, F. Corica, E. Dall'Aglio, R. Dalle Grave, P. Morpurgo, F. Tomasi, E. Vitacolonna, Aminotransferase and gamma-glutamyltranspeptidase levels in obesity are associated with insulin resistance and the metabolic syndrome, J. Endocrinol. Invest. 28 (2005) 333–339.
[23] R.K. Schindhelm, M. Diamant, J.M. Dekker, M.E. Tushuizen, T. Teerlink, R.J. Heine, Alanine aminotransferase as a marker of non-alcoholic fatty liver disease in relation to type 2 diabetes mellitus and cardiovascular disease, Diabetes Metab. Res. Rev. 22 (2006) 437–443.
[24] M.G. Sanal, Biomarkers in nonalcoholic fatty liver disease-the emperor has no clothes? World J. Gastroenterol. 21 (2015) 3223.
[25] L. Castera, V. Vilgrain, P. Angulo, Noninvasive evaluation of NAFLD, Nat. Rev. Gastroenterol. Hepatol. 10 (2013) 666–675.
[26] Z.-w. Chen, L.-y. Chen, H.-l. Dai, J.-h. Chen, L.-z. Fang, Relationship between alanine aminotransferase levels and metabolic syndrome in nonalcoholic fatty liver disease, J. Zhejiang Univ.-Sci. B 9 (2008) 616–622.
[27] J.M. Clark, A.M. Diehl, Defining nonalcoholic fatty liver disease: implications for epidemiologic studies, Gastroenterology 124 (2003) 248–250.
[28] H. Ma, C.-f. Xu, Z. Shen, C.-h. Yu, Y.-m. Li, Application of machine learning techniques for clinical predictive modeling: a cross-sectional study on nonalcoholic fatty liver disease in China, BioMed Res. Int. 2018 (2018).
[29] M.M. Islam, C.C. Wu, T.N. Poly, H.C. Yang, Y.C. Li, Applications of machine learning in fatty live disease prediction, in: 40th Medical Informatics in Europe Conference, MIE 2018, IOS Press, 2018, pp. 166–170.
[30] M. Birjandi, S.M.T. Ayatollahi, S. Pourahmad, A.R. Safarpour, Prediction and diagnosis of non-alcoholic fatty liver disease (NAFLD) and identification of its associated factors using the classification tree method, Iran. Red Crescent Med. J. 18 (2016).
[31] R. Jamali, A. Arj, M. Razavizade, M.H. Aarabi, Prediction of nonalcoholic fatty liver disease via a novel panel of serum adipokines, Medicine 95 (2016).
[32] T.F. Yip, A. Ma, V.S. Wong, Y.K. Tse, H.Y. Chan, P.C. Yuen, G.H. Wong, Laboratory parameter-based machine learning model for excluding non-alcoholic fatty liver disease (NAFLD) in the general population, Aliment. Pharmacol. Therapeutics 46 (2017) 447–456.
[33] J. Wu, J. Roy, W.F. Stewart, Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches, Med. Care 48 (2010) S106–S113.
[34] J. Kang, T. Lee, I. Yap, K. Lun, Analysis of cost-effectiveness of different strategies for hepatocellular carcinoma screening in hepatitis B virus carriers, J. Gastroenterol. Hepatol. 7 (1992) 463–468.
[35] T. Condie, P. Mineiro, N. Polyzotis, M. Weimer, Machine learning on big data: data engineering (ICDE), in: 2013 IEEE 29th International Conference on, IEEE, 2013, pp. 1242–1244.
[36] T.B. Murdoch, A.S. Detsky, The inevitable application of big data to health care, JAMA 309 (2013) 1351–1352.
[37] G.K. Savova, P.V. Ogren, P.H. Duffy, J.D. Buntrock, C.G. Chute, Mayo clinic NLP system for patient smoking status identification, J. Am. Med. Inform. Assoc. 15 (2008) 25–28.
[38] G. McLachlan, K.-A. Do, C. Ambroise, Analyzing Microarray Gene Expression Data, John Wiley & Sons, 2005.
[39] B. Efron, Estimating the error rate of a prediction rule: improvement on cross-validation, J. Am. Statist. Assoc. 78 (1983) 316–331.