
Prediction of Fatty Liver Disease using Machine Learning Algorithms

Article  in  Computer methods and programs in biomedicine · March 2019


DOI: 10.1016/j.cmpb.2018.12.032



Computer Methods and Programs in Biomedicine 170 (2019) 23–29


Prediction of fatty liver disease using machine learning algorithms


Chieh-Chen Wu a,e, Wen-Chun Yeh b, Wen-Ding Hsu c, Md. Mohaimenul Islam a,e,
Phung Anh (Alex) Nguyen e, Tahmina Nasrin Poly a,e, Yao-Chin Wang a,e,d, Hsuan-Chia Yang e,
Yu-Chuan (Jack) Li a,e,f,∗
a Graduate Institute of Biomedical Informatics, College of Medicine Science and Technology, Taipei Medical University, Taipei, Taiwan
b Division of Hepatogastroenterology, Department of Internal Medicine, New Taipei City Hospital, Taiwan
c Division of Nephrology, Department of Internal Medicine, New Taipei City Hospital, Taiwan
d Department of Emergency, Min-Sheng General Hospital, Taoyuan, Taiwan
e International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan
f Department of Dermatology, Wan Fang Hospital, Taipei, Taiwan

Article info

Article history:
Received 31 October 2018
Revised 21 December 2018
Accepted 28 December 2018

Keywords:
Fatty liver disease
Machine learning
Classification model
Random forest

Abstract

Background and objective: Fatty liver disease (FLD) is a common clinical complication; it is associated with high morbidity and mortality. Early prediction of FLD provides an opportunity to devise an appropriate strategy for prevention, early diagnosis and treatment. We aimed to develop a machine learning model to predict FLD that could assist physicians in classifying high-risk patients, making a novel diagnosis, and preventing and managing FLD.

Methods: We included all patients who had an initial fatty liver screening at the New Taipei City Hospital between 1st and 31st December 2009. Four classification models, random forest (RF), Naïve Bayes (NB), artificial neural networks (ANN), and logistic regression (LR), were developed to predict FLD. The area under the receiver operating characteristic curve (AUROC) was used to evaluate performance among the four models.

Results: A total of 577 patients were included in this study, of whom 377 had fatty liver. The AUROC of RF, NB, ANN, and LR with 10-fold cross-validation was 0.925, 0.888, 0.895, and 0.854, respectively. Additionally, the accuracy of RF, NB, ANN, and LR was 87.48, 82.65, 81.85, and 76.96%, respectively.

Conclusion: In this study, we developed and compared four classification models to predict fatty liver disease accurately. The random forest model showed higher performance than the other classification models. Implementation of a random forest model in the clinical setting could help physicians to stratify fatty liver patients for primary prevention, surveillance, early treatment, and management.

© 2018 Elsevier B.V. All rights reserved.

∗ Corresponding author at: College of Medicine Science and Technology (CoMST), Taipei Medical University, Department of Dermatology, Wan Fang Hospital, 250-Wuxing Street, Xinyi District, Taipei 11031, Taiwan.
E-mail address: jack@tmu.edu.tw (Y.-C. (Jack) Li).

https://doi.org/10.1016/j.cmpb.2018.12.032
0169-2607/© 2018 Elsevier B.V. All rights reserved.

1. Introduction

Fatty liver disease (FLD) is a common clinical problem; it is also associated with high morbidity and mortality. FLD eventually leads to noncholestatic cirrhosis and hepatocellular carcinoma [1]. Additionally, FLD has been increasing in parallel with the prevalence of diabetes, metabolic syndrome and obesity [2]. The higher prevalence of FLD has emerged as a greater economic burden. Therefore, accurate identification of individuals at risk and early recognition of FLD could offer immense benefits for diagnosis, prevention, and proper treatment. Over the past decade, biopsy has been used to stratify patients and has been considered the diagnostic reference standard for the assessment of fatty infiltration of the liver. However, this method is highly invasive and costly; it may also cause side effects and is subject to sampling errors. Although ultrasonography is used as a functional tool for FLD diagnosis with higher accuracy, that accuracy is highly operator dependent [3].

Machine learning (ML) is a field of computer science that uses computer algorithms to identify patterns in large data sets and to predict various outcomes based on data [4]. ML techniques have emerged as a potential tool for prediction and decision-making in a multitude of disciplines [5]. Owing to the availability of clinical data, ML has been playing a critical role in medical decision making as well [6,7]. A machine learning model would serve as a valuable aid in identifying disease and making effective real-time clinical decisions. It would also allow for optimization of hospital resources by identifying patients with several significant risk factors earlier.

Many studies have already investigated medical imaging techniques
such as ultrasound (US), computed tomography (CT), and
magnetic resonance imaging (MRI) for fatty liver
disease classification. Ultrasound imaging is noninvasive, inexpen-
sive, easy to operate, and portable. Andrade et al. [8] evaluated the
performance of three classifiers for diagnosis of liver steatosis, us-
ing several extracted features from ultrasound images. Ribeiro and
Sanches [9] utilized the anatomic and echogenic information of
normal liver and fatty liver ultrasonic images, and used Bayesian
framework on the extracted feature parameters for fatty liver di-
agnosis. Owjimehr et al. [10] demonstrated an automatic ROI se-
lection and hierarchical classification method to discriminate nor-
mal and three stages of fatty liver, steatosis, fibrosis, and cirrhosis.
Their algorithms discriminated the normal patients from fatty liver
patients in the first step by the use of wavelet packet transform
(WPT) features, and classified steatosis and the other stages of the
fatty liver in the second step by a fusion of WPT and gray-level dif-
ference statistical (GLCM) features. Moreover, Li et al. [11] analyzed
B-mode ultrasonic images texture features of fatty liver, composed
near-field light-spot density, near-far-field grayscale ratio, grayscale
co-occurrence matrix, and neighborhood gray-tone difference ma-
trix, and used support vector machine (SVM) as the classification
algorithm.
However, the diagnosis of fatty liver from ultrasonic images varies owing to the use of different ultrasound equipment, poor image quality, and physical differences among patients. A prediction model based on available clinical variables would help clinicians to correctly identify patients and make actionable decisions on prevention, early diagnosis and targeted intervention. To date, the benefits of utilizing classification models with data from the electronic medical record to predict FLD have not been evaluated on a large scale. We, therefore, aimed to construct a prediction model for fatty liver disease using modern machine learning techniques, specifically a classification approach. To our knowledge, this is the most comprehensive study to use machine learning models to predict FLD.

2. Methods

2.1. Study population

We collected data from New Taipei City Municipal Hospital Banqiao Branch under a liver protection project. We included all patients who had received initial fatty liver screening in December 2009. We excluded patients if they a) were ≤ 30 years of age, b) had an incomplete examination process, or c) were a suspected case of fatty liver on ultrasonography. This study was reviewed and approved by the institutional ethical committee board of Taipei Medical University and Taipei Medical University Hospital, and was conducted in accord with the ethical guidelines of the Declaration of Helsinki of the World Medical Association.

2.2. Clinical data and outcomes

We collected all patients' demographic and clinical data from the electronic medical records at the time of screening. All fatty and non-fatty liver patients were identified by abdominal ultrasonography. Ten predictor variables were collected from both FLD and non-FLD patients and used in our proposed models. Those variables were age, gender, systolic blood pressure (SBP), diastolic blood pressure (DBP), abdominal girth (AG), glucose AC, triglyceride, HDL-C, SGOT-AST, and SGPT-ALT. The classification models were used to identify patients at risk of FLD, which would facilitate personalized medicine for fatty liver patients in the future.

2.3. Machine learning

The primary objective of this study was to select prognostic factors for predicting fatty liver disease using classification machine learning models. We divided our machine learning approach into four steps:

1. Data preprocessing: data cleaning, handling of missing data, data transformation, and reduction of data imbalance.
2. Variable selection: selecting the best subset of relevant variables for use in model building (this helps to reduce overfitting, improve accuracy, and reduce training time).
3. Model building: selecting a suitable classifier for better prediction.
4. Cross-validation: splitting the entire dataset into two separate groups (k − 1 folds : 1 fold) for training and testing.

Fig. 1 illustrates the ML process.

Fig. 1. Machine learning pathway. It involves automated feature selection by information gain ranking and building of the classification model with k-fold cross-validation.

2.3.1. Data preprocessing

A total of 577 patients were included in this study, of whom 377 were diagnosed with fatty liver disease. Because data preprocessing is an important step in machine learning, we removed all variables with more than 50% missing values. In addition, data imputation and normalization were performed to obtain a high-quality dataset. We also used the Synthetic Minority Over-Sampling Technique (SMOTE) to generate synthetic samples for the minority class and balance the positive and negative classes of the training set.

2.3.2. Variable selection

In this process, we assessed the weight of each variable by information gain ranking. This helped to evaluate the usefulness of the included variables in the training dataset. We included in the final model building only those variables whose score was > 0 in the information gain ranking (Fig. 2). We used forward selection for variable reduction in the current study.

Fig. 2. Feature selection. Information gain ranking was used to evaluate the worth of each variable by measuring the entropy gain with respect to the outcome, and then rank attributes by their individual evaluations (left to right).

2.3.3. Model building

The predictive classifier models were developed to accurately identify FLD patients. Four classification models, random forest (RF), artificial neural network (ANN), Naïve Bayes (NB), and logistic regression (LR), were used to develop prediction models. We considered these four models because of the following characteristics.

Random forest (RF) is an ensemble classification algorithm composed of a multitude of decision trees, developed by Leo Breiman and Adele Cutler in 1999 [12]. Each tree is built independently by applying the general technique of bootstrap aggregating (i.e. bagging), using a randomly selected sample of the training set. The final result is determined by a simple majority vote of all trees. RF has proven to be a highly accurate algorithm in various fields including medical diagnosis.

Artificial neural networks (ANN) are computational models that emulate biological neural networks. They are a very powerful non-linear modelling approach that has already been proven to give accurate predictions in many CDS [13]. The model consists of a number of artificial neural units called "perceptrons" [14]. An ANN is similar to a biological neural cell, in which the signal is transmitted into the neuron through dendrites. It simulates signal transmission through an input layer, several hidden layers, and finally an output layer. Each layer comprises many perceptrons, and the perceptrons in adjacent layers are connected by weights that are adjusted while training the algorithm. Through this iterative process, the network learns from the samples in the training dataset until each input is matched to the correct output, in order to achieve the best prediction.

Naïve Bayes is a generative model that makes dealing with missing values much easier [15]. It is a classification model which predicts a class label $y$ given a feature vector $x = [x_1, x_2, x_3, \dots, x_d]^T$ and helps to make an inference on a new sample $x_{new} = [x_1, x_2, x_3, \dots, x_d]^T$ with a missing feature $x_m$. It is a simple yet very powerful model that returns not only the prediction but also the degree of certainty. It is very easy to understand and implement.

Logistic regression (LR) is one of the discrete choice models and belongs to multivariate analysis. It is the most widely used method of empirical analysis in sociology, biostatistics, clinical medicine, quantitative psychology, econometrics, and marketing, and is often used as a comparator in machine learning studies [16]. It has many advantages, including high power and accuracy. The equation of logistic regression is:

\[ y = \frac{e^{b_0 + b_1 x}}{1 + e^{b_0 + b_1 x}} \]

Here $y$ is the predicted output, $b_0$ is the bias or intercept term and $b_1$ is the coefficient for the single input value ($x$). Each column of the input data has an associated $b$ coefficient (a constant real value) that is learned from the training data.

2.3.4. Cross validation

We assessed the performance and generalization error of all classification models by using stratified k-fold cross-validation (Fig. 1). This is a widely used and preferred validation technique in machine learning, and it differs from the conventional split-sample approach. This approach: 1) reduces the variance in prediction error; 2) maximizes the use of data for both training and validation, without overfitting or overlap between the test and validation data; and 3) guards against testing hypotheses suggested by arbitrarily split data. The dataset was randomly divided into k equal folds (k = 3, 5, 10) with approximately the same number of events. In this process, one fold was used as the validation set, and the remaining folds as the training set. Each fold was therefore used once for validation and k − 1 times for training. The validation results from the k (3, 5, 10) experimental models were then combined to provide a measure of the overall performance.

2.4. Statistical analysis

Continuous variables were presented as the mean ± standard deviation or median and were analyzed by unpaired t-test. Categorical variables were presented as absolute (n) and relative (%) frequencies and were analyzed by chi-square test or Fisher's exact test, as appropriate. The performance of classification models in predicting fatty liver was measured by the receiver operating characteristic curve. We also calculated the accuracy (AC), sensitivity (SN), and specificity (SP) with 95% confidence intervals. R software (Version 3.4.2) and Weka (V.3.9) were used to construct the classification models [17]. Weka contains a collection of visualization tools and a graphical user interface for easily running algorithms.

2.5. Model assessment

The confusion matrix was used to determine the relationship between the actual values and predicted values [18]. Table 1 shows the structure of the confusion matrix.

Accuracy: Model accuracy is defined as the number of correctly classified instances divided by the total number of instances. The accuracy parameter gives the percentage of correctly classified instances.
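As a minimal sketch of the three assessment measures used in this section (accuracy, sensitivity and specificity), the following Python fragment tallies a confusion matrix from actual and predicted labels and applies the standard definitions. The class and function names and the toy labels are hypothetical and are not taken from the study, whose models were built in R and Weka.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a 2x2 confusion matrix (hypothetical container for this sketch)."""
    tp: int  # FLD patients predicted as FLD
    fp: int  # non-FLD patients predicted as FLD
    tn: int  # non-FLD patients predicted as non-FLD
    fn: int  # FLD patients predicted as non-FLD


def count_outcomes(actual, predicted):
    """Tally the four cells from parallel lists of 0/1 labels (1 = FLD)."""
    counts = ConfusionCounts(tp=0, fp=0, tn=0, fn=0)
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts.tp += 1
        elif a == 0 and p == 1:
            counts.fp += 1
        elif a == 0 and p == 0:
            counts.tn += 1
        else:  # a == 1 and p == 0, assuming labels are strictly 0 or 1
            counts.fn += 1
    return counts


def accuracy(c):
    """Correctly classified instances divided by all instances."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c):
    """Share of patients with the disease that the model flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Share of patients without the disease that the model clears."""
    return c.tn / (c.tn + c.fp)


# Toy labels, invented for illustration only.
actual = [1, 1, 1, 1, 0, 0, 0, 1, 0, 1]
predicted = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1]
counts = count_outcomes(actual, predicted)
print(accuracy(counts), sensitivity(counts), specificity(counts))
```

Keeping the four counts in one object makes it harder to swap FP and FN by accident, which is the most common slip when these measures are computed by hand.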
Table 1
Confusion matrix representation.

                          Actual positive   Actual negative
Predicted positive (+)    TP                FP
Predicted negative (−)    FN                TN

The accuracy of the model is defined as

\[ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \]

Sensitivity: Sensitivity measures how well the model correctly classifies persons with the disease and is defined as

\[ \text{Sensitivity} = \frac{TP}{TP + FN} \]

Specificity: Specificity measures how well the model correctly classifies persons without the disease and is defined as

\[ \text{Specificity} = \frac{TN}{TN + FP} \]

Sensitivity and specificity are also known as quality parameters and are used to define the quality of the predicted class. Three parameters are generally used to determine the goodness of a medical diagnosis model: accuracy, sensitivity and specificity.

3. Results

3.1. Patient's characteristics

We identified 700 participants who had received the initial fatty liver screening in New Taipei City Municipal Hospital Banqiao Branch from 1st December to 31st December 2009. A total of 22 patients who were aged ≤ 30 years were excluded. In addition, 123 patients were excluded due to incomplete examination and suspicion of fatty liver on ultrasonography. In total, 577 patients who met all inclusion criteria were used for model development (Supplementary Fig. 1).

Demographic and clinical characteristics of the 577 patients are summarized in Table 2. The mean age of patients with FLD was 54.1 ± 12.6 years, and that of the non-FLD group was 49.4 ± 15.2 years. There were 207 (54.9%) and 66 (33%) males in the FLD and non-FLD groups, respectively (p < 0.0001). The mean values of the other variables in the FLD group were significantly higher than those of the non-FLD group except HDL-C, SBP, and DBP.

Table 2
Demographic characteristics of participants.

Variables                         Fatty liver (n = 377)   Non-fatty liver (n = 200)   p-value
Age, mean (SD)                    54.1 (12.6)             49.4 (15.2)                 0.001
Gender, male, N (%)               207 (54.9)              66 (33)                     < 0.0001
Systolic blood pressure (mmHg)    130.2 (18.8)            119.5 (17.1)                0.203
Diastolic blood pressure (mmHg)   80.1 (11.2)             74.7 (11.1)                 0.638
Abdominal girth                   85.8 (11.2)             73.5 (7.4)                  0.001
Triglyceride (mg/dL)              146 (83.8)              87.9 (44.8)                 < 0.0001
HDL-C (mg/dL)                     50.9 (13.1)             64.7 (15.4)                 0.037
Glucose AC (mg/dL)                105.4 (28.3)            93.9 (14.4)                 < 0.0001
GOT-AST                           29.4 (15.2)             24.3 (11.2)                 0.003
GPT-ALT                           35.7 (24.6)             20.6 (14.1)                 < 0.0001

3.2. Model performance

Table 3 shows the performance of the classification models. The AUROC of RF with 3-, 5- and 10-fold cross-validation was 0.915, 0.922, and 0.925, respectively. In addition, the accuracy of RF was 84.29, 86.35, and 86.48%, respectively. ROC curves were plotted to compare the different classification models. Fig. 3 summarizes the ROC curves of the four models.

Table 3
Summary of four classification models with 3-, 5- and 10-fold cross-validation.

Model   AUROC   AC (%)   SN (95% CI)           SP (95% CI)

3-fold cross-validation
RF      0.915   84.29    85.32 (81.24–88.81)   83.41 (79.48–86.86)
LR      0.892   82.75    83.66 (79.43–87.32)   81.97 (77.93–85.55)
ANN     0.903   84.17    85.67 (81.60–89.14)   82.90 (78.95–86.37)
NB      0.856   76.70    83.79 (79.04–87.84)   72.65 (68.48–76.55)

5-fold cross-validation
RF      0.922   86.35    86.92 (83.04–90.20)   85.85 (82.10–89.08)
LR      0.888   82.28    83.10 (78.83–86.82)   81.49 (77.42–85.11)
ANN     0.881   80.30    85.44 (81.06–89.14)   76.79 (72.66–80.57)
NB      0.852   76.70    84.27 (79.52–88.29)   72.30 (68.11–76.22)

10-fold cross-validation
RF      0.925   86.48    87.16 (83.29–90.41)   85.89 (82.14–89.11)
LR      0.888   82.65    83.43 (79.19–87.11)   81.93 (77.88–85.51)
ANN     0.895   81.85    81.55 (77.24–85.35)   82.13 (78.04–85.75)
NB      0.854   76.96    84.38 (79.66–88.37)   72.60 (88.41–76.51)

Note: AC = Accuracy, SN = Sensitivity, SP = Specificity, AUROC = Area under the receiver operating characteristic curve.

Fig. 3. Receiver operating characteristic curves for prediction of fatty liver. The random forest model showed better performance than the other three classification models.

4. Discussion

The results of this study suggest that machine learning models are well suited for meaningful prediction of FLD. The random forest model showed better performance than the other classification models. Fatty liver disease is a common complication of critical illness associated with higher mortality and morbidity. In recent years, traditional diagnostic and treatment plans have contributed to an improved understanding of FLD, but they sometimes cause adverse effects and waste resources. Machine learning models, in contrast, can provide significant insight compared with traditional statistical models. We, therefore, developed and evaluated classification machine learning models to predict FLD. The random forest model with 10-fold cross-validation showed the highest performance, with a C statistic of 0.925. To our knowledge, this is the first study to attempt to predict FLD using various classification machine learning models. Although the implementation and evaluation of machine learning models have increased rapidly in recent years, a promising model has not yet been applied to predict FLD in routine clinical care. Hence, the performance of the different models is the most important consideration, along with their ease of use and interpretability. Our findings suggest that the random forest model would be the best option for implementing a system to predict fatty liver disease appropriately and effectively.

Applying machine learning models to clinical variables from the electronic medical record is an efficient approach for discovering existing relationships among variables that are ordinarily difficult to detect. The random forest model has shown that it can be exploited to extract implicit, useful, nontrivial associations even from factors that are not direct or explicit indicators of the class. However, early-stage prediction of the risk of developing fatty liver disease is not enough, and clinicians may also want to know the main predictors that are responsible for developing fatty liver disease. In this study, we also ranked all predictors using information gain ranking; abdominal girth was the strongest factor, followed by GPT_ALT, triglyceride, HDL_C, and Glucose_AC. BMI is also a potential risk factor, as supported by several studies [19,20]. Lin et al. revealed a 1.29-fold higher risk of fatty liver disease among patients with higher BMI [21]. Several epidemiological studies reported that GPT is closely associated with accumulation of fat in the liver [22,23]. Age, sex, TG, ALT, GOT, GPT, AST/ALT ratio, total bilirubin, and fasting blood glucose were found to be associated with FLD and have been used in various diagnostic panels [24,25]. Additionally, a significant number of studies described FLD patients as asymptomatic and pointed out specific causes of fatty liver disease. They mentioned that elevated circulating concentrations of biomarkers such as serum glutamic oxaloacetic transaminase (SGOT) and serum glutamic pyruvic transaminase (SGPT) are mainly responsible for hepatic damage [26,27].

Recently, several studies have reported classification results for correctly identifying fatty liver patients and non-fatty liver patients. Ma et al. [28] developed machine learning techniques to evaluate the optimal predictive clinical model of NAFLD. Among the 10,508 enrolled subjects, 2522 (24%) met the diagnostic criteria of NAFLD. A 10-fold cross-validation was used in the classification, and the Bayesian network model achieved the best performance (accuracy: 82.92%, sensitivity: 0.67, specificity: 0.878, precision: 0.636, and F-measure: 0.655) among the 11 different techniques. Moreover, Islam et al. [29] constructed four classification models [Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Network (ANN), and Logistic Regression (LR)] to predict fatty liver disease; the logistic regression technique provided a better result (accuracy 76.30%, sensitivity 74.10%, and specificity 64.90%) than all the other machine learning algorithms. Another classification model was developed by Birjandi et al. [30], identifying the most important factors influencing NAFLD using a classification tree (CT) to predict the probability of NAFLD. The main potential variables for predicting NAFLD based on the CT were BMI, WHR, triglycerides, glucose, SBP, and alanine aminotransferase, and the model achieved a prediction accuracy of 80% with an area under the receiver operating characteristic (ROC) curve of 78%. Furthermore, Jamali et al. [31] developed a model based on serum adipokines for discriminating NAFLD from healthy individuals and nonalcoholic steatohepatitis (NASH) from simple steatosis. In the NAFLD discriminant score, 86.4% of original grouped cases were correctly classified. Yip et al. [32] developed and validated a laboratory parameter-based machine learning model to detect NAFLD in the general population. They randomly divided 922 subjects from a population screening study into training and validation groups, and used 23 routine clinical and laboratory parameters after elastic net regularization. Their model achieved an AUROC of 0.87 (95% CI 0.83–0.90) and 0.88 (0.84–0.91) in the training and validation groups, respectively. The details of the parameters used and the machine learning performance of other studies are provided in Table 4.

Although many kinds of machine learning algorithms have been developed, the Bayesian algorithm being among the most popular, it is hard to choose a proper algorithm for clinical decision making and clinical practice [33]. Therefore, model performance along with interpretability should be considered for appropriate clinical decisions. Among the models included in our study, random forest in particular showed better prediction, so it could effectively identify fatty liver disease (FLD) at initial screening without the use of abdominal ultrasonography. Additionally, this model would provide an easy, fast, low-cost, and non-invasive method to accurately diagnose FLD [34]. The ten predictors used to predict fatty liver disease might be considered a robust and concise set of evidence.
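The predictors discussed here were ranked by information gain, keeping only variables whose score was > 0 (Section 2.3.2, Fig. 2). A minimal Python sketch of that ranking follows, under stated assumptions: each variable has already been discretized into categories (the article does not say how continuous variables were binned), and the function names, variable names and toy data are hypothetical rather than the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data, invented for illustration only (1 = FLD, 0 = non-FLD).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "girth_high": [1, 1, 1, 1, 0, 0, 0, 0],  # separates the two classes perfectly
    "coin_flip": [1, 0, 1, 0, 1, 0, 1, 0],   # carries no information about the class
}

scores = {name: information_gain(values, labels) for name, values in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
selected = [name for name in ranking if scores[name] > 0]
print(ranking, selected)
```

In this toy case the perfectly separating variable scores 1 bit and the uninformative one scores 0, so only the first passes the > 0 filter.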

Table 4
Performance comparison between proposed model and others.

Author     Year   Country     Source of data          Fatty/Non-fatty   Validation method   ACC (%)   SEN (%)   SPE (%)   AUC (%)
Ma         2018   China       Hospital (LT, PE, LU)   2522/7986         10-fold             82.92     67.5      87.8      N/A
Islam      2018   Taiwan      Hospital (LT)           593/401           10-fold             70.70     74.1      64.90     76.30
Birjandi   2016   Iran        Hospital (LT)           359/1241          N/A                 80        74        83        78
Jamali     2016   Iran        Hospital (LT)           54/54             N/A                 N/A       91        83        84.4
Yip        2017   Hong Kong   Hospital (LT)           264/658           N/A                 N/A       92        90        87
Proposed   2018   Taiwan      Hospital (LT)           377/200           10-fold             86.48     87.16     85.89     92.5

Note: LT = Laboratory test, PE = Physical examination, LU = Liver ultrasonography, N/A = Not applicable.
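The AUC column above, like the C statistic quoted for the random forest model, can be read as the probability that a randomly chosen FLD patient receives a higher predicted score than a randomly chosen non-FLD patient. The sketch below computes it that way by direct pairwise comparison. It is an illustrative assumption, not the routine used by R or Weka in the study, and the function name and toy scores are hypothetical.

```python
def auroc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy predicted probabilities of FLD for six patients, invented for illustration.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = auroc(scores, labels)
auc_percent = 100 * auc  # the same quantity on the percentage scale used in Table 4
print(round(auc, 3), round(auc_percent, 1))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formula gives the same value faster on larger data.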

The healthcare data has been increasing day by day and ma- the global burden of FLD. Future studies are needed to validate our
chine learning allows massive amounts of data to be analyzed model to predict FLD in various types of dataset.
rapidly [35]. Therefore, it is an opportunity to apply machine learn-
ing models to the care of individual patients in medical practice. Compliance with ethical standards
Using appropriate machine learning prediction models, physicians
could be able to extract the minimum data necessary to make Conflict of interest
a therapeutic decision [36]. Our model has the potential to early
FLD detection that would assist to improve precise and appropri- None.
ate treatment pattern. It is very important for physicians to know
about the most predictive variables for the best treatment out- Ethical approval
come. Patient’s baseline characteristics might be the strongest pre-
dictors of FDL for evaluation of the individual patient level [37]. All procedures performed in studies involving human partici-
Therefore, we carefully adopted a feature selection strategy and pants were in accordance with the ethical standards of the institu-
used k-fold cross-validation to repeatedly screen potential vari- tional and/or national research committee and with the 1964 Dec-
ables. Data were included from a medical center EMR without ad- laration of Helsinki and its later amendments or comparable ethi-
ditional clinical assessments, and our high-performance prediction cal standards.
model could be easily integrated into EMR to identify FLD risk. Our
prediction model could help to identify FLD patients that might Informed consent
significantly impact treatment patterns. Early prediction using this model might bring benefits in the form of reduced treatment and lower medical costs.

4.1. Limitations

The present study has several limitations that need to be addressed. First, we collected data from only one medical center; a multicenter dataset and external validation could yield better performance and more reliable results. Additionally, validation of the derived risk score will be required in the future. Second, we evaluated data from only 577 patients, which is a relatively small sample, although most of the variables were statistically significant. We also used k-fold cross-validation, which is reliable for small data sets and helps to reduce significant errors. In this method, the data set is randomly divided into ten groups, and all groups are used for both training and validation [38]. It gives nearly unbiased estimates of the prediction error even if the data size is small [39]. Third, only nine variables were used to predict fatty liver disease, but the model could still assist physicians in making precise clinical decisions. Fourth, we could not classify patients into fatty and non-fatty liver disease patients due to data insufficiency. Patients' BMI information was also not included in our study because our electronic medical record database does not contain this information. Finally, we used a classification approach for automatic ML variable integration, but a deep learning approach could have been used to improve prediction.

5. Conclusion

The findings of this study show that machine learning classification models, especially the random forest model, accurately predict fatty liver disease in patients using a minimal number of clinical variables. This method may lead to greater insights in real-world clinical practice, which would assist physicians to effectively identify FLD for novel diagnostic, preventive, and therapeutic purposes to mitigate

None.

Funding

This work was financially supported by the Higher Education Sprout Project of the Ministry of Education (MOE) in Taiwan (TMU DP2-107-21121-01-A06).

Acknowledgment

We would like to thank our colleague, a native English speaker, for editing our manuscript.

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.cmpb.2018.12.032.

References

[1] M. Lazo, J.M. Clark, The epidemiology of nonalcoholic fatty liver disease: a global perspective, Semin. Liver Dis. 28 (2008) 339–350.
[2] M.H. Le, P. Devaki, N.B. Ha, D.W. Jun, H.S. Te, R.C. Cheung, M.H. Nguyen, Prevalence of non-alcoholic fatty liver disease and risk factors for advanced fibrosis and mortality in the United States, PLoS One 12 (2017) e0173499.
[3] Q.M. Anstee, G. Targher, C.P. Day, Progression of NAFLD to diabetes mellitus, cardiovascular disease or cirrhosis, Nat. Rev. Gastroenterol. Hepatol. 10 (2013) 330–344.
[4] M. Motwani, D. Dey, D.S. Berman, G. Germano, S. Achenbach, M.H. Al-Mallah, D. Andreini, M.J. Budoff, F. Cademartiri, T.Q. Callister, Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis, Eur. Heart J. 38 (2016) 500–507.
[5] A. Sani, Machine Learning for Decision Making, Université de Lille 1, 2015.
[6] W. Raghupathi, V. Raghupathi, Big data analytics in healthcare: promise and potential, Health Inf. Sci. Syst. 2 (2014) 3.
[7] P. Groves, B. Kayyali, D. Knott, S.V. Kuiken, The 'Big Data' Revolution in Healthcare: Accelerating Value and Innovation, 2016.
[8] A. Andrade, J.S. Silva, J. Santos, P. Belo-Soares, Classifier approaches for liver steatosis using ultrasound images, Procedia Technol. 5 (2012) 763–770.
[9] R. Ribeiro, J. Sanches, Fatty liver characterization and classification by ultrasound, in: Iberian Conference on Pattern Recognition and Image Analysis, Springer, 2009, pp. 354–361.
[10] M. Owjimehr, H. Danyali, M.S. Helfroush, A. Shakibafard, Staging of fatty liver diseases based on hierarchical classification and feature fusion for back-scan-converted ultrasound images, Ultrason. Imaging 39 (2017) 79–95.
[11] G. Li, Y. Luo, W. Deng, X. Xu, A. Liu, E. Song, Computer aided diagnosis of fatty liver ultrasonic images based on support vector machine, in: 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2008), IEEE, 2008, pp. 4768–4771.
[12] L. Breiman, Random forests, Mach. Learn. 45 (2001) 5–32.
[13] M.C. Papadopoulos, P.M. Abel, D. Agranoff, A. Stich, E. Tarelli, B.A. Bell, T. Planche, A. Loosemore, S. Saadoun, P. Wilkins, A novel and accurate diagnostic test for human African trypanosomiasis, Lancet 363 (2004) 1358–1363.
[14] F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain, Psychol. Rev. 65 (1958) 386.
[15] I. Rish, An empirical study of the naive Bayes classifier, in: IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, vol. 3, IBM, 2001, pp. 41–46.
[16] S. Dreiseitl, L. Ohno-Machado, Logistic regression and artificial neural network classification models: a methodology review, J. Biomed. Inform. 35 (2002) 352–359.
[17] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I.H. Witten, The WEKA data mining software: an update, ACM SIGKDD Explor. Newslett. 11 (2009) 10–18.
[18] R. Kohavi, F. Provost, Glossary of terms, Mach. Learn. 30 (1998) 271–274.
[19] A.K. Loomis, S. Kabadi, D. Preiss, C. Hyde, V. Bonato, M. St. Louis, J. Desai, J.M. Gill, P. Welsh, D. Waterworth, Body mass index and risk of nonalcoholic fatty liver disease: two electronic health record prospective studies, J. Clin. Endocrinol. Metab. 101 (2016) 945–952.
[20] Q. Pang, J.-Y. Zhang, S.-D. Song, K. Qu, X.-S. Xu, S.-S. Liu, C. Liu, Central obesity and nonalcoholic fatty liver disease risk after adjusting for body mass index, World J. Gastroenterol. 21 (2015) 1650.
[21] Y.-C. Lin, S.-C. Chou, P.-T. Huang, H.-Y. Chiou, Risk factors and predictors of non-alcoholic fatty liver disease in Taiwan, Ann. Hepatol. 10 (2011) 125–132.
[22] G. Marchesini, S. Avagnina, E. Barantani, A. Ciccarone, F. Corica, E. Dall'Aglio, R. Dalle Grave, P. Morpurgo, F. Tomasi, E. Vitacolonna, Aminotransferase and gamma-glutamyltranspeptidase levels in obesity are associated with insulin resistance and the metabolic syndrome, J. Endocrinol. Invest. 28 (2005) 333–339.
[23] R.K. Schindhelm, M. Diamant, J.M. Dekker, M.E. Tushuizen, T. Teerlink, R.J. Heine, Alanine aminotransferase as a marker of non-alcoholic fatty liver disease in relation to type 2 diabetes mellitus and cardiovascular disease, Diabetes Metab. Res. Rev. 22 (2006) 437–443.
[24] M.G. Sanal, Biomarkers in nonalcoholic fatty liver disease-the emperor has no clothes? World J. Gastroenterol. 21 (2015) 3223.
[25] L. Castera, V. Vilgrain, P. Angulo, Noninvasive evaluation of NAFLD, Nat. Rev. Gastroenterol. Hepatol. 10 (2013) 666–675.
[26] Z.-w. Chen, L.-y. Chen, H.-l. Dai, J.-h. Chen, L.-z. Fang, Relationship between alanine aminotransferase levels and metabolic syndrome in nonalcoholic fatty liver disease, J. Zhejiang Univ.-Sci. B 9 (2008) 616–622.
[27] J.M. Clark, A.M. Diehl, Defining nonalcoholic fatty liver disease: implications for epidemiologic studies, Gastroenterology 124 (2003) 248–250.
[28] H. Ma, C.-f. Xu, Z. Shen, C.-h. Yu, Y.-m. Li, Application of machine learning techniques for clinical predictive modeling: a cross-sectional study on nonalcoholic fatty liver disease in China, BioMed Res. Int. 2018 (2018).
[29] M.M. Islam, C.C. Wu, T.N. Poly, H.C. Yang, Y.C. Li, Applications of machine learning in fatty live disease prediction, in: 40th Medical Informatics in Europe Conference, MIE 2018, IOS Press, 2018, pp. 166–170.
[30] M. Birjandi, S.M.T. Ayatollahi, S. Pourahmad, A.R. Safarpour, Prediction and diagnosis of non-alcoholic fatty liver disease (NAFLD) and identification of its associated factors using the classification tree method, Iran. Red Crescent Med. J. 18 (2016).
[31] R. Jamali, A. Arj, M. Razavizade, M.H. Aarabi, Prediction of nonalcoholic fatty liver disease via a novel panel of serum adipokines, Medicine 95 (2016).
[32] T.F. Yip, A. Ma, V.S. Wong, Y.K. Tse, H.Y. Chan, P.C. Yuen, G.H. Wong, Laboratory parameter-based machine learning model for excluding non-alcoholic fatty liver disease (NAFLD) in the general population, Aliment. Pharmacol. Therapeutics 46 (2017) 447–456.
[33] J. Wu, J. Roy, W.F. Stewart, Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches, Med. Care 48 (2010) S106–S113.
[34] J. Kang, T. Lee, I. Yap, K. Lun, Analysis of cost-effectiveness of different strategies for hepatocellular carcinoma screening in hepatitis B virus carriers, J. Gastroenterol. Hepatol. 7 (1992) 463–468.
[35] T. Condie, P. Mineiro, N. Polyzotis, M. Weimer, Machine learning on big data, in: 2013 IEEE 29th International Conference on Data Engineering (ICDE), IEEE, 2013, pp. 1242–1244.
[36] T.B. Murdoch, A.S. Detsky, The inevitable application of big data to health care, JAMA 309 (2013) 1351–1352.
[37] G.K. Savova, P.V. Ogren, P.H. Duffy, J.D. Buntrock, C.G. Chute, Mayo clinic NLP system for patient smoking status identification, J. Am. Med. Inform. Assoc. 15 (2008) 25–28.
[38] G. McLachlan, K.-A. Do, C. Ambroise, Analyzing Microarray Gene Expression Data, John Wiley & Sons, 2005.
[39] B. Efron, Estimating the error rate of a prediction rule: improvement on cross-validation, J. Am. Statist. Assoc. 78 (1983) 316–331.
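
Illustrative note on Section 4.1. The ten-fold cross-validation described in the Limitations (the data set is randomly divided into ten groups, and every group is used for both training and validation) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `k_fold_indices` and `cross_validated_error`, the fixed seed, and the round-robin assignment of shuffled samples to folds are all hypothetical choices made for illustration. Only the sample size (577) and the number of folds (ten) come from the text.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation.

    Assumption: samples are shuffled once and dealt round-robin into k folds,
    so fold sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment to groups
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        # Every fold except the i-th one is used for training in this round.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, val_idx


def cross_validated_error(errors_per_fold):
    """Average the per-fold validation errors into a single estimate."""
    return sum(errors_per_fold) / len(errors_per_fold)


if __name__ == "__main__":
    # 577 is the number of patients reported in the study.
    for fold, (train_idx, val_idx) in enumerate(k_fold_indices(577), start=1):
        print(f"fold {fold}: train={len(train_idx)} validation={len(val_idx)}")
```

Each patient appears in exactly one validation fold and in the training set of the other nine rounds, which is why the averaged error uses all of the data without ever scoring a model on samples it was trained on.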
