
Decision Analytics Journal 3 (2022) 100058
An explanatory analytics model for identifying factors indicative of long- versus short-term survival after lung transplantation
Mostafa Amini a, Ali Bagheri a, Dursun Delen a,b,∗
a Department of Management Science and Information Systems, Spears School of Business, Oklahoma State University, Stillwater, OK, USA
b Faculty of Engineering and Natural Sciences, Istinye University, Istanbul, Turkey
ARTICLE INFO

Keywords:
Lung transplantation
Long-term survival
Organ allocation
Machine learning
Random forest
Shapley additive explanations

ABSTRACT

Due to the shortage of available organs compared to the number of patients on waitlists, the organ allocation process has always been challenging and calls for an equitable and optimized allocation system. This system demands minimizing the waitlist mortality and improving transplantation benefits (e.g., survival time and quality of life). According to prior research, lung recipients' long-term survival time is lower than that of other solid organ recipients. This paper proposes and implements an explanatory analytics framework to study the most prominent factors contributing to long-term survival after lung transplantation. We collect data from the United Network for Organ Sharing registry database. It contains more than 44 thousand unique patients undergoing lung transplantation in the US since 1987. In our proposed framework, we first employ several machine learning (ML) algorithms for classification and choose the best-performing one for further analysis of factors. Then, we utilize a state-of-the-art explainable artificial intelligence method with the chosen ML algorithm for the model explanation and interpretation. The proposed framework lists the most critical factors influencing long-term survival after lung transplantation and their corresponding importance measures. For instance, our results suggest that Hepatitis B surface antibody (HBV_SURF) and forced expiratory volume in one second (FEV1) are among the important factors but have not been well examined by lung transplant researchers. Our framework is also able to provide the main contributing factors for each patient individually. Medical scholars and practitioners can use our findings for further analysis and improvement of the lung allocation system.
∗ Corresponding author.
E-mail addresses: moamini@okstate.edu (M. Amini), ali.bagheri@okstate.edu (A. Bagheri), dursun.delen@okstate.edu, dursun.delen@istinye.edu.tr (D. Delen).
URL: http://spears.okstate.edu/delen (D. Delen).
https://doi.org/10.1016/j.dajour.2022.100058
Received 5 April 2022; Received in revised form 21 April 2022; Accepted 25 April 2022
Available online 3 May 2022
2772-6622/© 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Lung transplantation (LTx) is a life-saving therapeutic option for patients suffering from end-stage pulmonary disease unresponsive to other medical or surgical treatments. More than seventy thousand LTx procedures have been reported to the International Society for Heart and Lung Transplantation (ISHLT), since its inception, through 260 lung transplant centers and 184 centers that perform both heart and lung transplants worldwide [1]. The biggest managerial challenge with LTx is handling the imbalance between the demand for lungs and the number of available ones. In the US, according to Health Resources & Services Administration (HRSA) [2], in 2020, 3521 candidates were on the waiting list while 2539 organs were available for LTx.

The widening gap between the number of people waiting for LTx and the number of available organs calls for an equitable and optimized organ allocation to decrease the waiting list mortality while improving LTx benefits (e.g., survival time and quality of life). In the US, prior to the implementation of the lung allocation score (LAS) by the Organ Procurement and Transplantation Network (OPTN) in May 2005, lung allocation was based on the accumulated waitlist time, regardless of the severity of patients' condition, leading to higher waitlist mortalities and lower survival benefits (wasting the scarce donor organs). Currently, lungs are allocated to adult US patients (12 years or older) based on age, geographical location, blood type compatibility, LAS score, and waiting time (if necessary). The LAS score is an adjusted estimate, scaled from 0 to 100, representing each candidate's survival on the waiting list (transplant urgency) and his or her 1-year survival if a transplant is performed (transplant benefit) [3].

Due to the growing worldwide experience of LTx and recent improvements in lung allocation policies, operative techniques, and post-operative care of organ recipients, the survival rate of patients undergoing LTx has been improving during the past three decades, with a median survival of 4.2 years in the 1990–1998 period compared to 6.1 years in the 1999–2008 period [4]. However, the long-term survival of lung recipients remains relatively low in comparison with other solid organ transplants, such as kidney, liver, heart, and pancreas [5], with a median survival of 5.8 years and 5- and 10-year survival rates of 54% and 32% [6]. In the US, although LAS implementation has led to irrefutable advancements in lung allocation and LTx outcomes, critics argue that the 1-year survival factor in the LAS calculation is not a reliable marker for long-term transplant benefits. Accordingly, OPTN considers continuous refinement of the LAS calculation method as more LTx procedures are performed and more outcome data is gathered [7].

To this end, studying a specific group of patients with prolonged survival will increase the understanding of the most important factors that influence the long-term LTx benefits. With this objective in mind, in this research, we use a feature-rich dataset containing factors related to both the recipient and the donor as well as the transplant procedure to identify the most important factors contributing to long-term LTx survival.
2. Background and literature review

Due to the primary focus of OPTN on 1-year survival in calculating LAS, the majority of prior research on post LTx survival concentrated on investigating risk factors affecting short-term survival (<= 1 year) and, hence, variables associated with long-term survival (>= 10 years) are not well explored. The previous studies of long-term LTx survival analysis have mostly utilized descriptive analytics [8] and traditional statistical methods of survival analysis, such as Kaplan–Meier [9] and Cox proportional hazards models [10], to estimate long-term survival rates and to explain the significance of variables in improving survival. These studies have been limited to a specific cohort of deceased patients [11], transplant centers [12], post LTx health condition [13], or transplant type [14]. Few studies employed other techniques as well as more comprehensive and inclusive datasets. Weiss et al. [15] utilized Logistic Regression (LR) to identify factors differentiating long-term versus intermediate-term LTx survival (defined as greater than 10 years and 1 to 5 years, respectively). Jawitz et al. [16] used an LR model and sensitivity analysis to compare factors associated with LTx survival greater than 10 years. However, common shortcomings of the above-mentioned research are: (1) the number of variables under study is limited, (2) variables are selected based on the experiences and intuitions of the analysts, and (3) traditional statistical approaches make specific assumptions on the distribution of the variables as well as the linearity of models, which affect their prediction power and interpretability [17].

As successful alternatives, machine learning (ML) techniques have attracted more and more attention from scholars in the healthcare analytics field during the past decade [18]. These techniques, without any prior knowledge or assumption on data, can handle much larger sets of variables and provide even more accurate predictions compared to classical methods. Despite a growing trend in ML applications in organ transplant survival analysis [19,20], these studies are rare for LTx survival [21]. Oztekin et al. [22] studied patients who survived 9 years after combined lung–heart transplantation. They proposed an integrated data mining methodology including ML and Cox hazard models to investigate factors contributing to the long-term survival rate. In another study, Delen et al. [23] used ML models and sensitivity analysis for variable selection and then developed a Cox model for lung–heart transplant survival. Additionally, using a k-means clustering algorithm, they identified three risk groups of patients for further in-depth analysis. However, to the best of our knowledge, there is no applied ML study investigating solely long-term survivability after LTx. Besides, the studies in [22,23] do not use state-of-the-art methods for model explanation and interpretation. Explainable artificial intelligence (XAI), as a rapidly developing field of research, has recently shown considerable advances in understanding and interpreting black-box ML models [24] and has increasingly been used in the healthcare analytics field [25,26], but rarely in organ transplantation research [27,28].

Considering the mentioned gap in the literature, in this paper, we propose an explanatory analytics framework including ML methods to investigate the most important variables influencing the long-term survival of patients undergoing LTx. Due to the recent critiques about the black-box nature of ML methods and their lack of interpretability, we take advantage of a novel XAI method to explain the importance of risk factors. The rest of this paper is organized as follows. In Section 3, we describe the data and methodology. In Section 4, the results are presented and discussed. Finally, we conclude this paper in Section 5.

3. Materials and methods

In this paper, we adopted a hybrid analytic approach consisting of a predictive phase and an explanation phase to discover significant factors participating in patients' survival after LTx. We employed the prediction power of ML models for classification in the first phase. Then, we leveraged a state-of-the-art explainable AI technique, the SHapley Additive exPlanations (SHAP) algorithm, to explain the best predictions from the first phase. Fig. 1 shows the framework we adopted for this paper.

Fig. 1. Proposed analytic framework.

3.1. Data acquisition and preparation

Following the formal data requisition procedures, we accessed the data from United Network for Organ Sharing (UNOS), a "mission-driven non-profit serving as the nation's transplant system under contract with the federal government" [29]. The datasets included information on all waiting list registrations and transplants, listed or performed in the US from 1987 until July 2021 (our request date) from deceased and living donor transplants. The lung-related data included 44,931 records of unique patients that had gone through a lung transplant, each with 512 variables. These variables included clinical and demographic factors related to donors, recipients, and the transplant procedure. The patients and hospitals had been de-identified for research purposes and to address privacy and security issues. Fig. 2 shows the demographic information of the dataset we obtained.
Fig. 2. Demographic information of the data (age, ethnicity, gender).
The primary objective of this study was to leverage predictive models and explainable AI techniques to discover the factors that most affect prolonged survival after a patient's lung transplant. We identified two post-operative variables that could be used to define candidate outcomes. These variables are pstatus, a binary variable that indicates whether the patient was alive at the last follow-up time (0 if living), and ptime, a continuous variable that denotes the time from transplant until last follow-up or death. See Fig. 3 for the ptime distribution in the dataset for lung transplants when the patient is dead (pstatus = 1).

We excluded organ transplantations of non-adult patients (less than 18 years of age) and the patients whose death was not related to the lung (using the variable COD, defined as the cause of death). Following UNOS recommendations, to avoid overestimating patients with short survival time, we also cut off the records with transplant dates within 14 months of the data acquisition. Next, we removed patients with missing ptime because we could not determine their survival time.

Abiding by the cleaning steps in [30], we then omitted the intra- and postoperative factors since our research was to address the factors before the operation. Next, we cleaned all invalid and duplicated records. We eliminated variables that would not have any predictive power (e.g., IDs and codes) and variables with little change in the entire dataset (e.g., constants for more than 95% of the data). To address missing values, we first distinguished various missing patterns in the dataset using the variable definitions and handled them accordingly via removal, imputation, or recoding. We ended up having 37,580 patient records and 171 variables in the dataset. Unlike previous studies [22,31,32], we removed only variables with many missing values (more than 70%), thereby providing a more comprehensive approach while using state-of-the-art ML and explainable AI techniques.

From Fig. 3, 27% of patients survive one year or less, whereas 14% survive ten years or more. The data is derived based on the reported deaths. However, the number of patients that outlive ten years needs to be adjusted, because living patients should also be counted in this category. This approach has been previously used in [25]. This way, we placed about a quarter of the data in the long survival category (i.e., >= 10 years). Fig. 4 shows the frequency and cumulative percentage of the adjusted long survival category compared with other categories. For identifying factors indicative of long- versus short-term survival after LTx, we finally chose two categories of short term (<= 1 year) and long term (>= 10 years) for our classification. These time thresholds were commonly suggested by LTx scholars for short- and long-term survival [15,16]. These categories involved 13,080 patients (9,864 records after cleaning).

Fig. 3. Yearly ptime distribution after LTx (pstatus = 1).
Fig. 4. Yearly ptime distribution after LTx with adjusted >= 10 category.

3.2. Prediction models

In the first phase of our analysis, we aimed to build the most accurate model for classifying patients with respect to their survival time after the transplant. The model would then be used as an input for the explanation phase.

To select the predictive models, we speculated that tree-based models, such as gradient boosted trees (GBT) and Random Forests (RF), would outperform others because most of the variables were either categorical or binary. However, prior studies have demonstrated the strength of number-based models such as logistic regression (LR), K nearest neighbors (KNN), support vector machines (SVM), and artificial neural networks (ANN). As a result, we tried models from both groups. We also included a decision tree (DT) to sanity-check the results of the tree-based models. We found no suspicious split in the trees or any sign of leaking ("cheating") variables.

To enhance the predictive performance measures, we fine-tuned the hyperparameters of the models. For example, in the case of RF (our winning model), we tried numerous split criteria (information gain, information gain ratio, and Gini index) for the decision trees. We increased the number of trees gradually to achieve the best performing configuration.

In addition, to use the data efficiently and prevent our models from overfitting the training dataset or depending on a specific test set, we conducted k-fold cross-validation (with k = 10) for all the predictive models. This way, we improved the generalization capabilities of the models.

3.3. Explanation models

In general, the need for model explanation is two-fold. First, we need to comprehend what variables are used for classification judgment in the black box of advanced algorithms to certify their legitimacy. A machine may solely learn the characteristics of the dataset and produce accurate predictions that have nothing to do with desired patterns in the data. In this case, the explanation phase tells us whether the variables used are both meaningful and relevant. Second, to be effective for practitioners and end-users, we need an explanation model to return the most prominent variables and their importance, especially when working with a feature-rich dataset [33]. So, in the second phase of our study, we were required to explain the predictions drawn from the previous step.

Recently, many scholars in data analytics research have dedicated their research to the explanation aspect. Consequently, a variety of approaches, such as LIME [33], DeepLIFT [34], Layer-Wise Relevance Propagation [35], Shapley regression values [36], Shapley sampling values [37], and Quantitative Input Influence [38] have been developed (for detailed discussion and review on explanation methods see [24,39]).

In this vein, Lundberg and Lee [40] proposed a unified framework for interpreting black-box predictions by assigning each feature an importance value, called the 'SHapley Additive exPlanations' (SHAP) value. SHAP values are defined as the sequential impact of each feature's value on the model's output, averaged over all possible orders by which features are added to the model. Due to the complexity of the problem, SHAP values can in general only be approximated. However, Lundberg et al. [41] suggested computing exact values for a prediction in some settings, for instance when working with tree-based machine learning algorithms. Given an instance $x$ and the feature set $F$, the SHAP algorithm computes the marginal contribution of the feature $i$ in the performance of the provided prediction model compared to all subsets of $F$ conditioning on (with or without) feature $i$ (Eq. (1)).

\[
\mathrm{SHAP}_i(x) = \sum_{S \subseteq F,\; i \in S} \frac{P_S(x) - P_{S-i}(x)}{|S| \binom{|F|}{|S|}}, \tag{1}
\]

where $S$ is any subset of features, $P_S(x)$ denotes the predicted outcome by a model trained with subset $S$ for instance $x$, and $|\cdot|$ indicates the size of a set.

The strength of these SHAP values is that they are locally accurate, that is, for each instance, the prediction is an aggregation of an additive set of marginal contributions. In our study, the short versus long survival of each patient with a LTx is collectively determined by a set of values for each feature that helps us understand that instance's prediction accurately.

In this phase of our study, we employed SHAP to shed light on the black box of our best performing predictive model and interpret which factors play a critical role in prediction.

4. Results

Our analysis included two phases to understand the critical factors of short- versus long-term survival in LTx, i.e., predictive models and model explanation. We explain the results of these phases in the following sections.

4.1. Predictive models

The objective of this phase was to discover a model with the most robust predictions. As we mentioned before, due to the superior capability of tree-based models in handling categorical variables, we surmised that such models would be a better fit. However, the literature brought a few number-based methods to our consideration. We tuned the hyperparameters and tested models in both categories to keep an open mind. Table 1 reports the performance of these algorithms. Although LR, among the number-based algorithms, has comparable performance to tree-based models, we still can see the superiority of tree-based models, most significantly RF.

We compared four evaluation metrics of performance to select our best model. These metrics are accuracy, sensitivity, specificity, and area under ROC curve (AUC). In classification problems, a model must perform well on all these metrics; a single strong metric (e.g., the specificity of SVM) cannot be the determining criterion. It is evident from the results that RF is the best model, as it outperforms all others in almost all metrics.
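As an illustration of the four evaluation metrics named in this section, the following minimal sketch computes accuracy, sensitivity, specificity, and AUC from toy labels and scores. The numbers are invented for the example and are not the values in Table 1. Treating the long-term class as the positive class (1) and thresholding scores at 0.5 are assumptions; the paper states neither.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def auc(y_true, scores, positive=1):
    """AUC as the chance that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = long-term survival, 0 = short-term survival.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

The sketch does not guard against a class being absent from `y_true`, in which case sensitivity, specificity, or AUC is undefined.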
Table 1
Predictive models’ performance.
Fig. 5. Top twenty features ranking and importance.
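The importance values in Fig. 5 are built from SHAP values as defined in Eq. (1) of Section 3.3. As a hedged illustration of that definition, the sketch below enumerates every subset S containing feature i and weights each marginal contribution by 1 / (|S| · C(|F|, |S|)). It is a brute-force calculation that is only practical for a handful of features, not the tree-specific algorithm of [41]. The toy model, the feature names `a`, `b`, `c`, and the use of a zero baseline for features outside S (instead of retraining a model per subset) are all assumptions made for the example.

```python
from itertools import combinations
from math import comb


def shapley_values(features, value):
    """Exact Shapley values by enumerating all subsets S that contain feature i."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(len(others) + 1):
            for rest in combinations(others, k):
                s = frozenset(rest) | {i}              # subset S with i in S
                gain = value(s) - value(s - {i})       # P_S(x) - P_{S-i}(x)
                total += gain / (len(s) * comb(n, len(s)))
        phi[i] = total
    return phi


# Hypothetical instance and toy model with one interaction term.
x = {"a": 1.0, "b": 1.0, "c": 1.0}


def toy_value(subset):
    """Model output when only features in `subset` are known (others set to 0)."""
    z = {name: (x[name] if name in subset else 0.0) for name in x}
    return 0.5 + 0.2 * z["a"] - 0.1 * z["b"] + 0.3 * z["a"] * z["c"]


phi = shapley_values(list(x), toy_value)
base = toy_value(frozenset())          # output with no feature known
full = toy_value(frozenset(x))         # output with all features known
print(phi, base, full)
```

In this toy case the interaction term is split evenly between `a` and `c`, and the base value plus the sum of the three contributions equals the full model output, which is the local accuracy property described in Section 3.3.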
4.2. Model explanation

We utilized the SHAP algorithm [41] to achieve the explanation phase objectives in the final step. These objectives were, first, to ensure that the predictive models relied on meaningful and relevant patterns and features and, second, to investigate the most relevant factors among the selected features and their criticality level for long-term survival. Given the promising result of RF from the previous phase (between 75% and 80% for all the evaluation metrics), we applied RF as the base model in the SHAP algorithm. We calculated SHAP values in each fold in a 10-fold cross-validation setting with RF. Then, we took the average of SHAP values for each variable.

Fig. 5 presents the ranking of the top twenty features and their averaged importance. Table 2 shows the top selected variables by SHAP and their definitions. We do not see any irrelevant data or judgment by the predictive models from this table. Next, we explain the variables in Table 2.

Some of the variables in Table 2 have already been studied individually or with a few other factors using traditional methods in prior research and shown to be significant predictors for long-term LTx survival. Mabilangan et al. [42] and Kurihara et al. [43] evaluated the impact of Cytomegalovirus (CMV_IGG) infection status of lung donors and recipients on long-term survival after LTx and found it a significant risk factor. Lehr et al. [44] showed that the extremes of age (AGE) decrease the long-term survival in adults after LTx. Kanask et al. [45] indicated that organ recipients who were obese (WGT_KG_CALC) had a shorter survival time after LTx. Banga et al. [46] found serum albumin (TOT_SERUM_ALBUM) as a predictor for prolonged hospital length of stay after LTx, which is shown to be associated with long-term survival. Doricic et al. [47] used serum creatinine (END_CREAT) and other clinical factors to predict the long-term mortality after LTx. Singer et al. [48] investigated the impact of pretransplant ventilation support (VENT_SUPPORT_AFTER_LIST) on short- and long-term survival after LTx. Jawitz et al. [16] found donor age (AGE_DON) and total bilirubin (TBILI) as differentiators between short- and long-term survival after LTx. Yu et al. [14], by a meta-analysis of prior research, studied the transplant type (TX_TYPE) and concluded that bilateral LTx is associated with longer survival time.

This existing literature, among others, demonstrates the effectiveness of our holistic and advanced study of the donor, recipient, and LTx procedure characteristics. Our findings can shed light on other important predictors that have not been investigated by medical scholars for long-term LTx survival. For instance, Dhillon et al. [49] studied the impact of Hepatitis B surface antibody (HBV_SURF_TOTAL) on one-year thoracic transplant survival. Doershuk et al. [50] evaluated the relationship between forced expiratory volume in one second (FEV1_TRR) and LTx timing and short-term survival.

Notice here that we omitted the LAS score from our variable list so that it does not affect the variable importance measures. According to the LAS calculation method [29], some variables used in LAS calculation are in our top 20 variable list. The presence of variables such as AGE, WGT_KG_CALC, TBILI, FUNC_STAT_TRR, END_CREAT, and BMI_CALC in our ranking list is an additional demonstration of the effectiveness of our exploratory analysis. Therefore, our findings can be utilized as a preliminary study for advanced medical research to improve LAS score calculation and organ allocation continuously.
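The fold-averaging step described in Section 4.2 (SHAP values computed in each cross-validation fold, then averaged per variable) can be sketched as follows. This is a minimal illustration under stated assumptions: per-fold importance is taken as the mean absolute SHAP value, which is a common convention but not confirmed by the text; the feature names `feat_1` and `feat_2` and all numbers are hypothetical and are not results from the study.

```python
from statistics import mean


def fold_importance(shap_by_feature):
    """Mean absolute SHAP value per feature within one fold (assumed convention)."""
    return {name: mean(abs(v) for v in values)
            for name, values in shap_by_feature.items()}


def rank_features(folds):
    """Average the per-fold importances and sort features from most to least important."""
    per_fold = [fold_importance(fold) for fold in folds]
    names = per_fold[0].keys()
    averaged = {name: mean(imp[name] for imp in per_fold) for name in names}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Two toy folds; each maps a feature to the SHAP values of that fold's test instances.
folds = [
    {"feat_1": [0.10, -0.30], "feat_2": [0.05, -0.05]},
    {"feat_1": [0.20, -0.20], "feat_2": [0.15, 0.05]},
]

ranking = rank_features(folds)
print(ranking)
```

Averaging signed SHAP values instead of absolute ones would let positive and negative contributions cancel, which is why the absolute value is assumed here for a ranking.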
Table 2
List of top 20 features and their definitions.

Variable                  Definition
CMV_IGG                   Recipient CMV by IgG test result at transplant
HBV_SURF_TOTAL            Recipient HBV surface antibody total at transplant
AGE                       Recipient age (year)
WGT_KG_CALC               Calculated recipient weight (kg)
MED_COND_TRR              Recipient medical condition pre-transplant at transplant
BUN_DON                   Deceased donor terminal blood urea nitrogen
TOT_SERUM_ALBUM           Patient total serum Albumin at registration
FUNC_STAT_TRR             Recipient functional status at transplant
DAYSWAIT_CHRON            Total days on waiting list
VENT_SUPPORT_AFTER_LIST   Events occurring between listing and transplant: episode of ventilatory support
FEV1_TRR                  Pulmonary status: FEV1% predicted at transplant
END_CREAT                 Serum Creatinine at transplant/offer/removal/current time (HL, LU only)
AGE_DON                   Donor age (year)
BMI_CALC                  Calculated candidate BMI
BRONCHO_RT_DON            DDR right lung bronchoscopy
PROTEIN_URINE             Deceased donor protein in urine
TX_TYPE                   Type of transplant
WGT_KG_DON_CALC           Calculated donor weight (kg)
TBILI                     Most recent serum total Bilirubin at transplant
TX_PROCEDUR_TY            Recipient procedure type

4.3. Individual explanation

In healthcare analytics, it is crucial to investigate the process of any decision support system for an individual patient. This way, the performance of advanced algorithms can help physicians by providing the main contributing factors for each patient. In addition, physicians can make necessary adjustments based on the patient's underlying conditions. The SHAP algorithm is an appropriate tool in this regard. On the one hand, it provides a holistic picture of the major risk factors and their relative importance (Fig. 5). On the other hand, it delivers an individual-level analysis of the features and their contribution to each patient's probability of an outcome.

Consider a patient in his fifties with positive HBV and CMV, who is above the average weight, has poor functionality at the time of transplant, and other characteristics. With the advantage of the SHAP algorithm, our proposed approach can picture how these features can impact the longevity of this patient. Fig. 6 illustrates contributing features that push the model output from the base value (the average model output over the training dataset) to the "high" survival time or "low". Features driving the survival time prediction to "high" are shown in red; those pushing the prediction to "low" are blue.

Fig. 6. SHAP values for an individual patient.

In this example, for a specific patient, in addition to the model output (short or long survival), the generated SHAP values clarify which factors are prominent and what their impact is on the model output, thereby increasing the trust in the model. Also, SHAP values allow the physicians to reevaluate the situation for each patient. In the case of short-term survivability (our example), the transplant should be reconsidered, or the conditions should be compared to other patients on the waitlist. Although, in general, we suggest organ allocation to those patients with high survivability, SHAP values for each patient can help compare patients even in the same class.

Another benefit of SHAP values for each individual is the ability to provide patients with recommendations. For example, if a patient's weight is hurting his survivability, the doctor can give practical advice or postpone the transplant. Finally, in the case of any discrepancies or underlying conditions, the doctors can reject or adjust the model's output by looking at the individual SHAP values.

5. Summary and conclusion

In this paper, we studied factors indicative of patients' long- versus short-term survival after LTx by proposing an explanatory analytics framework. We collected a comprehensive dataset of LTx operations from the United Network for Organ Sharing (UNOS) registry database, including recipient, donor, and LTx procedure characteristics. Based on the survival time of lung recipients, we defined two categories of patients representing short- and long-term survival after LTx. After going through a comprehensive data cleaning and preprocessing in the proposed framework, we trained and tested various ML algorithms to find the best predictive one for our binary classification problem and preprocessed data in hand. We found RF as the one with the best predictive performance and chose it for further analysis. In the next step, to interpret and explain the predictions, we employed a novel explainable AI method accompanied by RF to understand the most critical factors affecting the long-term survival after LTx. As a result, our proposed framework provided a list of the top 20 important factors and their corresponding importance measures. Policymakers can use our findings in improving the lung allocation system. In addition, our framework can predict and explain the individual long-term survival of LTx candidates and can be utilized as a recommendation system for organ matching and allocation.

One limitation of this study is that the data is limited to only US LTx patients. Although we believe that our data set is large enough to obtain robust results, our framework and results could be validated by scholars who have access to larger organ transplant registries' data like ISHLT.

We also did not have access to the patients' medical history and underlying condition before the transplant. This variable and others such as gender and race could be controlled for in future studies. Moreover, each of the top variables in Table 2 calls for a more focused investigation through statistical and causal models.

Finally, from a broader perspective, LTx has brought patients hope for prolonged survival and enhanced health-related quality of life (HRQoL). Although these hopes come true for many patients, some die earlier than they would have without LTx and a considerable proportion of lung recipients face comorbidities and adverse health events. Hence, survival analysis alone cannot provide an adequate measure of the level of success or failure of LTx, and studying the HRQoL of lung recipients is an essential task. Previous studies on HRQoL after LTx often relied on survey data from living and well
enough lung recipients. The major limitation of these studies is that the data are not missing at random [51]. Thus, one interesting direction of research is assessing the contributing factors to HRQoL after LTx using transplant registries’ follow-up data and ML/AI methods.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] D.C. Chambers, A. Zuckermann, W.S. Cherikh, M.O. Harhay, D. Hayes, E. Hsich, K.K. Khush, L. Potena, A. Sadavarte, T.P. Singh, J. Stehlik, The international thoracic organ transplant registry of the international society for heart and lung transplantation: 37th adult lung transplantation report - 2020; Focus on deceased donor characteristics, J. Heart Lung Transpl. 39 (2020) 1016–1027, http://dx.doi.org/10.1016/j.healun.2020.07.009.
[2] Organ Donation Statistics | organdonor.gov, https://www.organdonor.gov/learn/organ-donation-statistics. (Accessed 3 April 2022).
[3] T.M. Egan, S. Murray, R.T. Bustami, T.H. Shearon, K.P. McCullough, L.B. Edwards, M.A. Coke, E.R. Garrity, S.C. Sweet, D.A. Heiney, F.L. Grover, Development of the new lung allocation system in the United States, Am. J. Transplant. 6 (2006) 1212–1227, http://dx.doi.org/10.1111/j.1600-6143.2006.01276.x.
[4] R.D. Yusen, L.B. Edwards, A.Y. Kucheryavaya, C. Benden, A.I. Dipchand, S.B. Goldfarb, B.J. Levvey, L.H. Lund, B. Meiser, J.W. Rossano, J. Stehlik, The registry of the international society for heart and lung transplantation: Thirty-second official adult lung and heart-lung transplantation report - 2015; Focus theme: Early graft failure, J. Heart Lung Transpl. 34 (2015) 1264–1277, http://dx.doi.org/10.1016/j.healun.2015.08.014.
[5] A. Rana, A. Gruessner, V.G. Agopian, Z. Khalpey, I.B. Riaz, B. Kaplan, K.J. Halazun, R.W. Busuttil, R.W.G. Gruessner, Survival benefit of solid-organ transplant in the United States, JAMA Surg. 150 (2015) 252–259, http://dx.doi.org/10.1001/jamasurg.2014.2038.
[6] R.D. Yusen, L.B. Edwards, A.I. Dipchand, S.B. Goldfarb, A.Y. Kucheryavaya, B.J. Levvey, L.H. Lund, B. Meiser, J.W. Rossano, J. Stehlik, The registry of the international society for heart and lung transplantation: Thirty-third adult lung and heart–lung transplant report - 2016; Focus theme: Primary diagnostic indications for transplant, J. Heart Lung Transpl. 35 (2016) 1170–1184, http://dx.doi.org/10.1016/j.healun.2016.09.001.
[7] M. Valapour, C.J. Lehr, M.A. Skeans, J.M. Smith, E. Miller, R. Goff, J. Foutz, A.K. Israni, J.J. Snyder, B.L. Kasiske, OPTN/SRTR 2019 annual data report: Lung, Am. J. Transpl. 21 (2021) 441–520, http://dx.doi.org/10.1111/ajt.16495.
[8] S. Sithamparanathan, L. Thirugnanasothy, S. Clark, J.H. Dark, A.J. Fisher, K.F. Gould, A. Hasan, J.L. Lordan, G. Meachery, G. Parry, P.A. Corris, Observational study of lung transplant recipients surviving 20 years, Respir. Med. 117 (2016) 103–108, http://dx.doi.org/10.1016/j.rmed.2016.06.008.
[9] E.L. Kaplan, P. Meier, Nonparametric estimation from incomplete observations, J. Amer. Statist. Assoc. 53 (1958) 457–481, http://dx.doi.org/10.1080/01621459.1958.10501452.
[10] D.R. Cox, Regression models and life-tables, J. R. Stat. Soc. Ser. B Stat. Methodol. 34 (1972) 187–202, http://dx.doi.org/10.1111/j.2517-6161.1972.tb00899.x.
[11] A.L. Stephenson, J. Sykes, Y. Berthiaume, L.G. Singer, C. Chaparro, S.D. Aaron, G.A. Whitmore, S. Stanojevic, A clinical tool to calculate post-transplant survival using pre-transplant clinical characteristics in adults with cystic fibrosis, Clin. Transpl. 31 (2017) http://dx.doi.org/10.1111/ctr.12950.
[12] G. Loor, R. Brown, R.F. Kelly, K.D. Rudser, S.J. Shumway, I. Cich, C.T. Holley, C. Quinlan, M.I. Hertz, Gender differences in long-term survival post-transplant: A single-institution analysis in the lung allocation score era, Clin. Transpl. 31 (2017) http://dx.doi.org/10.1111/ctr.12889.
[13] E.G. Chan, V. Bianco, T. Richards, J.W.A. Hayanga, M. Morrell, N. Shigemura, M. Crespo, J. Pilewski, J. Luketich, J. D’Cunha, The ripple effect of a complication in lung transplantation: Evidence for increased long-term survival risk, J. Thorac. Cardiovasc. Surg., Mosby Inc. (2016) 1171–1180, http://dx.doi.org/10.1016/j.jtcvs.2015.11.058.
[14] H. Yu, T. Bian, Z. Yu, Y. Wei, J. Xu, J. Zhu, W. Zhang, Bilateral lung transplantation provides better long-term survival and pulmonary function than single lung transplantation: A systematic review and meta-analysis, Transplantation 103 (2019) 2634–2644, http://dx.doi.org/10.1097/TP.0000000000002841.
[15] E.S. Weiss, J.G. Allen, C.A. Merlo, J.V. Conte, A.S. Shah, Factors indicative of long-term survival after lung transplantation: A review of 836 10-year survivors, J. Heart Lung Transpl. 29 (2010) 240–246, http://dx.doi.org/10.1016/j.healun.2009.06.027.
[16] O.K. Jawitz, V. Raman, D. Becerra, J. Klapper, M.G. Hartwig, Factors associated with short- versus long-term survival after lung transplant, J. Thorac. Cardiovasc. Surg. (2020) http://dx.doi.org/10.1016/j.jtcvs.2020.09.097.
[17] L. Breiman, Statistical modeling: The two cultures, Stat. Sci. 16 (2001) 199–215, http://dx.doi.org/10.1214/ss/1009213726.
[18] K.H. Yu, A.L. Beam, I.S. Kohane, Artificial intelligence in healthcare, Nat. Biomed. Eng. 2 (2018) 719–731, http://dx.doi.org/10.1038/s41551-018-0305-z.
[19] C. Díez-Sanmartín, A.S. Cabezuelo, Application of artificial intelligence techniques to predict survival in kidney transplantation: A review, J. Clin. Med. 9 (2020) http://dx.doi.org/10.3390/jcm9020572.
[20] L.R. Wingfield, C. Ceresa, S. Thorogood, J. Fleuriot, S. Knight, Using artificial intelligence for predicting survival of individual grafts in liver transplantation: A systematic review, Liver Transpl. 26 (2020) 922–934, http://dx.doi.org/10.1002/lt.25772.
[21] L. Shahmoradi, H. Abtahi, S. Amini, M. Gholamzadeh, Systematic review of using medical informatics in lung transplantation studies, Int. J. Med. Inf. 136 (2020) http://dx.doi.org/10.1016/j.ijmedinf.2020.104096.
[22] A. Oztekin, D. Delen, Z. (James) Kong, Predicting the graft survival for heart-lung transplantation patients: An integrated data mining methodology, Int. J. Med. Inf. 78 (2009) http://dx.doi.org/10.1016/j.ijmedinf.2009.04.007.
[23] D. Delen, A. Oztekin, Z. Kong, A machine learning-based approach to prognostic analysis of thoracic transplantations, Artif. Intell. Med. 49 (2010) 33–42, http://dx.doi.org/10.1016/j.artmed.2010.01.002.
[24] L.H. Gilpin, D. Bau, B.Z. Yuan, A. Bajwa, M. Specter, L. Kagal, Explaining explanations: An overview of interpretability of machine learning, in: Proceedings - 2018 IEEE 5th International Conference on Data Science and Advanced Analytics, DSAA 2018, 2019, pp. 80–89, http://dx.doi.org/10.1109/DSAA.2018.00018.
[25] G. Stiglic, P. Kocbek, N. Fijacko, M. Zitnik, K. Verbert, L. Cilar, Interpretability of machine learning-based prediction models in healthcare, Wiley Interdiscip. Rev.: Data Min. Knowl. Discov. 10 (2020) http://dx.doi.org/10.1002/widm.1379.
[26] R. ElShawi, Y. Sherif, M. Al-Mallah, S. Sakr, Interpretability in healthcare: A comparative study of local machine learning interpretability techniques, Comput. Intell. 37 (2021) 1633–1650, http://dx.doi.org/10.1111/coin.12410.
[27] M.O. Killian, S.N. Payrovnaziri, D. Gupta, D. Desai, Z. He, Machine learning–based prediction of health outcomes in pediatric organ transplantation recipients, JAMIA Open 4 (2021) http://dx.doi.org/10.1093/jamiaopen/ooab008.
[28] Y. Zhou, S. Chen, Z. Rao, D. Yang, X. Liu, N. Dong, F. Li, Prediction of 1-year mortality after heart transplantation using machine learning approaches: A single-center study from China, Int. J. Cardiol. 339 (2021) 21–27, http://dx.doi.org/10.1016/j.ijcard.2021.07.024.
[29] What is UNOS? | About United Network for Organ Sharing, https://unos.org/about/. (Accessed 11 March 2022).
[30] A. Dag, K. Topuz, A. Oztekin, S. Bulur, F.M. Megahed, A probabilistic data-driven framework for scoring the preoperative recipient-donor heart transplant survival, Decis. Support Syst. 86 (2016) 1–12, http://dx.doi.org/10.1016/J.DSS.2016.02.007.
[31] A. Dag, A. Oztekin, A. Yucel, S. Bulur, F.M. Megahed, Predicting heart transplantation outcomes through data analytics, Decis. Support Syst. 94 (2017) 42–52, http://dx.doi.org/10.1016/J.DSS.2016.10.005.
[32] A. Oztekin, L. Al-Ebbini, Z. Sevkli, D. Delen, A decision analytic approach to predicting quality of life for lung transplant recipients: A hybrid genetic algorithms-based methodology, European J. Oper. Res. 266 (2018) 639–651, http://dx.doi.org/10.1016/J.EJOR.2017.09.034.
[33] M.T. Ribeiro, S. Singh, C. Guestrin, Why should i trust you? Explaining the predictions of any classifier, in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 13-17-August-2016, 2016, pp. 1135–1144, http://dx.doi.org/10.1145/2939672.2939778.
[34] A.K. Avanti Shrikumar, Peyton Greenside, Learning important features through propagating activation differences, in: Proceedings of the 34th International Conference on Machine Learning, 2017, pp. 3145–3153.
[35] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.R. Müller, W. Samek, On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation, PLoS One 10 (2015) e0130140, http://dx.doi.org/10.1371/JOURNAL.PONE.0130140.
[36] S. Lipovetsky, M. Conklin, Analysis of regression in game theory approach, Appl. Stoch. Models Bus. Ind. 17 (2001) 319–330, http://dx.doi.org/10.1002/asmb.446.
[37] E. Štrumbelj, I. Kononenko, Explaining prediction models and individual predictions with feature contributions, Knowl. Inf. Syst. 41 (2014) 647–665, http://dx.doi.org/10.1007/s10115-013-0679-x.
[38] A. Datta, S. Sen, Y. Zick, Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems, in: Proceedings - 2016 IEEE Symposium on Security and Privacy, SP 2016, Institute of Electrical and Electronics Engineers Inc., 2016, pp. 598–617, http://dx.doi.org/10.1109/SP.2016.42.
[39] D.V. Carvalho, E.M. Pereira, J.S. Cardoso, Machine learning interpretability: A survey on methods and metrics, Electronics 8 (2019) 832, http://dx.doi.org/10.3390/ELECTRONICS8080832.
[40] S.M. Lundberg, S.I. Lee, A unified approach to interpreting model predictions, in: Advances in Neural Information Processing Systems, Neural Information Processing Systems Foundation, 2017, pp. 4766–4775.
[41] S.M. Lundberg, G. Erion, H. Chen, A. DeGrave, J.M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, S.-I. Lee, From local explanations to global understanding with explainable AI for trees, Nat. Mach. Intell. 2 (2020) 56–67, http://dx.doi.org/10.1038/s42256-019-0138-9.
[42] C. Mabilangan, J. Preiksaitis, C. Cervera, D. Freed, J. Nagendran, J. Mullen, R. MacArthur, S. Meyer, B. Laing, D. Modry, A. Koshal, D. Lien, J. Weinkauf, K. Halloran, A. Kapasi, M. Thakrar, M. Fenton, D. Kim, W. Tymchak, J. Burton, L. Lalonde, M. Chan, W. Wamica, D. Issacs, Impact of donor and recipient cytomegalovirus serology on long-term survival of lung transplant recipients, Transpl. Infect. Dis. 20 (2018) http://dx.doi.org/10.1111/tid.12964.
[43] C. Kurihara, R. Fernandez, N. Safaeinili, M. Akbarpour, Q. Wu, G.R.S. Budinger, A. Bharat, Long-term impact of cytomegalovirus serologic status on lung transplantation in the United States, Ann. Thoracic Surg. 107 (2019) 1046–1052, http://dx.doi.org/10.1016/j.athoracsur.2018.10.034.
[44] C.J. Lehr, E.H. Blackstone, K.R. McCurry, L. Thuita, W.M. Tsuang, M. Valapour, Extremes of age decrease survival in adults after lung transplant, in: Chest, Elsevier Inc., 2020, pp. 907–915, http://dx.doi.org/10.1016/j.chest.2019.06.042.
[45] W.F. Kanasky, S.D. Anton, J.R. Rodrigue, M.G. Perri, T. Szwed, M.A. Baz, Impact of body weight on long-term survival after lung transplantation, Chest 121 (2002) 401–406, http://dx.doi.org/10.1378/chest.121.2.401.
[46] A. Banga, M. Mohanka, J. Mullins, S. Bollineni, V. Kaza, S. Ring, P. Bajona, M. Peltz, M. Wait, F. Torres, Hospital length of stay after lung transplantation: Independent predictors and association with early and late survival, J. Heart Lung Transpl. 36 (2017) 289–296, http://dx.doi.org/10.1016/j.healun.2016.07.020.
[47] J. Doricic, R. Greite, V. Vijayan, S. Immenschuh, A. Leffler, F. Ius, A. Haverich, J. Gottlieb, H. Haller, I. Scheffner, W. Gwinner, Kidney injury after lung transplantation: Long-term mortality predicted by post-operative day-7 serum creatinine and few clinical factors, PLoS One 17 (2022) e0265002, http://dx.doi.org/10.1371/journal.pone.0265002.
[48] J.P. Singer, P.D. Blanc, C. Hoopes, J.A. Golden, J.L. Koff, L.E. Leard, S. Cheng, H. Chen, The impact of pretransplant mechanical ventilation on short- and long-term survival after lung transplantation, Am. J. Transplant. 11 (2011) 2197–2204, http://dx.doi.org/10.1111/j.1600-6143.2011.03684.x.
[49] G.S. Dhillon, J. Levitt, H. Mallidi, V.G. Valentine, M.R. Gupta, R. Sista, D. Weill, Impact of hepatitis B core antibody positive donors in lung and heart-lung transplantation: An analysis of the united network for organ sharing database, Transplantation 88 (2009) 842–846, http://dx.doi.org/10.1097/TP.0B013E3181B4E1FD.
[50] C.F. Doershuk, R.C. Stern, Timing of referral for lung transplantation for cystic fibrosis: Overemphasis on FEV1 may adversely affect overall survival, Chest 115 (1999) 782–787, http://dx.doi.org/10.1378/chest.115.3.782.
[51] G. Thabut, H. Mal, Outcomes after lung transplantation, J. Thoracic Dis. 9 (2017) 2684–2691, http://dx.doi.org/10.21037/jtd.2017.07.85.

