
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, VOL. 23, NO. 3, MAY 2019 1243

Novel Data Imputation for Multiple Types of Missing Data in Intensive Care Units

Janani Venugopalan, Student Member, IEEE, Nikhil Chanani, Kevin Maher, and May D. Wang, Senior Member, IEEE

Abstract: The diversity and number of parameters monitored in an intensive care unit (ICU) make the resulting databases highly susceptible to quality issues, such as missing information and erroneous data entry, which adversely affect the downstream processing and predictive modeling. Missing data interpolation and imputation techniques, such as multiple imputation, expectation maximization, and hot-deck imputation, do not account for the type of missing data, which can lead to bias. In our study, we first model the missing data as three types: “neglectable,” a.k.a. “missing completely at random,” “recoverable,” a.k.a. “missing at random,” and “not easily recoverable,” a.k.a. “missing not at random.” We then design imputation techniques for each type of missing data. We use a publicly available database (MIMIC II) to demonstrate how these imputations perform with random forests for prediction. Our results indicate that these novel imputation techniques outperformed standard mean filling techniques and expectation maximization with a statistical significance of p ≤ 0.01 in predicting ICU mortality.

Index Terms: Clinical risk prediction, data imputation, intensive care units, missing data, quality control.

Manuscript received February 12, 2016; revised August 31, 2016 and November 2, 2018; accepted November 16, 2018. Date of publication April 16, 2019; date of current version May 6, 2019. This work was supported in part by Grants from the Health Systems Institute of Georgia Tech and Emory University, in part by the National Institutes of Health under Grants U54CA119338 and 1RC2CA148265, in part by the Georgia Cancer Coalition (Distinguished Cancer Scholar Award to Professor MDW), in part by the Georgia Research Alliance, in part by Hewlett-Packard, and in part by Microsoft Research. (Corresponding author: May D. Wang.)

J. Venugopalan and M. D. Wang are with the Wallace H. Coulter School of Biomedical Engineering, Georgia Institute of Technology, Emory University, Atlanta, GA 30322 USA (e-mail: jvenugopalan3@gatech.edu; maywang@bme.gatech.edu).

N. Chanani and K. Maher are with the Pediatrics Department, Emory University, Atlanta, GA 30322 USA (e-mail: ChananiN@kidsheart.com; MaherK@kidsheart.com).

This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors.

Digital Object Identifier 10.1109/JBHI.2018.2883606

2168-2194 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Authorized licensed use limited to: ULAKBIM UASL - DOKUZ EYLUL UNIVERSITESI. Downloaded on February 02,2023 at 17:39:39 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

The intensive care unit (ICU) is equipped with a multitude of monitoring and therapeutic equipment that generates large amounts of complex multimodal data [1]. The diversity and the number of parameters being monitored in an ICU make the resulting databases highly susceptible to data quality issues such as missing data and erroneous data entry [2]–[7]. Despite comprehensive record-keeping, not all values are maintained because tests or measures are primarily recorded only when the clinical team suspects a clinical condition. Therefore, important events may be unrecorded, and specific data may not be available at particular time points. In addition to user errors, the bedside equipment also has error rates. The presence of poor quality data in a database adversely affects the downstream processing and predictive modeling. Data quality poses significant challenges to clinical decision support system development [8]–[12]. Conventional missing data interpolation and imputation schemes perform poorly because they do not capture how data goes missing in the ICU. Imputation of missing values is done by using population averages such as the means or medians of the features. In addition, deleting records with missing data and complete case analysis are also prevalent in ICU literature [13]–[16]. However, deletion reduces samples for analysis, introduces more biases in the data and may not accurately reflect the underlying disease state.

TABLE I
MISSING DATA TYPES DATA DICTIONARY

A. Types of Missing Data

In our research we model missing data as three types prevalent in statistics literature [17]: (a) Missing Completely at Random (MCAR), (b) Missing at Random (MAR) and (c) Missing Not at Random (MNAR). Because there is a semantic gap between the actual terminology and its context in the clinical ICU, we rephrase these terms as: “Neglectable” for MCAR, “Recoverable” for MAR and “Not-Easily-Recoverable (NER)” for MNAR (Tables I, II). Data is classified as “Neglectable” if the probability of missing data does not depend either on the missing values or other observed data. “Neglectable” data can occur in any clinical variable and is independent of observed data. Missing data is classified as “Recoverable” if the probability of missing data depends on the observed values of other features in the dataset. Missing data is classified as “NER” if the probability of missing data depends on the actual missing values. In other words, given dataset $X$ with missing data in feature $Y = [Y_{obs}, Y_{miss}]$, where $Y$ is composed of both observed data $Y_{obs}$ and missing

data $Y_{miss}$, the data is “Neglectable” if $Y_{miss} \perp X$ and $Y_{miss} \perp Y$. The data is “Recoverable” if $Y_{miss} \perp Y$ but $Y_{miss} \not\perp X$, and it is “NER” if $Y_{miss} \not\perp Y$. This is true for all features $Y \in$ dataset $X$.

TABLE II
MISSING DATA TYPES, DEFINITIONS & CLINICAL EXAMPLES

B. Handling of Missing Data in Literature

Most studies in health care [6], [18] assume “Neglectable” or “Recoverable” assumptions to perform imputations without quantitative analysis to test the type of missing data, which leads to biased results. Existing models for handling missing data include multiple imputation [6], [17], [19]–[22], expectation maximization [23]–[29], and hot-deck imputation [30], [31] techniques. These techniques are designed to handle “Neglectable” and “Recoverable” data and thus cannot deal with “NER” data [32].

In our study, we investigated ICU data and grouped it into “Neglectable”, “Recoverable” and “NER” data, extended the clustering-based approach of Tien et al. [17] to perform “Recoverable” data imputation, and developed a copula-based “NER” imputation technique. Our novel “Recoverable” imputation combines the benefit of both expectation maximization (accounts for distribution) and hot-deck techniques (fewer effects due to cross user inconsistencies [33]). Section II describes the methods for identifying the types of missing data and the proposed imputation algorithms, while Section III presents the data used, the performance results and discussion. Section IV presents the conclusions and future work.

II. METHODS

A. Identifying the Type of Missing Data

We first investigated ICU missing data and modeled it into three groups (Tables I, II): “Neglectable,” “Recoverable” and “NER.” First, we analyzed missing data to find if they are “Neglectable”. If not, then we distinguished between “Recoverable” and “NER”. If the data is “Recoverable” then we imputed the data under “Recoverable” assumptions, else we estimated under “NER” assumptions (Fig. 1).

1) Test for “Neglectable”: In literature, t-tests [34] and Little’s test [35] are commonly used to test whether the missing data is “Neglectable”. By definition, “Neglectable” data refers to missing data that does not depend either on the missing values or observed data. Assuming we have $f$ features, for each feature, we divide the remaining $f - 1$ features into two groups. The first group contains the records with missing values in the test feature, and the second group contains the records with no missing value in the test feature. If the results of the t-test show that the two groups are sampled from the same population, then the data is “Neglectable”. Because the t-test is performed on a single feature at a time, missing data in other features was discarded. We performed the t-tests on each of the $f$ features with Bonferroni corrections to account for multiple testing at a statistical significance of 0.05. However, for a total number of $f$ features, with a total number of $f(f - 1)$ comparisons, when $f$ goes up to thousands for a large data size, the computation cost is high.

Thus, Little’s test [35] is used to generate Little’s score to distinguish “Neglectable” missing data. Little’s score is obtained by comparing the means of the original data with the maximum likelihood imputed data. If the data is “Neglectable”, this score follows a chi-square distribution [35]. A p value less than 0.05 rejects the hypothesis that the missing data is “Neglectable”. This was implemented in IBM Statistical Package for Social Science (SPSS). Due to memory constraints, we implemented this by taking features in batches to test “Neglectable”. When a current batch was not “Neglectable”, we did not perform further analysis on the batch. However, if any combination was “Neglectable” we combined it with more feature sets (adjacent set) and tested again. In order to check whether batch sizes have an effect on the results, we performed Little’s test using batch sizes of 3, 5, 7, and 10. Similarly, we repeated the test on three random orderings of the features, in order to check the effect of feature order on the final results. Following the test for “Neglectable”, we performed the test to distinguish the “Recoverable” from “NER” data before imputation.

2) Distinguishing “Not-Easily-Recoverable” From “Recoverable”: Data is classified as “Recoverable” if the missing data depends on the other features, and it is classified as “NER” if the missing data depends on the missing values. Previous research suggests using classification schemes to distinguish “Recoverable” data from “NER” data [36], [37]. Cismondi et al. [37] used fuzzy classification schemes to distinguish “Recoverable” from “NER” data. They showed that non-imputation of “NER” data gives better results and lower bias compared to the imputation of all the values. The labels for training and classification were generated for each feature. If the data is missing the value is 1, otherwise the value is 0. We applied neural networks, support vector machines (SVM), decision trees and LASSO L1 regularized logistic regression to distinguish “Recoverable” from “NER”, because they all give a deterministic value each time. Any missing data that was correctly classified was considered to be “NER” (i.e., missing) and those which were mislabeled were considered to be “Recoverable” (i.e., imputable). This procedure was repeated for each feature. The result of each of the different classifiers was a two-dimensional binary matrix with


each element containing a binary value indicating whether the data point is “Recoverable” or “NER”. We report Pearson’s correlation between “Recoverable” and “NER” data to reflect the agreement between the different classification techniques. We use Pearson’s correlation because when the result of the classification is either 0 or 1, it becomes equivalent to the Phi coefficient (a measure of association of two binary variables) [38] and the Matthews correlation coefficient [39].

We then proceed to impute the “Recoverable” data and estimate the “NER” data using our novel methods described below.

Fig. 1. Data imputation block diagram. The missing data is divided into three types (“Neglectable”, “Recoverable”, and “Not Easily Recoverable”). We then impute “Neglectable” and “Recoverable” data using a clustering method and “NER” data by sampling from a Student’s t-copula.

B. Imputation of “Recoverable” Data

Imputation of data under “Recoverable” assumptions has been performed widely in medical literature using expectation maximization and multiple imputation. However, due to parameter estimation, these techniques tend to have errors and cross user inconsistencies. Thus, a clustering based imputation [33] trained with all the data (as opposed to complete case analysis [17]) is proposed. We propose this approach due to a high rate of missing data in the ICU. In our study, we propose an alternating least squares PCA based clustering approach before imputation so that the impact of missing data on the clustering is reduced.

C. Robust Clustering Based Imputation

First, alternating least squares based PCA can retrieve “Recoverable” missing data in the principal component analysis [40]. The total number of principal components was chosen to account for 99.99% of the variance (chosen to preserve most of the information content). The number of principal components accounting for 99.99% of the variance was 12 for LASSO, 11 for neural networks, 11 for decision trees and 12 for SVM. Then the top principal components were clustered using k-means (i.e., a hard clustering technique) and fuzzy-C-means (fcm) clustering (i.e., a soft clustering technique) [41]. We applied Calinski-Harabasz, Davies-Bouldin and silhouette quality metrics to estimate the optimal cluster number. We used the mean of five repetitions to compute the optimal cluster number from each score. We then used a voting principle on the optimal number from each score to compute the number of clusters on ICU data. This step ensures that robust clustering can be performed even in the presence of noisy and missing data. Then we performed the imputation in each of the clusters. We varied the number of clusters from 2 to 20 in order to ensure data characteristics are captured, considering the impact of data size and computational cost. To impute each of the clusters in this study, we used expectation maximization (Fig. 1).

D. Estimation of Data Under “Not-Easily-Recoverable” Assumptions

By definition, “NER” data depends on the missing data and the patterns of missing data. The methods for dealing with this type of data in statistics are selection models, pattern mixture models [42] and drawn indicator models [43]. All these models assume a multivariate normal distribution for “NER” data. However, in the ICU scenario, the joint feature distributions are rarely normal.

In this study, we extend drawn indicator models for ICU EHR missing data imputation. Drawn indicator models use multivariate normal distributions to account for the “missingness” patterns, in addition to the relationship between the features to model the missing data. The “missingness” pattern is defined as the distribution of missing data. When a specific data


is missing, a value of 1 is given and 0 otherwise. When features are not normally distributed, multivariate normal distributions become unreliable models for imputation. We overcome this issue by using copula functions. A copula function couples N univariate marginal distributions together to form a joint distribution function of N standard uniform random variables. It has been shown to be invariant to elliptical distributions with deviations from normality and thus copes with the lack of strict normal distributions. In our study, under the “NER” assumption, we fit a multivariate t-copula as a function of the features and the “missingness” pattern $R_x$ (defined as 0 when a data point is observed and 1 otherwise) to the data.

A function $C: [0, 1]^p \to [0, 1]$ is a p-dimensional copula if it satisfies the following properties:

1. For all $u_i \in [0, 1]$, $C(1, \ldots, 1, u_i, 1, \ldots, 1) = u_i$.
2. For all $u \in [0, 1]^p$, $C(u_1, \ldots, u_p) = 0$ if at least one of the coordinates $u_i$ equals zero.
3. $C$ is grounded and p-increasing, i.e., the C-measure of every box whose vertices lie in $[0, 1]^p$ is non-negative.

Each $u_i$ corresponds to the marginal distribution of one of the random variables.

Consider p continuous random variables $(X_1, \ldots, X_p)$ with copula $C$. The multivariate copula $C$ is given by

$$C(u_1, u_2, \ldots, u_p) = \int_{-\infty}^{t_\nu^{-1}(u_1)} \cdots \int_{-\infty}^{t_\nu^{-1}(u_p)} f(t)\, dt \tag{1}$$

where $t_\nu^{-1}$ is the inverse of the marginal distribution of the marginals and $f(t)$ denotes the copula function (i.e., for a t-copula it is a Student’s t distribution) [44], [45].

The standard formulation of a t-copula with two continuous random variables $X_1$, $X_2$ is defined as follows:

$$C(F_1(X_1), F_2(X_2)) = \int_{-\infty}^{t_\nu^{-1}(F_1(X_1))} \int_{-\infty}^{t_\nu^{-1}(F_2(X_2))} \frac{1}{2\pi (1 - \rho^2)^{1/2}} \left\{ 1 + \frac{x_1^2 - 2\rho x_1 x_2 + x_2^2}{\nu (1 - \rho^2)} \right\}^{-(\nu + 2)/2} dx_1\, dx_2 \tag{2}$$

where $C$ is the copula, and $F_1$ and $F_2$ are marginal distribution functions (they are obtained from the data and can be any distribution, e.g., binomial, exponential, Laplace, Poisson or normal), $\rho$ and $\nu$ are the parameters of the copula to be set during training, $x_1$, $x_2$ are samples from the distributions $F_1(X_1)$, $F_2(X_2)$, and $t_\nu^{-1}$ is the inverse of the standard univariate Student’s t distribution with $\nu$ degrees of freedom, expectation 0 and variance $\nu / (\nu - 2)$ [45], [46].

The continuous copula distribution can be converted into a discrete copula using the methods described in [46]. In our formulations, we used the MATLAB implementation of copulafit to fit the copula and sample from the copula [46].

Each feature with missing data $Y_i$ is then sampled from a distribution given by

$$Y_i \sim C(F_1(X_1), F_2(X_2), \ldots, F_N(X_N), F_{N+i}(R_i)) \tag{3}$$

where $X = [X_1, X_2, \ldots, X_N]$ is the data with N features and $R_i$ is the missingness pattern for feature $Y_i$. The parameters for the copula are maximum likelihood estimates fit using observed data and the “missingness” pattern at a p-value of 0.05, 0.10 and 0.15.

E. Evaluation of the Imputation Methods Using Random Forests

We use Random Forests (RF) to evaluate all the methods on non-temporal data in predicting ICU mortality. These new imputation methods were tested against conventional expectation maximization, mean filling and no filling, when combined with Random Forests to predict mortality in the ICU. The evaluation metrics include accuracy and Matthews correlation coefficient (MCC), computed with 3 × 3 nested cross-validation. MCC was chosen as an evaluation metric because it is relatively insensitive to an imbalance in the population.

III. RESULTS & DISCUSSION

A. Data Source: MIMIC-II Database

This study is a retrospective data analysis using data from the Multiparameter Intelligent Monitoring in Intensive Care, second version (MIMIC-II) [47] database. MIMIC-II is a public ICU data repository containing over 40,000 ICU stay records (32,331 adult and 8,080 neonatal records) [47]. The MIMIC II data for each patient is either static (does not change over the entire duration of the patient ICU stay, e.g., patient demographics) or temporal (changing in time, e.g., heart rate, blood pressure). We used the temporal data by converting it into values averaged over the duration of stay. Then outliers whose values were physiologically impossible were removed. If the value is normally distributed, then values that deviated by ±3 standard deviations from the mean value were also removed.

From a total of 13,000 features in the MIMIC-II database, we ranked the features by the number of available records. From the top 2000 features, we picked 87 features with the greatest clinical significance (based on clinician input). These included measures of physiological parameters (e.g., heart rate, blood pressure), lab results (e.g., WBC, RBC, cholesterol), administrative data (e.g., length of stay, ICD-9), comorbidities and other diagnostic procedures. After preprocessing and outlier removal, the total missing data in the dataset was about 30.05% with range from 29% to 99%.

We performed our analysis using adult data from the MIMIC II database, which consists of 32,331 adult records. In this dataset, there were 2,334 patient records with mortality during the ICU stay and 29,997 patient records of successful discharge from the ICU. The missing data was 30.6% in patients with successful discharge from the ICU (population 1) and 22.47% in patients with ICU mortality (population 2).

B. Test for “Neglectable” Assumption

We performed the t-tests and Little’s test. The results of the t-test (Fig. 2) demonstrate that most of the p-values reject the null hypothesis that states that the data is “Neglectable”, indicating the ICU data is not “Neglectable.” It is supported


by the results of Little’s test, which showed that the dataset is not “Neglectable” (batch size = 5 in Table III). Our results from other batches also indicate ICU data is not “Neglectable”. Similarly, the results were not affected by reordering features (more details in the supplementary material).

Fig. 2. t-test results: A green row means that the particular feature is “Neglectable”. Since no row is green, the data is not “Neglectable”.

TABLE III
LITTLE’S TEST RESULTS (BATCH-SIZE = 5) WHICH PROVES ICU DATA IS NOT “NEGLECTABLE”

C. Identifying “NER” Data

The classification for “NER” was performed using neural networks, SVM, decision trees and LASSO L1 regularized logistic regression. The results from all the methods are similar and show a Pearson’s correlation coefficient greater than 0.9 (Table VII in supplementary material). The classification analysis (Fig. 3) shows very high levels of “NER” (33.2% of the data and 99% of all missing data). The fact that we are not performing temporal analysis could account for the high “NER” data. In the temporal analysis, the percentages of “NER” may be significantly lower due to a higher correlation between adjacent time points. As mentioned above, the data in the ICU is collected only when a clinical team suspects a specific clinical condition and not all data is collected at all times. This results in data relating to a subset of conditions being collected in certain sub-populations. This data gathering approach supports our results, which show very high levels of missing data. These results indicate that most conventional approaches of imputing all the data using “Recoverable” assumptions or deleting may lead to bias. The high levels of “NER” make the prediction and data interpretation challenging. Therefore, we perform estimation of data under “NER” assumptions. The Pearson’s correlation between the percentages of “NER” data in the populations of patients discharged and deceased was found to be 0.98 and that of “Recoverable” data was 0.87. Results indicate that the distribution of missing data in the two populations was similar (please find more details in the supplementary material, Figs. 1, 2).

Fig. 3. “NER” data identification: The red bars give the percentage of all missing data; the green bars give the “NER” data percentage. The features which have no missing data do not have any bars.

D. Evaluation Using Random Forests

Evaluation of the imputation models was performed using Random Forests to predict ICU mortality. The models where “NER” data was imputed using copulas outperformed all the other models. The k-means based “Recoverable” models outperformed traditional EM models (Table IV(a, b)), mean filling and no filling techniques. The MCC and accuracy of the novel models were similar irrespective of the classification technique used to distinguish the “Recoverable” from “NER” data. The statistical significance of these models was tested using Steiger’s Z score [48] for correlated correlations from the MCC scores.

On comparing the prediction performance of our novel methods for statistical significance, we found that all the novel models which impute “NER” outperformed EM algorithm and mean


filling imputation techniques with a statistical significance of p ≤ .01. All the proposed novel data handling techniques were shown to perform better than no data imputation with a statistical significance of p ≤ .01. The repetitions of NER methods with the different p-value parameters (0.05, 0.10 and 0.15) all gave MCC values greater than 0.55 and accuracy greater than 0.95. These results indicate that division of missing data into “Neglectable”, “Recoverable” and “NER” and the novel imputation methods give a better performance as compared to current strategies of EM, mean filling, and no filling. We also compared the performance of our methods with that of multivariate normal distributions and found that results of our t-copula methods were comparable with those of multivariate normal (MVN) distributions (MCC = 0.55 ± 0.02 for k-means + MVN and MCC = 0.55 ± 0.01 for fcm + MVN). Interestingly, the features that were seen to be most indicative of mortality are very similar irrespective of the imputation methods. Top ranking features predicted using our model (Table V) such as SAPS scores, long length of ICU stay, SpO2, comorbidities and SOFA scores have been clinically shown to be correlated with mortality [47], [49]–[55]. The features such as SAPS-I [47], ABP [56], age, heart rate, systolic blood pressure, body temperature, Glasgow Coma Scale, mechanical ventilation, PaO2, FiO2, urine output, BUN (blood urea nitrogen), blood sodium, potassium, bicarbonates, bilirubin, white blood cells, chronic disease (AIDS, metastatic cancer, hematologic malignancy) and type of admission (elective surgery, medical, unscheduled surgery) [57] have been shown to be associated with mortality from other studies using the MIMIC-II dataset.

TABLE IVA
MCC VALUES FROM RF COMPARING IMPUTATION. NER DATA IMPUTATION GAVE STATISTICALLY SIGNIFICANT IMPROVEMENT AT P ≤ 0.01

TABLE IVB
ACCURACY VALUES FROM RF COMPARING IMPUTATION. NER DATA IMPUTATION GAVE STATISTICALLY SIGNIFICANT IMPROVEMENT AT P ≤ 0.01

TABLE V
TOP 5 FEATURES IN EACH OF THE MODELS. THE FEATURES IN BOLD HAVE ALSO BEEN REPORTED BY OTHER STUDIES USING MIMIC-II FOR MORTALITY. HERE SAPS IS SIMPLIFIED ACUTE PHYSIOLOGY SCORE AND SOFA IS SEQUENTIAL ORGAN FAILURE ASSESSMENT

IV. CONCLUSION & FUTURE WORK

Data quality poses significant challenges to decision support systems. Conventional missing data interpolation and imputation methods perform poorly because they do not model how the data goes missing. In our study, we divided the missing data into three categories: “Neglectable”, “Recoverable” and “NER”. We demonstrated that ICU data is not “Neglectable” and any deletion would result in bias. We then proposed novel imputation for “Recoverable” and “NER” missing data types. Overall our technique gave statistically significant (p ≤ .01) improvement in performance. Our work, however, suffers from some limitations: 1) it uses only non-temporal data for performing analysis, 2) we demonstrated our result only on one clinical end point (adult mortality) and 3) it uses only a couple of clustering techniques. In the future, we will expand our work to include temporal data and other additional endpoints such as ICU readmission and length of ICU stay. We also hope to compare the performance of our methods with other clinical datasets with different variations and in pediatric populations


from Children’s Healthcare of Atlanta after IRB approval. In [20] M. Sun et al., “Extent of lymphadenectomy does not improve the survival
the future, we will extend our study to improve the robustness of patients with renal cell carcinoma and nodal metastases: Biases asso-
ciated with the handling of missing data,” BJU Int., vol. 113, pp. 36–42,
of the clustering through the use of techniques such as genetic 2014.
algorithms [41] and the use of trimming procedures [58]. We [21] D. Mavridis, A. Chaimani, O. Efthimiou, S. Leucht, and G. Salanti, “Ad-
will also investigate hierarchical and density based clustering, in dressing missing outcome data in meta-analysis,” Evidence Based Mental
Health, vol. 17, pp. 85–89, 2014.
addition to particle swarm optimization [1] and missing interval [22] J. Labarère, R. Bertrand, and M. J. Fine, “How to derive and validate
size [2], which directly utilize patient data for clustering. clinical prediction models for use in intensive care medicine,” Intensive
Care Med., vol. 40, pp. 513–527, 2014.
[23] K. A. Hallgren and K. Witkiewitz, “Missing data in alcohol clinical trials:
REFERENCES A comparison of methods,” Alcoholism, Clin. Exp. Res., vol. 37, pp. 2152–
2160, 2013.
[1] L. Anthony Celi, R. G. Mark, D. J. Stone, and R. A. Montgomery, “‘Big [24] I. B. Aydilek and A. Arslan, “A hybrid method for imputation of missing
data’ in the intensive care unit. Closing the data loop,” Amer. J. Respiratory values using optimized fuzzy c-means with support vector regression and
Crit. Care Med., vol. 187, pp. 1157–1160, 2013. a genetic algorithm,” Inf. Sci., vol. 233, pp. 25–35, 2013.
[2] G. D. Clifford, W. J. Long, G. B. Moody, and P. Szolovits, “Robust parameter extraction for decision support using multimodal intensive care data,” Philosoph. Trans. Roy. Soc. A, Math., Phys. Eng. Sci., vol. 367, pp. 411–429, Jan. 28, 2009.
[3] T. Botsis, G. Hartvigsen, F. Chen, and C. Weng, “Secondary use of EHR: Data quality issues and informatics opportunities,” Proc. AMIA Summits Transl. Sci., vol. 2010, pp. 1–5, 2010.
[4] N. G. Weiskopf and C. Weng, “Methods and dimensions of electronic health record data quality assessment: Enabling reuse for clinical research,” J. Amer. Med. Informat. Assoc., vol. 20, pp. 144–151, 2013.
[5] S. McPherson, C. Barbosa-Leiker, M. McDonell, D. Howell, and J. Roll, “Longitudinal missing data strategies for substance use clinical trials using generalized estimating equations: An example with a buprenorphine trial,” Hum. Psychopharmacol., Clin. Exp., vol. 28, pp. 506–515, 2013.
[6] L. R. Zelnick et al., “Addressing the challenges of obtaining functional outcomes in traumatic brain injury research: Missing data patterns, timing of follow-up, and three prognostic models,” J. Neurotrauma, vol. 31, pp. 1029–1038, 2014.
[7] A. K. Jha et al., “Use of electronic health records in U.S. hospitals,” New Engl. J. Med., vol. 360, pp. 1628–1638, 2009.
[8] P.-Y. Wu, C.-W. Cheng, C. D. Kaddi, J. Venugopalan, R. Hoffman, and M. D. Wang, “—Omic and electronic health record big data analytics for precision medicine,” IEEE Trans. Biomed. Eng., vol. 64, no. 2, pp. 263–273, Feb. 2017.
[9] T. H. Stokes, R. A. Moffitt, J. H. Phan, and M. D. Wang, “Chip artifact CORRECTion (caCORRECT): A bioinformatics system for quality assurance of genomics and proteomics array data,” Ann. Biomed. Eng., vol. 35, pp. 1068–1080, 2007.
[10] T. H. Stokes, J. Torrance, H. Li, and M. D. Wang, “ArrayWiki: An enabling technology for sharing public microarray data repositories and meta-analyses,” BMC Bioinformat., vol. 9, p. S18, 2008.
[11] S. Kothari, J. H. Phan, A. O. Osunkoya, and M. D. Wang, “Biological interpretation of morphological patterns in histopathological whole-slide images,” in Proc. ACM Conf. Bioinformat., Comput. Biol. Biomed., 2012, pp. 218–225.
[12] S. Kothari, J. H. Phan, T. H. Stokes, and M. D. Wang, “Pathology imaging informatics for quantitative analysis of whole-slide images,” J. Amer. Med. Informat. Assoc., vol. 20, pp. 1099–1108, 2013.
[13] S. Hunziker, L. Celi, J. Lee, and M. Howell, “Red cell distribution width improves the simplified acute physiology score for risk prediction in unselected critically ill patients,” Crit. Care, vol. 16, p. R89, 2012.
[14] B. Mitra, M. Fitzgerald, and J. Chan, “The utility of a shock index ≥1 as an indication for pre-hospital oxygen carrier administration in major trauma,” Injury, vol. 45, pp. 61–65, 2014.
[15] I. Cho, I. Park, E. Kim, E. Lee, and D. W. Bates, “Using EHR data to predict hospital-acquired pressure ulcers: A prospective study of a Bayesian network model,” Int. J. Med. Informat., vol. 82, pp. 1059–1067, 2013.
[16] M. M. Pollack, K. M. Patel, and U. E. Ruttimann, “PRISM III: An updated pediatric risk of mortality score,” Crit. Care Med., vol. 24, pp. 743–752, May 1996.
[17] J. Tian, B. Yu, D. Yu, and S. Ma, “Missing data analyses: A hybrid multiple imputation algorithm using gray system theory and entropy based on clustering,” Appl. Intell., vol. 40, pp. 376–388, 2014.
[18] P. Jenkins and J. Welton, “Measuring direct nursing cost per patient in the acute care setting,” J. Nursing Admin., vol. 44, pp. 257–262, 2014.
[19] D. Macrae et al., “A randomized trial of hyperglycemic control in pediatric intensive care,” New Engl. J. Med., vol. 370, pp. 107–118, 2014.
[25] B. M. Marlin, D. C. Kale, R. G. Khemani, and R. C. Wetzel, “Unsupervised pattern discovery in electronic health care data using probabilistic clustering models,” presented at the 2nd ACM SIGHIT Int. Health Informat. Symp., New York, NY, USA, 2012.
[26] J. Zhang and S. Gong, “Action categorization with modified hidden conditional random field,” Pattern Recognit., vol. 43, pp. 197–203, 2010.
[27] H. P. Blumberg et al., “Preliminary evidence for persistent abnormalities in amygdala volumes in adolescents and young adults with bipolar disorder,” Bipolar Disorders, vol. 7, pp. 570–576, Dec. 2005.
[28] M. Bennewitz, W. Burgard, G. Cielniak, and S. Thrun, “Learning motion patterns of people for compliant robot motion,” Int. J. Robot. Res., vol. 24, pp. 31–48, Jan. 1, 2005.
[29] L. Wang, W. Hu, and T. Tan, “Recent developments in human motion analysis,” Pattern Recognit., vol. 36, pp. 585–601, 2003.
[30] A. Mackinnon, “The use and reporting of multiple imputation in medical research—A review,” J. Internal Med., vol. 268, pp. 586–593, 2010.
[31] J. M. Jerez et al., “Missing data imputation using statistical and machine learning methods in a real breast cancer problem,” Artif. Intell. Med., vol. 50, pp. 105–115, 2010.
[32] A. C. Grobler, G. Matthews, and G. Molenberghs, “The impact of missing data on clinical trials: A re-analysis of a placebo controlled trial of (St Johns wort) and sertraline in major depressive disorder,” Psychopharmacology, vol. 231, pp. 1987–1999, 2014.
[33] R. R. Andridge and R. J. A. Little, “A review of hot deck imputation for survey non-response,” Int. Statist. Rev., vol. 78, pp. 40–64, 2010.
[34] R. J. Little, “Modeling the drop-out mechanism in repeated-measures studies,” J. Amer. Statist. Assoc., vol. 90, pp. 1112–1121, 1995.
[35] R. J. Little, “A test of missing completely at random for multivariate data with missing values,” J. Amer. Statist. Assoc., vol. 83, pp. 1198–1202, 1988.
[36] S. Nakagawa and R. P. Freckleton, “Missing inaction: The dangers of ignoring missing data,” Trends Ecol. Evol., vol. 23, pp. 592–596, Nov. 2008.
[37] F. Cismondi, A. S. Fialho, S. M. Vieira, S. R. Reti, J. M. Sousa, and S. N. Finkelstein, “Missing data in medical databases: Impute, delete or classify?,” Artif. Intell. Med., vol. 58, pp. 63–72, May 2013.
[38] G. M. Kuhn, “The phi coefficient as an index of ear differences in dichotic listening,” Cortex, vol. 9, pp. 450–457, 1973.
[39] S.-S. Choi, S.-H. Cha, and C. C. Tappert, “A survey of binary similarity and distance measures,” J. Systemics, Cybern. Informat., vol. 8, pp. 43–48, 2010.
[40] H. A. Kiers and J. M. Ten Berge, “Alternating least squares algorithms for simultaneous components analysis with equal component weight matrices in two or more populations,” Psychometrika, vol. 54, pp. 467–473, 1989.
[41] R. Xu and D. Wunsch, “Survey of clustering algorithms,” IEEE Trans. Neural Netw., vol. 16, no. 3, pp. 645–678, May 2005.
[42] B. Michiels, G. Molenberghs, L. Bijnens, T. Vangeneugden, and H. Thijs, “Selection models and pattern-mixture models to analyse longitudinal quality of life data subject to drop-out,” Statist. Med., vol. 21, pp. 1023–1041, 2002.
[43] M. O’Kelly and B. Ratitch, “Analyses under missing-not-at-random assumptions,” in Clinical Trials With Missing Data. New York, NY, USA: Wiley, 2014, pp. 257–368.
[44] J. Gatz, “Master theses: Properties and applications of the student T copula,” M.S. thesis, 2007.
[45] C. Smart and C. Director, Beyond correlation: Don’t use the formula that killed wall street, Missile Defense Agency, Fort Belvoir, VA, USA.
[46] E. Bouyé, V. Durrleman, A. Nikeghbali, G. Riboulet, and T. Roncalli, “Copulas for finance—A reading guide and some applications,” Financial Econometrics Research Centre, City University Business School, London, U.K., 2000.


[47] M. Saeed et al., “Multiparameter intelligent monitoring in intensive care II (MIMIC-II): A public-access intensive care unit database,” Crit. Care Med., vol. 39, pp. 952–960, 2011.
[48] J. H. Steiger, “Tests for comparing elements of a correlation matrix,” Psycholog. Bull., vol. 87, pp. 245–251, 1980.
[49] V. K. Moitra, C. Guerra, W. T. Linde-Zwirble, and H. Wunsch, “Relationship between ICU length of stay and long-term mortality for elderly ICU survivors,” Crit. Care Med., vol. 44, pp. 655–662, 2016.
[50] W. A. Knaus, D. P. Wagner, J. E. Zimmerman, and E. A. Draper, “Variations in mortality and length of stay in intensive care units,” Ann. Internal Med., vol. 118, pp. 753–761, 1993.
[51] T. Williams, K. Ho, G. Dobb, J. Finn, M. Knuiman, and S. Webb, “Effect of length of stay in intensive care unit on hospital and long-term mortality of critically ill adult patients,” Brit. J. Anaesthesia, vol. 104, pp. 459–464, 2010.
[52] É. Azoulay et al., “Dexamethasone in patients with acute lung injury from acute monocytic leukemia,” Eur. Respiratory J., vol. 39, pp. 648–653, 2011.
[53] F. L. Ferreira, D. P. Bota, A. Bross, C. Mélot, and J.-L. Vincent, “Serial evaluation of the SOFA score to predict outcome in critically ill patients,” JAMA, vol. 286, pp. 1754–1758, 2001.
[54] L. Mayaud, P. S. Lai, G. D. Clifford, L. Tarassenko, L. A. G. Celi, and D. Annane, “Dynamic data during hypotensive episode improves mortality predictions among patients with sepsis and hypotension,” Crit. Care Med., vol. 41, pp. 954–962, 2013.
[55] M. L. Vold, U. Aasebø, T. Wilsgaard, and H. Melbye, “Low oxygen saturation and mortality in an adult cohort: The Tromsø study,” BMC Pulmonary Med., vol. 15, p. 9, 2015.
[56] Y. Chen and H. Yang, “Heterogeneous postsurgical data analytics for predictive modeling of mortality risks in intensive care units,” in Proc. 2014 36th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., 2014, pp. 4310–4314.
[57] R. Pirracchio, “Mortality prediction in the ICU based on MIMIC-II results from the super ICU learner algorithm (SICULA) project,” in Secondary Analysis of Electronic Health Records. New York, NY, USA: Springer, 2016, pp. 295–313.
[58] L. A. García-Escudero, A. Gordaliza, C. Matrán, and A. Mayo-Iscar, “A review of robust clustering methods,” Adv. Data Anal. Classification, vol. 4, pp. 89–109, 2010.
