See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/307938384

Assessing the predictability of hospital readmission using machine learning

Article · January 2013


Predictability Analysis of Hospital Readmission Using Machine Learning
Techniques

Arian Hosseinzadeh, Masoumeh Izadi, Aman Verma, Doina Precup, and David Buckeridge
McGill University
1140 Pine Ave West, Montreal, QC

Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Unplanned hospital readmissions raise health care costs and cause significant distress to patients. Predicting which patients are at risk is of great interest, both to reduce the readmission rate and to lower its impact. The fragmentary state of current predictive methods for hospital readmission prevents their generalizability. Most predictive models of readmission risk for clinical purposes are developed from clinical data at the institution level. Methods developed for comparison between healthcare settings use only limited administrative information and have low predictive power. In this paper, we consider a large amount of administrative information from claims data on patient demographics, dispensed drugs, medical or surgical procedures performed, and medical diagnoses, and treat the prediction of readmission as a supervised learning task. Our objective is to gain knowledge about the level of predictability that such information provides. Our preliminary results on data from provincial hospital systems in Quebec illustrate the potential of this approach to reveal important information on factors in administrative data that trigger hospital readmission. Our findings suggest that a substantial portion of readmissions are hard to predict by their nature. Consequently, the crude readmission rate might not be an appropriate indicator of the quality of care provided, as not all readmissions are equally likely.

Introduction

A hospital readmission is defined as an admission to a hospital or a healthcare setting within a certain time frame following an original hospital stay. A readmission can occur at either the same hospital or a different hospital and can involve planned or unplanned surgical or medical treatments (?). Different time frames have been used for the analysis of readmission in the literature (?); however, researchers often define readmissions as hospital admissions within 30 days following the initial discharge. Readmissions contribute a significant proportion of total inpatient spending in many countries. In the United States they account for $17.4 billion per year (?), and in Canada readmissions to acute care cost an estimated $1.8 billion per year (?). While in many cases readmission is an unavoidable cost, in other cases readmission is due to some system failure, and so the associated costs could potentially be recovered.

The diverse causes of hospital readmissions include patient-level factors such as age and multiple chronic conditions; hospital and health system-level factors such as the timeliness of post-discharge follow-up, nurses' workload, and coordination of care after discharge; and medical errors and adverse events that occurred during the initial hospitalization. To reduce hospital readmissions caused by poor care, new policies penalize hospitals for excessive rates of readmission as a sign of low-quality care (?; ?). However, the effectiveness of such policies is not known. Besides, in practice, the evidence describing the hospital's role in readmission remains fragmented and mainly qualitative. Unfortunately, whatever the cause, the risk of future readmission is considerable, and the need to improve our understanding of the determinants of readmission is clear. Knowledge about readmission risk factors could also be used to help target more focused delivery of care and resources to the patients at greatest risk and to provide better-quality healthcare.

Over the years, researchers have suggested many factors as potentially important in the prediction of hospital readmission, but the utility of such factors has not been widely studied because of data accessibility and methodological issues. Typically, risk factors for readmission are identified using traditional hypothesis-driven statistical methods such as logistic regression. However, the limitations of these approaches become apparent as the scope of these studies expands to include a huge range of variables.

In this research, we aim to improve the understanding of the determinants of readmission using a large number of variables, including drug classification codes, diagnostic codes, and medical and surgical procedure codes. We deploy machine learning algorithms that can coherently incorporate this information in the parameters of the model and potentially overcome the analytical challenges that hypothesis-driven methods face in this situation. The inclusion of prescription drugs in this research is based on the hypothesis that the type, the variety, and the dose of drugs can indicate a patient's health status or severity of illness. There is consensus that the use of appropriate drugs can reduce the chance of readmission. For instance, one study showed that patients with asthma who receive regular use of inhaled
corticosteroids would be associated with reductions of 31% in hospital admissions and 39% in readmissions (?). However, it is not clear which medications can be effective in the prediction of readmission.

The dimensionality of the data, at over twenty thousand features, still poses challenges for classical machine learning methods. Therefore, another step in our model building is investigating a dimensionality reduction technique to select among the many thousands of features. Taking into account the entire spectrum of drugs at a low level of the drug classification hierarchy in a predictive model is not feasible because of the dimensions of these attributes. While grouping drugs at an upper level of the hierarchy, based on pharmaceutical classification code systems, seems to lift the computational limitations of considering drugs as predictors of readmission, there is no evidence on how efficiently this grouping works in practice. In this paper, for the first time, we investigate the role of classification code systems for drugs, diagnoses, and procedures in making it possible to encode these attributes in a prediction model. It is critically important to understand how the reduction affects performance. The efficiency and effectiveness of a number of dimensionality reduction methods are demonstrated through extensive comparisons over the study data. The comprehensiveness of the variables selected for this research is novel. In particular, methods that investigate the role of medications have not been explored yet. The use of prescription medications can also be helpful in revealing possible medical errors related to the appropriateness of drug utilization and to medication reconciliation at the care-providing institution, although this topic is not the focus of this paper.

Current State of Hospital Readmission Models

In this section we review the existing hospital readmission prediction models with respect to the type and dimension of the factors they consider and the methodology used. A review of published studies on readmission analysis shows that most studies in the literature focus primarily on the role of demographic information such as age and gender, medical comorbidity, prior health services utilization, and prior hospitalizations in readmission risk (?; ?; ?). Data from these studies show that readmission rates are associated with age, patient comorbidities, and other factors such as length of stay in the hospital. The likelihood of a readmission increases with the patient's history of readmissions. The highest readmission rates have been observed in geriatric patients, mainly those with heart failure and chronic obstructive pulmonary disease (?). However, the specific reasons such persons are readmitted still need further exploration. Some studies used an index named LACE to score the risk of readmission (?; ?) in clinical settings. LACE is defined by length of stay (L); acuity of the admission (A); comorbidity of the patient, measured with the Charlson comorbidity index score (C); and emergency department use, measured as the number of visits in the six months before admission (E). The value of this index ranges from zero for no risk up to 19 for the highest risk of future readmission. However, this index is considered to be a poor tool for predicting 30-day readmission (?). Some researchers believe that models which consider factors such as medical comorbidities and basic demographic data are much better able to predict mortality and unavoidable readmission than readmission in general (?; ?; ?). There are not many validation studies that show which readmission records are unavoidable. We could find only one validated prediction model that explicitly examined potentially preventable readmissions as an outcome, and it found that only about one-quarter of readmissions were clearly preventable (?).

It has been argued that incorporating different observations, such as attributes related to socioeconomic status and the patient's overall health, can provide novel insights into the causes of readmission (?). Similarly, researchers suggest that analysis of readmission based on prescription drugs can potentially provide more understanding of disease severity and of the impact of drugs on predicting readmission (?). Few studies have considered variables associated with the severity of illness, the patient's health and function, and social determinants of health. In addition, relatively little work has been done in the literature on the formal development of models that describe the likely patterns of drug usage which increase the risk of readmission. Recent developments in analyzing big data by Microsoft researchers have allowed a decision support system, Amalga, to analyze 25000 clinical and administrative variables and predict a readmission risk score for hospital patients in an applied machine learning setting in the Greater Washington, DC, Metropolitan area (?). There is still no evidence in the literature on the validation of this system and its performance for clinical purposes. Moreover, clinical data such as vital signs, lab results, or information related to the physicians in charge are not easy to obtain for general health services analysis.

Most models created to date for hospital comparison or clinical purposes have poor predictive ability, which prevents their generalization (?; ?; ?). The areas under the ROC curve reported in the studies reviewed in the literature range from 0.61 () to 0.68 (). Developing richer models would require insights about the causality of readmission and the mechanisms of provided care.

Supervised Learning Approach

We approached readmission prediction as a supervised learning task. Building on the analysis of readmission in the literature, we use patient demographics, history of readmissions, procedure codes related to medical or surgical procedures performed during the hospital stay, and diagnostic codes that indicate the leading disease and comorbidities of the patient, in addition to prescription drugs with drug classification codes at the aggregate level, to discover the risk of readmission for each hospital admission record. The hospital admission record is the unit of analysis. We treat each hospital admission record as a set of parameter values, which must be classified as either a positive or a negative example with respect to future readmission.

As we used administrative data from the hospital systems, there is a gold standard for each patient above the cohort's age threshold that identifies whether or not each patient was readmitted
following a discharge. Therefore, we can label the data perfectly and frame the problem as a classical machine learning problem of learning from examples. We examined one classification method from the category of generative classifiers, a Naive Bayes classifier, and one from the category of discriminative classifiers, a decision tree classifier. The motivation for this choice of algorithms is that both Bayesian methods and decision trees allow us to visualize which features are important in the prediction of readmission, ensuring that domain experts can interpret the results in the context of their existing knowledge.

Preprocessing

Feature selection, an active field of research in high-dimensional data analysis, offers a variety of methods for reducing the dimensionality of the data. No consensus exists on how to pick, from the broad range of feature reduction methods, the ones that fit our problem best. However, the interpretability of the results is important in our application. Therefore, we cannot use dimensionality reduction techniques such as principal component analysis or its variations for this problem. Instead, in this paper we used statistical methods from the information-based and frequency-based approaches in the literature to reduce the number of features. Both methods take as input a matrix of admission records with two feature categories, patient demographics and drug-diagnosis-procedure codes, and return a small set of features from the drug-diagnosis-procedure category attached to the corresponding demographic features.

From the class of information-based feature reduction methods, we used Gini indexing, a standard measure of statistical dispersion with a value between zero and one. The Gini index is commonly used in economics as a measure of income inequality: the higher the value above 0, the higher the inequality. In the context of our application, a value of 0 for a feature means that all the members in the dataset belong to the same class, and therefore we can get the maximum useful information from this feature, whereas a value of one means that the samples in the data are distributed equally over the classes and we cannot gain much information from this feature. The second statistical feature selection method is frequency-based feature selection, that is, selecting the features that are most common in the class. This method reduces the dimensionality of the hospital admission dataset by ranking frequency counts for each drug-diagnosis-procedure feature.

We are also interested in investigating the effect on classification performance of the natural grouping of features based on medical taxonomies of drugs, procedures, and diagnoses. We refer to this approach as the domain-knowledge-based method. There exist a variety of conceptual-level medical classifications for medications, for ICD-9 diagnostic codes, and for medical-surgical procedure codes that provide a systematic way to categorize the codes falling into these attributes. A conceptual medical classification system represents an internationally standard way to describe and compare medical utilization data. In our analysis for this paper, we used the American Hospital Formulary Service (AHFS) classification system for pharmaceutical codes. The AHFS Pharmacologic-Therapeutic Classification was developed and is maintained by the American Society of Health-System Pharmacists (ASHP). For the procedural codes we used the chapters of the Canadian Classification of Diagnostic, Therapeutic, and Surgical Procedures. We used ICD-9 diagnostic codes and grouped them into the comorbidities described by Elixhauser.

Identification of hard-to-predict cases

As in any classification problem, there are easy-to-predict and hard-to-predict cases in readmission prediction. One of our objectives in this research is to identify the readmission records that are particularly hard to detect at the discharge point of the initial admission. We hypothesize that the readmission cases in the data that are always missed when we use different classifiers and different feature spaces account for these hard-to-predict cases. These difficult cases could imply either that there are some confounding factors that we have not measured in any of our models or that they are outliers. Whether or not the hard-to-predict cases account for unavoidable readmissions is not in the scope of this research. However, further analysis of these cases and possible characterization of their attributes is an area for our future exploration. Naturally, removing the potential outliers should improve readmission prediction performance on the cleaned dataset. In our experimental results we observe how much the performance of each classifier improves by doing so.

Hospital Discharge Data Set

We used a large cohort extracted from the Quebec administrative database of hospitalization information, obtained from the Régie de l'assurance maladie du Québec (RAMQ). We enrolled patients into this cohort after their first diagnosis of a respiratory illness between January 1, 1996 and December 31, 2006 while living in the census metropolitan area of Montreal, as defined by the 2006 Canadian census. These data provide complete information on age, sex, census tract of residence, hospital procedures (resulting in 2733 procedure-related features), hospital diagnostic codes (resulting in 3738 diagnosis-related features), outpatient physician visits, outpatient diagnostic codes, and emergency department visits. Since Quebec provides universal provincial drug coverage for people 65 years or older, these data include all drug prescriptions for these people. This results in 5920 features related to drugs. Planned hospital admissions have a special code to distinguish them from unplanned admissions. All patient, physician, hospital, and clinic names were anonymized. For patients, the birth month was available, but not the birthday.

Our dataset includes all Quebec healthcare utilization data for all included patients after enrolment, even if they moved to another Quebec location. Therefore, hospitalizations outside of Montreal were included. Since these data do not include the hospital name or location, we could not distinguish between Montreal hospitals and other Quebec hospitals. We examined discharges from the twenty hospitals with
the most discharges (of patients 65 years or older), which are almost certainly Montreal hospitals.

In this research, we included all discharges from the twenty hospitals described above for all enrolled patients who were 65 years of age or older at the time of discharge. In these data, 619,274 hospital discharges met these criteria. To give a sense of the distribution of prescription medications among patients with positive and negative readmission records, Table 1 presents descriptive statistics on our dataset that compare prescription medication at the time of admission between those who were eventually readmitted and those who were not.

The dataset in our study poses analytical challenges for machine learning methods with respect to the dimensionality of the data records. As we explained in the previous section, there are more than twenty thousand features related to the diagnostic codes, drug utilization codes, and procedure codes, plus the demographics of the patients. This makes the dataset very large and sparse. While we do not anticipate that all of these attributes are distinct predictors of readmission, there is not sufficient evidence to determine which factors might be important and which might not. Therefore, we applied a pre-processing step using feature reduction techniques in order to make the task computationally easier for the classifiers.

Empirical Results

To measure prediction performance, we ran a large number of experiments using the RapidMiner software (?). Each experiment applied a different learning algorithm and a different feature selection technique to the dataset described earlier in this paper. Because of the class imbalance in our data, we downsampled the negative class (not readmitted) by randomly sampling 31.48% of it. For the decision tree classifier, we optimized the tree over different combinations of maximum depth and minimum split size. In each cross-validation fold, the data were divided into 90% of the samples for training and 10% for testing. The downsampling was applied to the training set (90% of samples), and the test set was left unchanged in order to keep the same distribution of class labels as the original data. Table 2 summarizes the results of our prediction performance assessment for the different feature selection methods and classifiers. The first two columns of this table show the mean area under the ROC curve (AUC) reached by the different combinations in 10-fold cross-validation for the Naive Bayes and decision tree classifiers. The effect of domain-knowledge-based feature reduction on classification performance is presented in the last row of this table. The conceptual grouping of medical classifications for drug codes, diagnostic codes, and procedure codes results in a total of 63 features. For a fair comparison between all feature selection methods, in this set of experiments we selected the same number of features from the top of the Gini index and frequency rankings. We generally observe only limited differences between the feature selection methods of Gini indexing, frequency ranking, and domain-knowledge-based grouping when the same number of features is selected. We also observed that neither of the two classification algorithms is a clear winner, as both classifiers perform similarly.

Although the performance of the Naive Bayes classifiers and the decision tree classifiers in our experiments did not differ much, we still used both classifiers in experimental settings with a range of dimensionalities for both Gini indexing and frequency ranking, as well as the domain-knowledge-based approach to feature selection, to gain some insight into the hard-to-predict readmission cases (outliers). This amounts to 21 different experimental settings. We identified as outliers the readmitted records that are missed by all 21 of these classification experiments. These account for only 6% of the entire dataset. We examined the performance of Naive Bayes classifiers after removing the hard-to-predict cases from the dataset. These results are presented in the third column of Table 2.

Some descriptive statistics on the original dataset and the hard-to-predict cases (outliers) are presented in Table 3. These features were chosen based on their popularity in the literature as predictors of hospital readmission. The results show a somewhat significant difference between the general readmitted group and the outliers in several variables, including emergency admission, length of stay, and previous readmissions. For some variables, including length of stay and emergency admission, the outlier group is similar to the not-readmitted group. In the previous experiments we studied a limited number of features selected by Gini indexing and frequency ranking. In another set of experiments, we examined what happens when we increase the number of features and how much that increases the performance of readmission prediction. Figure 1 shows the result of this experiment when we apply the classification to the original dataset. We performed the same experiment with the dataset after removing the outliers. In our model evaluation with the original dataset, we found that frequency-based feature selection performs better than the Gini index (Figure 1) in predicting whether or not readmission will occur within 30 days after discharge. However, the Gini index seems to outperform frequency-based feature selection on the dataset after removing the outliers (Figure 2). Overall, we observe less significant differences in performance across feature spaces and classification algorithms than across datasets. This means that modeling efforts for the prediction of readmission should focus more on characterizing types of patients and on acquiring information or knowledge about the mechanisms that cause different types of readmissions.

Discussion

Conclusion and future work

It is widely argued that the cost of readmission is a huge burden on health systems as well as on patients. There is currently a critical need for methods that can increase our understanding of what is important in predicting readmission risk. In this paper, we used machine learning methods for supervised classification to predict hospital readmission within thirty
Table 1: Distribution of prescription drugs among patients with versus without a readmission record within 30 days after discharge.

AHFS Class                                              Not readmitted     Readmitted
Anti-infective Agents                                   59668 (2.2%)       12295 (2.3%)
Antineoplastic Agents                                   13824 (0.5%)       2569 (0.5%)
Autonomic Drugs                                         156605 (5.7%)      38808 (7.1%)
Blood Formation, Coagulation, and Thrombosis Agents     103769 (3.8%)      23818 (4.4%)
Cardiovascular Drugs                                    790439 (28.7%)     145851 (26.8%)
Central Nervous System Agents                           579941 (21%)       109657 (20.2%)
Electrolytic, Caloric, and Water Balance                293973 (10.7%)     60994 (11.2%)
Eye, Ear, Nose, and Throat (EENT) Preparations          58968 (2.1%)       10093 (1.9%)
Gastrointestinal Drugs                                  158813 (5.8%)      33174 (6.1%)
Hormones and Synthetic Substitutes                      281209 (10.2%)     53604 (9.9%)
Miscellaneous Therapeutic Agents                        74350 (2.7%)       13645 (2.5%)
Respiratory Tract Agents                                8002 (0.3%)        2114 (0.4%)
Skin and Mucous Membrane Agents                         24893 (0.9%)       4721 (0.9%)
Smooth Muscle Relaxants                                 21421 (0.8%)       4826 (0.9%)
Vitamins                                                70253 (2.5%)       12836 (2.4%)
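
Table 1 aggregates prescriptions by top-level AHFS class, which is the same kind of roll-up that the domain-knowledge-based feature reduction relies on. A minimal sketch of such a roll-up follows. It rests on assumptions not stated in the paper: drug codes are taken to be strings whose leading number (before the colon) identifies the top-level class, and the two-entry name table, the function names, and the sample codes are purely illustrative.

```python
from collections import Counter

# Illustrative subset of top-level classes (assumed numbering, not taken from the paper).
AHFS_TOP_LEVEL = {
    "24": "Cardiovascular Drugs",
    "28": "Central Nervous System Agents",
}


def top_level_class(ahfs_code: str) -> str:
    """Map a detailed code such as '24:08.16' to a top-level class name."""
    tier1 = ahfs_code.split(":", 1)[0].strip()
    return AHFS_TOP_LEVEL.get(tier1, "Other")


def class_distribution(codes):
    """Return {class name: (count, percent of all prescriptions)}."""
    counts = Counter(top_level_class(code) for code in codes)
    total = sum(counts.values())
    return {name: (n, round(100.0 * n / total, 1)) for name, n in counts.items()}


# Hypothetical prescription codes for one group of discharges.
sample = ["24:08.16", "24:32.04", "28:08.04", "40:12", "24:04"]
dist = class_distribution(sample)
```

Collapsing thousands of detailed codes into a few dozen classes in this way trades resolution for a feature space that a classifier and a domain expert can both handle.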

Table 2: Performance of readmission detection algorithms in terms of area under the ROC curve (AUC).

Feature Reduction          Naive Bayes    Decision Tree    Outliers Cleared
Gini Indexing              0.65           0.64             0.84
Frequency Ranking          0.67           0.67             0.83
Domain-Knowledge-Based     0.65           0.63             0.82
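
The "Outliers Cleared" column depends on first identifying the readmitted records that every experimental setting fails to flag. A minimal sketch of that step is given here, assuming each setting's output is available as a list of 0/1 predictions aligned with the true labels. The function name and the toy data are hypothetical, and three settings stand in for the 21 used in the experiments.

```python
def always_missed(y_true, predictions_per_setting):
    """Indices of readmitted records (label 1) predicted 0 by every setting."""
    missed = []
    for i, label in enumerate(y_true):
        if label == 1 and all(preds[i] == 0 for preds in predictions_per_setting):
            missed.append(i)
    return missed


# Toy labels and predictions from three hypothetical settings.
y_true = [1, 0, 1, 1, 0, 1]
settings = [
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 0],
]

outliers = always_missed(y_true, settings)
outlier_set = set(outliers)
# Records that remain after the hard-to-predict cases are removed.
kept = [i for i in range(len(y_true)) if i not in outlier_set]
```

Because the removed records are defined by the classifiers' own misses, the AUC on the remaining records is expected to rise, which is consistent with the third column.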

Figure 1: Performance comparison between two statistical feature selection methods in terms of area under the ROC curve, AUC, of prediction of readmission applied on hospital admission dataset.

Figure 2: Performance comparison between two statistical feature selection methods in terms of area under the ROC curve, AUC, of prediction of readmission applied on hospital admission dataset after removing outliers.
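
Both figures compare rankings of the code features from which the top k are kept. The sketch below shows one way the two rankings could be computed, assuming each admission record is a set of codes with a 0/1 readmission label. The paper does not give its exact formulas, so both scores are assumptions: the Gini score is the standard two-class Gini impurity of the split induced by a feature (lower is better, with a maximum of 0.5 rather than the zero-to-one scale described in the text), and the frequency score counts occurrences among readmitted records. All names and data are hypothetical.

```python
def gini_impurity(labels):
    """Two-class Gini impurity 2p(1-p); 0.0 for an empty or pure group."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_gini(records, labels, feature):
    """Size-weighted impurity after splitting on presence of the feature."""
    present = [y for r, y in zip(records, labels) if feature in r]
    absent = [y for r, y in zip(records, labels) if feature not in r]
    n = len(labels)
    return (len(present) * gini_impurity(present)
            + len(absent) * gini_impurity(absent)) / n


def positive_frequency(records, labels, feature):
    """Number of readmitted records that carry the feature."""
    return sum(1 for r, y in zip(records, labels) if y == 1 and feature in r)


def top_k(records, labels, k, method):
    """Return the k best features under 'gini' or 'frequency' ranking."""
    features = sorted(set().union(*records))  # sorted for deterministic ties
    if method == "gini":
        ranked = sorted(features, key=lambda f: split_gini(records, labels, f))
    else:
        ranked = sorted(features, key=lambda f: -positive_frequency(records, labels, f))
    return ranked[:k]


# Toy admission records: each is the set of codes present on the record.
records = [{"dx_a", "rx_b"}, {"dx_a"}, {"rx_b", "px_c"}, {"px_c"}]
labels = [1, 1, 0, 0]
```

On this toy data the two rankings agree on the first feature but differ afterwards, which illustrates why the choice of ranking can matter as k grows.
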
Table 3: Descriptive statistics of the hospital admission dataset in Montreal, QC.

Feature                  Not readmitted (n=534,869)    Readmitted (n=84,405)    Outliers
Gender (male)            47%                           49%                      48%
Age                      76.7 years                    77.0 years               76.7 years
Emergency admission      81%                           91%                      83%
Length of stay           13.9 days                     14.3 days                13.6 days
Previous readmissions    0.40                          0.94                     1.24
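
The class sizes in the header of Table 3 motivate the downsampling step used in the experiments: 31.48% of the 534,869 not-readmitted records is roughly 168,000, about twice the 84,405 readmitted records. A minimal sketch of downsampling the negative class in a training fold follows, assuming records are (features, label) pairs. The function name, the fixed seed, and the toy data are hypothetical, and the test fold would be left untouched as described in the text.

```python
import random


def downsample_negatives(train, keep_fraction=0.3148, seed=0):
    """Keep all positives and a random fraction of negatives from a training fold."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    positives = [rec for rec in train if rec[1] == 1]
    negatives = [rec for rec in train if rec[1] == 0]
    n_keep = round(len(negatives) * keep_fraction)
    sampled = rng.sample(negatives, n_keep)
    result = positives + sampled
    rng.shuffle(result)
    return result


# Toy training fold: 100 positives and 600 negatives.
train = [({"id": i}, 1 if i % 7 == 0 else 0) for i in range(700)]
balanced = downsample_negatives(train)
```

Sampling only within the training fold keeps the evaluation on the original class distribution, so the reported AUC reflects the imbalance a deployed model would face.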

days of hospital discharge. Our results suggest that prescription medications, diagnostic information, and information on procedures during hospital admission can predict the chance of hospital readmission. Our findings suggest that medical taxonomies that provide conceptual grouping for pharmacological, diagnostic, and procedural codes can be used for dimensionality reduction in order to overcome the computational issues of encoding these factors in predictive models for readmission. Our results show no significant performance loss with this approach compared to statistical methods for feature selection when the number of features is the same. Our early results on identifying readmission outliers confirm that not all readmission records should be treated equally.

We observed some differences in a small number of descriptive statistics regarding the prediction outliers. We intend to extend our analysis of this group of patients. In the future, we plan to extend this research by encoding additional feature reduction algorithms and running additional experimental studies. Consensus has not been reached as to what time frame should be used in defining a readmission; therefore, we will investigate shorter and longer readmission time frames as a criterion in future experiments. Future studies should assess the relative contributions of different types of patients to readmission prediction and the causality of readmission.
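
Because the readmission time frame is left open for future experiments, it helps to treat the window as a parameter when labelling records. A minimal sketch follows, assuming each patient's stays are available as (admission date, discharge date) pairs and that any later admission within the window counts as a readmission. The distinction between planned and unplanned admissions is ignored here, and the function name and dates are hypothetical.

```python
from datetime import date, timedelta


def label_readmissions(stays, window_days=30):
    """Return one 0/1 label per stay (sorted by admission date) for a single patient.

    A stay is labelled 1 if a later stay begins within `window_days`
    of its discharge date.
    """
    ordered = sorted(stays)
    window = timedelta(days=window_days)
    labels = []
    for i, (_, discharge) in enumerate(ordered):
        readmitted = any(
            discharge <= admit <= discharge + window
            for admit, _ in ordered[i + 1:]
        )
        labels.append(1 if readmitted else 0)
    return labels


# Hypothetical stays for one patient.
stays = [
    (date(2005, 1, 3), date(2005, 1, 10)),
    (date(2005, 2, 1), date(2005, 2, 5)),
    (date(2005, 6, 1), date(2005, 6, 4)),
]
labels_30 = label_readmissions(stays, window_days=30)
labels_14 = label_readmissions(stays, window_days=14)
```

Changing `window_days` changes which records count as positive examples, so each candidate time frame defines a different supervised learning problem on the same data.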
